X4170 ILOM reset and firmware update

From FnordWiki

This is a quick HOWTO for regaining access to and updating the firmware on the ILOM (management processor) of a Sun Microsystems SunFire X4170.

What's an ILOM?

The ILOM is Sun's server management processor. It can control the power of the server it sits inside and do other useful things, such as reporting hardware problems. It also provides a virtual KVM (keyboard, video, mouse) console and virtual media to support OS installation, all over an Ethernet IP network. Power control and the like are also available over a serial connection.

Re-setting the ILOM config

Prep work

  1. Disconnect all power from the server. (There are some big capacitors in the power supplies, so it will take a few seconds for them to drain completely.)
  2. Connect a serial cable to the 8P8C modular connector ("RJ-45" if you like calling it that) labeled SER MGT, outlined in blue, at the back of the chassis. Cisco console cables work for this.
  3. Set up a terminal on the other end of the SER MGT connection. (I like Kermit, but use whatever you are comfortable with.) The serial connection runs at 9600 bits/sec, 8 data bits, no parity, 1 stop bit (9600 8N1), with no hardware or software flow control.
  4. Hold the Locate/Identify button and connect the mains power. Count 3 seconds and release the Locate button. (The button is clear, and the icon below it looks a bit like a bullseye with a triangle at the top. It is normally used to make sure one is working on the correct chassis when moving from its front to its back.)
  5. If things go properly, the ILOM's boot loader should print text similar to the following before it loads the ILOM OS:
Primary Bootstrap.
  Hold Locate button for 2 seconds to display Pre-boot Menu... (yes).
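
For anyone scripting the terminal side instead of using Kermit, the line settings from step 3 (9600 8N1, no hardware or software flow control) can be sketched with Python's standard termios module. This is a minimal sketch for Unix-like systems only; the function names and the device path /dev/ttyUSB0 are assumptions for illustration, not anything the ILOM requires.

```python
import os
import termios


def settings_9600_8n1(attrs):
    """Return a copy of a termios attribute list set to 9600 8N1, no flow control."""
    iflag, oflag, cflag, lflag, _ispeed, _ospeed, cc = attrs
    # 8 data bits, no parity, 1 stop bit, no RTS/CTS hardware flow control.
    cflag &= ~(termios.CSIZE | termios.PARENB | termios.CSTOPB | termios.CRTSCTS)
    cflag |= termios.CS8 | termios.CLOCAL | termios.CREAD
    # No XON/XOFF software flow control.
    iflag &= ~(termios.IXON | termios.IXOFF | termios.IXANY)
    return [iflag, oflag, cflag, lflag, termios.B9600, termios.B9600, list(cc)]


def open_ser_mgt(path="/dev/ttyUSB0"):
    """Open a serial device (hypothetical path) and apply the settings above."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    termios.tcsetattr(fd, termios.TCSANOW, settings_9600_8n1(termios.tcgetattr(fd)))
    return fd
```

The sketch only sets the line parameters. An interactive session still needs something to shuttle bytes between the keyboard and the port, which is the part a terminal program such as Kermit already handles.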

At the ILOM Pre-boot Menu

Here's where we can actually change things. You'll notice a Preboot> prompt. Here follows a transcript of wiping the ILOM config, restoring the ILOM defaults, and changing some of the annoying ILOM boot loader settings:

Preboot> unconfig all
  This command erases the writeable ILOM filesystems in flash, and it reverts
  most pre-boot settings to defaults.  (See "help unconfig".)

 Enter 'y[es]' to continue: [no] y
  Erasing flash filesystems.

......... done
Erased 9 sectors
    Erased MTD filesystem 'www' at 0x1800000, 1112 KiB.

................................................................ done
Erased 64 sectors
    Erased MTD filesystem 'coredump' at 0x1000000, 8192 KiB.

................ done
Erased 16 sectors
    Erased MTD filesystem 'persist' at 0xe00000, 2048 KiB.

........ done
Erased 8 sectors
    Erased MTD filesystem 'params' at 0xd00000, 1024 KiB.
  Erasing unused flash sectors.
    Erasing unused flash region 0xbc0000..bfffff.

.. done  
Erased 2 sectors
    Erasing unused flash region 0xc00000..cfffff.

........ done
Erased 8 sectors
    Erasing unused flash region 0x1920000..1bfffff.

....................... done
Erased 23 sectors
    Erasing unused flash region 0x1c00000..1ffffff.

................................ done
Erased 32 sectors

  Reverting pre-boot settings to defaults.
Un-Protected 1 sectors 

. done   
Erased 1 sectors
*** Warning - bad CRC, using default environment 

readonly: ethaddr=00:21:28:6A:DF:6E
readonly: eth1addr=00:21:28:6A:DF:6F
  Setting env_reset_build = 'r48729 (Sep 28 2009) #000000'.
Saving environment to flash.
Protect off 10020000 ... 1003FFFF
Un-Protected 1 sectors
Erasing Flash...
. done   
Erased 1 sectors
Writing to Flash... done
Protected 1 sectors
Done resetting configuration.
  Optionally use "edit" or "net config" to change settings.
  Use "vers" to check that images are intact, then "reset" to reboot.
Preboot> edit

Press Enter by itself to reach the next question.
  Press control-C to discard changes and quit.

 Values for baudrate are {[ 9600 ]| 19200 | 38400 | 57600 | 115200 }.
  Set baudrate?                [9600] 
 Values for serial_is_host are {[ 0 ]| 1 }.
  Set serial_is_host?          [0] 
 Values for bootdelay are { -1 | 3 | 10 | 30 }.
  Set bootdelay?               [3] 30
  Set bootdelay?               [30] 
 Values for bootretry are { -1 | 30 | 300 | 3000 }.
  Set bootretry?               [<not set>] 
 Values for preferred are {[ 0 ]| 1 }.
  Set preferred?               [<not set>] 
 Values for preserve_conf are {[ yes ]| no }.
  Set preserve_conf?           [yes] no
  Set preserve_conf?           [no] 
 Values for check_physical_presence are {[ yes ]| no }.
  Set check_physical_presence? [<not set>] no
  Set check_physical_presence? [no] 
 Enter 'y[es]' to commit changes: [no] yes
Summary: Changed 3 settings.
Preboot> vers
  Main ILOM image at 0x100a0000:
    Service Processor Firmware
    Version = 3.0.6.10, for SP type 29
    Date ='Mon Sep 28 21:30:54 EDT 2009', Build ='r48729'
      Name ='sysbios'

      Bad www CRC=4224313c for data =[ *11800000, len=116000, sum=2db0fa1c ] in pkg *10100000
    uboot @0a0000 OK, kernel @102000 OK, root @20cb68 OK,
      sysbios @a8fba8 OK, pwrseq @b8fba8 OK, pbsw @b9da93 OK,
      www @1800000 Bad
    coredump @1000000 8192 KiB, persist @e00000 2048 KiB, params @d00000 1024 KiB
Preboot> reset

Next time the ILOM boot loader runs, you will be able to get the Preboot menu by typing xyzzy when the Booting linux in 30 seconds... message is printed. This is what the check_physical_presence setting change accomplishes.
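
When capturing the serial session to a log, the per-component status that vers prints in the transcript above can be checked mechanically. The following is a minimal sketch that assumes the "name @offset OK" / "name @offset Bad" layout seen in this one transcript; other firmware versions may print something different, and the function names are made up for illustration.

```python
import re

# Matches entries such as "uboot @0a0000 OK" or "www @1800000 Bad".
# The layout is inferred from the single transcript on this page.
STATUS_RE = re.compile(r"(\w+) @([0-9a-fA-F]+) (OK|Bad)\b")

SAMPLE = """\
    uboot @0a0000 OK, kernel @102000 OK, root @20cb68 OK,
      sysbios @a8fba8 OK, pwrseq @b8fba8 OK, pbsw @b9da93 OK,
      www @1800000 Bad
    coredump @1000000 8192 KiB, persist @e00000 2048 KiB, params @d00000 1024 KiB
"""


def component_status(vers_output):
    """Map each image component name to its reported status string."""
    return {name: status for name, _offset, status in STATUS_RE.findall(vers_output)}


def bad_components(vers_output):
    """Return the sorted names of components not reported as OK."""
    return sorted(n for n, s in component_status(vers_output).items() if s != "OK")


print(bad_components(SAMPLE))  # ['www']
```

Lines such as the coredump/persist/params summary carry sizes rather than a status, so the pattern skips them.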

The ILOM OS is an embedded Linux, so the startup messages passing by may well look familiar. The ILOM login prompt looks like this:

SUNSP-1004XF510D login: 

That's the system's serial number after SUNSP-. When the ILOM boots, the default credentials (root/changeme) will be in place.

Firmware update time

Oracle has made acquiring firmware updates difficult. You'll need a service contract or something similar for access to the firmware update blobs, which come in a file named something like p19971464_2651_Generic.zip. Once the update image has been acquired, installing it is most easily done through the web interface.


Sadly, though, X4170 s/n 1004XF510D is unable to run the web server for some reason. We'll have to update the firmware by command line using TFTP instead.

Procedure:

  1. Locate the .pkg file inside the zipped distribution of the firmware update from Oracle.
  2. Place the .pkg file at the root of a TFTP server's file tree.
  3. Log in to the SP (service processor, another name for ILOM) as root and do the following...
-> cd /SP/firmware
/SP/firmware

-> load -o verbose -source tftp://172.16.0.1/ILOM-3_0_16_15_h_r93405-Sun_Fire_X4170_X4270_X4275.pkg

NOTE: An upgrade takes about 6 minutes to complete. ILOM
      will enter a special mode to load new firmware. No
      other tasks can be performed in ILOM until the
      firmware upgrade is complete and ILOM is reset.

      You can choose to postpone the server BIOS upgrade until the
      next server poweroff. If you do not do that, you should
      perform a clean shutdown of the server before continuing.

Are you sure you want to load the specified file (y/n)? y
Preserve existing configuration (y/n)? n
Delay BIOS upgrade until next server poweroff (note: host poweroff will always happen when upgrading from 2.x) (y/n)? n
Version: 3.0.16.15.h
Build: r93405
Date: Fri Oct 10 14:24:19 CST 2014
...........
Start update of: Service Processor Firmware
.............................................................................
Finish update of: Service Processor Firmware
.........
Start update of: Service Processor BIOS
........
Finish update of: Service Processor BIOS
.
Finished updates


Firmware update is complete.
ILOM will now be restarted with the new firmware.


-> /sbin/reboot
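
Steps 1 and 2 of the procedure above (finding the .pkg inside the zip and putting it at the TFTP root) can be sketched in Python with the standard zipfile module. This is only a sketch: the function names are invented, and it assumes the archive holds exactly one .pkg file, which may not be true for every Oracle download. The last helper simply formats the load command as it appears in the transcript.

```python
import os
import shutil
import zipfile


def find_pkg_members(zip_path):
    """List archive members whose names end in .pkg (case-insensitive)."""
    with zipfile.ZipFile(zip_path) as archive:
        return [n for n in archive.namelist() if n.lower().endswith(".pkg")]


def stage_pkg(zip_path, tftp_root):
    """Copy the single .pkg member into tftp_root, dropping any directory prefix."""
    members = find_pkg_members(zip_path)
    if len(members) != 1:
        raise ValueError(f"expected exactly one .pkg, found {members}")
    name = os.path.basename(members[0])
    with zipfile.ZipFile(zip_path) as archive, \
            archive.open(members[0]) as src, \
            open(os.path.join(tftp_root, name), "wb") as dst:
        shutil.copyfileobj(src, dst)
    return name


def load_command(tftp_server, pkg_name):
    """Format the ILOM CLI load command used in the transcript above."""
    return f"load -o verbose -source tftp://{tftp_server}/{pkg_name}"
```

The TFTP root directory depends on how the TFTP server is set up, so it is passed in rather than guessed.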

Something in the update process has resulted in a working HTTPS server on the ILOM. It may not have had the X.509 certificate and key it needed to start. Regardless, it's up now.