EMC branded Mellanox SX60xx switch
Used enterprise IT stuff is stupid cheap in the US. Stupid cheap like 18-port FDR (56 Gbit/s) Infiniband switches shipped to your door for 6.67 USD per port. There are a few issues with these, though:
- They run a very limited EMC OS instead of the somewhat more open Linux-based Mellanox MLNX-OS
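- No VPI (Virtual Protocol Interconnect, which lets a port run as either Infiniband or Ethernet)
- No subnet manager (something on the fabric has to run one, or the Infiniband links never become active)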
- Management by telnet ("SSH? That's so not going to happen!")
- It's not Linux
The devices at FnordNet are Mellanox SX6018s wearing EMC colors. (Mellanox ships switches with blue fronts and EMC switches are black.) On the connector side are:
- 18 QSFP 56Gbits/sec Infiniband connectors
- 2 8P8C modular 1000baseT Ethernet connectors (Labelled MGT)
- 1 USB connector
- 1 8P8C modular connector for the serial console (labelled CONSOLE)
- a small recessed reset button switch (small hole on the left side of the connector panel)
Mellanox makes a number of similar devices with the same generation of silicon:
- SX6005 and SX6012 -- 1U, half width, 12 QSFP connectors arranged in two rows of six. SX6005 is the unmanaged version. SX6012 is the managed switch with the subnet manager and single port Ethernet connectivity
- SX6015 and SX6018 -- 1U, full width, 18 QSFP connectors in one row of 18. SX6015 is unmanaged. SX6018 is the managed version with a subnet manager and dual Ethernet management ports
- SX6025 and SX6036 -- 1U, full width, 36 QSFP connectors in two rows of 18. SX6025 is unmanaged. SX6036 is the managed version.
All of these are built on the same Infiniband silicon -- Mellanox SwitchX-2. The management functions on the SX6012, SX6018, and SX6036 are done by an embedded Linux system running on an AMCC PowerPC 460EX CPU (the board shows up as M460EX in Mellanox's image names) attached to the SwitchX-2 silicon over PCI Express.
Before breaking it, a diversion into EMC's SwitchOS
At power-on, a tiny bit of hardware initialization is done and then U-Boot (a free-as-in-freedom bootloader) is given control.
Here's what is dumped to the console when power is applied and the automatic boot is interrupted:
U-Boot 2009.01 SX_PPC_M460EX SX_3.2.0330-82-EMC ppc (Feb 27 2013 - 12:13:42)
CPU: AMCC PowerPC 460EX Rev. B at 1000 MHz (PLB=166, OPB=83, EBC=83 MHz)
Security/Kasumi support
Bootstrap Option H - Boot ROM Location I2C (Addr 0x52)
Internal PCI arbiter disabled
32 kB I-Cache 32 kB D-Cache
Board: Mellanox PPC460EX Board
FDEF: No
I2C: ready
DRAM: 2 GB (ECC enabled, 333 MHz, CL3)
FLASH: 16 MB
NAND: 1024 MiB
PCI: Bus Dev VenId DevId Class Int
PCIE0: link is not up.
PCIE1: successfully set as root-complex
01 00 15b3 c738 0c06 00
Net: ppc_4xx_eth0, ppc_4xx_eth1
Hit any key to stop autoboot: 0
=>
Nothing too super exciting there, but it does tell us what we're looking at. Note that U-Boot is running and we have its CLI available to use. A little bit more info about the hardware is available using the bdinfo U-Boot command:
=> bdinfo
memstart = 0x00000000
memsize = 0x80000000
flashstart = 0xFF000000
flashsize = 0x01000000
flashoffset = 0x00000000
sramstart = 0x00000000
sramsize = 0x00000000
bootflags = 0xFFFE0218
intfreq = 1000 MHz
busfreq = 166.667 MHz
ethaddr = 00:02:C9:63:EF:EE
eth1addr = 00:02:C9:63:EF:EF
IP addr = 172.17.255.120
baudrate = 9600 bps
=>
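As a quick sanity check, the hex sizes reported by bdinfo line up with the DRAM and FLASH lines in the boot banner. A small shell sketch (to_mib is a made-up helper name, not something on the switch):

```shell
# Convert a byte count (hex or decimal) to MiB using shell arithmetic.
# to_mib is a hypothetical helper for illustration only.
to_mib() {
    echo $(( $1 / 1048576 ))
}

to_mib 0x80000000   # memsize from bdinfo
to_mib 0x01000000   # flashsize from bdinfo

# Last address of the NOR flash window: flashstart + flashsize - 1
printf '%X\n' $(( 0xFF000000 + 0x01000000 - 1 ))
```

The flash window ending at the very top of the 32-bit address space is why every flash address later in this write-up starts with FF.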
printenv will give a feel for what EMC (before Dell bought them all up) has done pre-boot software wise:
=> printenv
bootcmd=run flash_jffs2
baudrate=9600
loads_echo=
autoload=n
hostname=mlnx460ex
netdev=eth0
nfsargs=setenv bootargs root=/dev/nfs rw nfsroot=${serverip}:${rootpath}
ramargs=setenv bootargs root=/dev/ram rw
addip=setenv bootargs ${bootargs} ip=${ipaddr}:${serverip}:${gatewayip}:${netmask}:${hostname}:${netdev}:off panic=1
addtty=setenv bootargs ${bootargs} console=ttyS0,${baudrate}
addmisc=setenv bootargs ${bootargs}
initrd_high=30000000
kernel_addr_r=400000
fdt_addr_r=800000
ramdisk_addr_r=C00000
hostname=mlnx460ex
ramdisk_file=mlnx460ex/uRamdisk
rootpath=/opt/eldk/ppc_4xxFP
flash_self=run ramargs addip addtty addmisc;bootm ${kernel_addr} ${ramdisk_addr} ${fdt_addr}
flash_nfs=run nfsargs addip addtty addmisc;bootm ${kernel_addr} - ${fdt_addr}
net_nfs=tftp ${kernel_addr_r} ${bootfile}; tftp ${fdt_addr_r} ${fdt_file}; run nfsargs addip addtty addmisc;bootm ${kernel_addr_r} - ${fdt_addr_r}
net_self_load=tftp ${kernel_addr_r} ${bootfile};tftp ${fdt_addr_r} ${fdt_file};tftp ${ramdisk_addr_r} ${ramdisk_file};
net_self=run net_self_load;run ramargs addip addtty addmisc;bootm ${kernel_addr_r} ${ramdisk_addr_r} ${fdt_addr_r}
fdt_file=mlnx460ex/mlnx460ex.dtb
flash_self_old=run ramargs addip addtty addmisc;bootm ${kernel_addr} ${ramdisk_addr}
flash_nfs_old=run nfsargs addip addtty addmisc;bootm ${kernel_addr}
net_nfs_old=tftp ${kernel_addr_r} ${bootfile};run nfsargs addip addtty addmisc;bootm ${kernel_addr_r}
load=tftp 200000 mlnx460ex/u-boot.bin
update=protect off 0xFFFA0000 FFFFFFFF;era 0xFFFA0000 FFFFFFFF;cp.b ${fileaddr} 0xFFFA0000 ${filesize};setenv filesize;saveenv
upd=run load update
dhcp_vendor-class-identifier=bootmfg:hwname:mlnx460ex:
clear_filesize=setenv filesize
mfg_dir=mlnx460ex
mfg_args=setenv bootargs root=/dev/ram rw ramdisk_size=${mfg_ramdisk_size} ${mfg_extra_args}
mfg_common_args=run addtty addmisc
mfg_load=tftp ${kernel_addr_r} ${mfg_root}${mfg_dir}/${mfg_kernel_file};tftp ${fdt_addr_r} ${mfg_root}${mfg_dir}/${mfg_fdt_file};tftp ${ramdisk_addr_r} ${mfg_root}${mfg_dir}/${mfg_ramdisk_file}
mfg_nodhcp=echo "Manufacture will TFTP from directory ${mfg_root}${mfg_dir}, and boot";echo; run clear_filesize ; run mfg_load;if test 0${filesize} -gt 0; then echo Booting mfg ; run mfg_args mfg_common_args;bootm ${kernel_addr_r} ${ramdisk_addr_r} ${fdt_addr_r} ; else ; echo Failed mfg load ; fi
mfg=echo "Manufacture will DHCP, TFTP from directory ${mfg_root}${mfg_dir}, and boot";echo; dhcp; run clear_filesize ; run mfg_load;if test 0${filesize} -gt 0; then echo Booting mfg ; run mfg_args mfg_common_args;bootm ${kernel_addr_r} ${ramdisk_addr_r} ${fdt_addr_r} ; else ; echo Failed mfg load ; fi
menu_file=menu.img
menu_load=tftp ${menu_addr_r} ${mfg_root}${mfg_dir}/${menu_file}; if test $? -ne 0; then run clear_filesize ; echo Download failed ;fi
menu_usb_load_ext2=usb start; ext2load usb ${mfg_usb_dev}:${mfg_usb_part} ${menu_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${menu_file};
menu_usb_load_fat=usb start; fatload usb ${mfg_usb_dev}:${mfg_usb_part} ${menu_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${menu_file};
menu_usb_load=if test "x${mfg_usb_fstype}" = "xext2"; then run menu_usb_load_ext2 ; else ; run menu_usb_load_fat ; fi
menu_nodhcp=run clear_filesize ; run menu_load ; if test 0${filesize} -gt 0; then autoscr ${menu_addr_r}; else ; echo Failed menu load ; fi
menu=dhcp ; run clear_filesize ; run menu_load ; if test 0${filesize} -gt 0; then autoscr ${menu_addr_r}; else ; echo Failed menu load ; fi
fw_file=u-boot.bin
fw_load=tftp ${fw_addr_r} ${mfg_root}${mfg_dir}/${fw_file}; if test $? -ne 0; then run clear_filesize ; echo Download failed ;fi
fw_usb_load_ext2=usb start; ext2load usb ${mfg_usb_dev}:${mfg_usb_part} ${fw_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${fw_file};
fw_usb_load_fat=usb start; fatload usb ${mfg_usb_dev}:${mfg_usb_part} ${fw_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${fw_file};
fw_usb_load=if test "x${mfg_usb_fstype}" = "xext2"; then run fw_usb_load_ext2 ; else ; run fw_usb_load_fat ; fi
fw_update_raw=protect off 0xFFFA0000 FFFFFFFF;erase 0xFFFA0000 FFFFFFFF;cp.b ${fw_addr_r} 0xFFFA0000 ${filesize};cmp.b ${fw_addr_r} 0xFFFA0000 ${filesize};setenv filesize; saveenv
fw_usb_update=run clear_filesize ; run fw_usb_load ; if test 0${filesize} -gt 0; then run fw_update_raw ; else ; echo Failed update load ; fi
fw_update_nodhcp=run clear_filesize ; run fw_load ; if test 0${filesize} -gt 0; then run fw_update_raw ; else ; echo Failed update load ; fi
fw_update=dhcp ; run clear_filesize ; run fw_load ; if test 0${filesize} -gt 0; then run fw_update_raw ; else ; echo Failed update load ; fi
boot_common_args=run addtty addmisc
mfg_usb_dir=mlnx460ex
mfg_usb_load_ext2=usb start; echo "Loading ${mfg_kernel_file}";ext2load usb ${mfg_usb_dev}:${mfg_usb_part} ${kernel_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${mfg_kernel_file};echo "Loading ${mfg_fdt_file}"; ext2load usb ${mfg_usb_dev}:${mfg_usb_part} ${fdt_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${mfg_fdt_file};echo "Loading ${mfg_ramdisk_file}"; ext2load usb ${mfg_usb_dev}:${mfg_usb_part} ${ramdisk_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${mfg_ramdisk_file}
mfg_usb_load_fat=usb start; echo "Loading ${mfg_kernel_file}";fatload usb ${mfg_usb_dev}:${mfg_usb_part} ${kernel_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${mfg_kernel_file};echo "Loading ${mfg_fdt_file}"; fatload usb ${mfg_usb_dev}:${mfg_usb_part} ${fdt_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${mfg_fdt_file};echo "Loading ${mfg_ramdisk_file}"; fatload usb ${mfg_usb_dev}:${mfg_usb_part} ${ramdisk_addr_r} ${mfg_usb_root}${mfg_usb_dir}/${mfg_ramdisk_file}
mfg_usb_load=if test "x${mfg_usb_fstype}" = "xext2"; then run mfg_usb_load_ext2 ; else ; run mfg_usb_load_fat ; fi
mfg_usb=echo "Manufacture will load from USB directory ${mfg_usb_root}${mfg_usb_dir}, and boot"; echo; run clear_filesize ; run mfg_usb_load; if test 0${filesize} -gt 0; then echo Booting mfg ; run mfg_args mfg_common_args;bootm ${kernel_addr_r} ${ramdisk_addr_r} ${fdt_addr_r} ; else ; echo Failed mfg load ; fi
kernel_addr=ff000000
fdt_addr=ff1e0000
ramdisk_addr=ff200000
fw_addr_r=400000
menu_addr_r=B00000
pciconfighost=1
pcie_mode=RP:RP
autoload=no
rootdev=/dev/mtdblock6
boot_usb_ext2_loc_1=run usb_args_loc_1 boot_common_args;echo "Loading ${boot_kernel_file}";ext2load usb ${boot_usb_dev}:${boot_usb_part_loc_1} ${kernel_addr_r} ${boot_usb_root}${boot_usb_dir}/${boot_kernel_file};echo "Loading ${boot_fdt_file}";ext2load usb ${boot_usb_dev}:${boot_usb_part_loc_1} ${fdt_addr_r} ${boot_usb_root}${boot_usb_dir}/${boot_fdt_file};bootm ${kernel_addr_r} - ${fdt_addr_r}
boot_usb_ext2_loc_2=run usb_args_loc_2 boot_common_args;echo "Loading ${boot_kernel_file}";ext2load usb ${boot_usb_dev}:${boot_usb_part_loc_2} ${kernel_addr_r} ${boot_usb_root}${boot_usb_dir}/${boot_kernel_file};echo "Loading ${boot_fdt_file}";ext2load usb ${boot_usb_dev}:${boot_usb_part_loc_2} ${fdt_addr_r} ${boot_usb_root}${boot_usb_dir}/${boot_fdt_file};bootm ${kernel_addr_r} - ${fdt_addr_r}
boot_usb_fat_loc_1=run usb_args_loc_1 boot_common_args;echo "Loading ${boot_kernel_file}";fatload usb ${boot_usb_dev}:${boot_usb_part_loc_1} ${kernel_addr_r} ${boot_usb_root}${boot_usb_dir}/${boot_kernel_file};echo "Loading ${boot_fdt_file}";fatload usb ${boot_usb_dev}:${boot_usb_part_loc_1} ${fdt_addr_r} ${boot_usb_root}${boot_usb_dir}/${boot_fdt_file};bootm ${kernel_addr_r} - ${fdt_addr_r}
boot_usb_fat_loc_2=run usb_args_loc_2 boot_common_args;echo "Loading ${boot_kernel_file}";fatload usb ${boot_usb_dev}:${boot_usb_part_loc_2} ${kernel_addr_r} ${boot_usb_root}${boot_usb_dir}/${boot_kernel_file};echo "Loading ${boot_fdt_file}";fatload usb ${boot_usb_dev}:${boot_usb_part_loc_2} ${fdt_addr_r} ${boot_usb_root}${boot_usb_dir}/${boot_fdt_file};bootm ${kernel_addr_r} - ${fdt_addr_r}
mfg_kernel_file=vmlinuz
mfg_ramdisk_file=rootfs
mfg_ramdisk_size=180224
mfg_fdt_file=fdt
mfg_usb_dev=0
mfg_usb_part=1
mfg_usb_fstype=fat
mfg_usb_root=/
boot_kernel_file=vmlinuz
boot_fdt_file=fdt
boot_usb_dev=0
boot_usb_part_loc_1=2
boot_usb_part_loc_2=3
boot_usb_root_loc_1=/dev/sda5
boot_usb_root_loc_2=/dev/sda6
usb_args_loc_1=setenv bootargs root=${boot_usb_root_loc_1} ro reset_button=${reset_button} rootdelay=8 ${image_kernel_args} ${extra_args}
usb_args_loc_2=setenv bootargs root=${boot_usb_root_loc_2} ro reset_button=${reset_button} rootdelay=8 ${image_kernel_args} ${extra_args}
jffs2_args=setenv bootargs root=${rootdev} rootfstype=jffs2 ro reset_button=${reset_button} ${image_kernel_args} ${extra_args}
mfg_extra_args=ramdisk=65536
ethaddr=00:02:C9:63:EF:EE
eth1addr=00:02:C9:63:EF:EF
ethact=ppc_4xx_eth0
bootfile=pxelinux.0
emc_fabric=B
autoscr=no
filesize=175B
fileaddr=400000
gatewayip=10.10.4.1
netmask=255.255.252.0
ipaddr=172.17.255.120
serverip=172.17.255.252
script_rev=3.3.0
script_date=7.12.13
autostart=yes
bootdelay=5
emcram_addr=400000
emcfl_one_start=ff400000
emcfl_one_end=ff5fffff
emcfl_two_start=ff600000
emcfl_two_end=ff7fffff
emcload_addr=ff400000
load_dir=.
binary_name=ibsw.bin.load
restore_defaults=setenv load_dir .; setenv binary_name ibsw.bin.load; saveenv;
setip_176=setenv ipaddr 192.168.176.190; setenv serverip 192.168.176.253; saveenv
setip_177=setenv ipaddr 192.168.177.191; setenv serverip 192.168.177.253; saveenv
setip_linux=setenv ipaddr 192.168.10.10; setenv serverip 192.168.10.1; saveenv
setip_16=setenv ipaddr 172.16.255.120; setenv serverip 172.16.255.252; saveenv
setip_17=setenv ipaddr 172.17.255.120; setenv serverip 172.17.255.252; saveenv
set_fab_a=setenv emc_fabric A; saveenv
set_fab_b=setenv emc_fabric B; saveenv
emcpxe=tftp ${emcram_addr} ${load_dir}/${binary_name}
emcflash=ping $serverip; bootm ${emcload_addr}
mlxlinux=run jffs2_args boot_common_args;bootm ${kernel_addr} - ${fdt_addr}
flash_jffs2=run emcflash
menu_usb=run emcflash
emcburn_one=prot off ${emcfl_one_start} ${emcfl_one_end}; erase ${emcfl_one_start} ${emcfl_one_end}; setenv autostart no; tftp ${emcram_addr} ${load_dir}/${binary_name}; iminfo ${emcram_addr}; cp.b ${emcram_addr} ${emcfl_one_start} ${filesize}; iminfo ${emcfl_one_start}; setenv autostart yes; prot on ${emcfl_one_start} ${emcfl_one_end};
emcburn_two=prot off ${emcfl_two_start} ${emcfl_two_end}; erase ${emcfl_two_start} ${emcfl_two_end}; setenv autostart no; tftp ${emcram_addr} ${load_dir}/${binary_name}; iminfo ${emcram_addr}; cp.b ${emcram_addr} ${emcfl_two_start} ${filesize}; iminfo ${emcfl_two_start}; setenv autostart yes; prot on ${emcfl_two_start} ${emcfl_two_end};
emcload_one=setenv emcload_addr ${emcfl_one_start}; saveenv
emcload_two=setenv emcload_addr ${emcfl_two_start}; saveenv
script_get=setenv autostart no; tftp ${emcram_addr} uboot_start_script.img; setenv autostart yes; imi ${emcram_addr};
script_exe=autoscr ${emcram_addr}
boot_emcpxe=setenv flash_jffs2 run emcpxe; setenv menu_usb run emcpxe; saveenv
boot_emcflash=setenv flash_jffs2 run emcflash; setenv menu_usb run emcflash; saveenv
boot_mlxlinux=setenv flash_jffs2 run mlxlinux; setenv menu_usb run mlxlinux; saveenv
emchelp=echo All commands are preceeded by \"run \"; echo setip_176 ...... set ip and server ip to 192.176 subnet; echo setip_177 ...... set ip and server ip to 192.177 subnet; echo setip_16 ....... set ip and server ip to 172.16 subnet; echo setip_17 ....... set ip and server ip to 172.17 subnet;echo setip_linux .... set ip and server ip to linux default (10.10); echo emcburn_one .... burn ibsw.bin.load to the first symmk flash location;echo emcburn_two .... burn ibsw.bin.load to the second symmk flash location;echo emcload_one .... set emcflash to point to the first symmk flash location;echo emcload_two .... set emcflash to point to the second symmk flash location;echo script_get ..... Load a new version of the uboot_start_script.img;echo script_exe ..... Execute the script loaded by script_get (check CRC first!);echo script_status .. Dump revision and build date of last run script; echo boot_emcpxe .... Set PXE boot to be the default boot;echo emcpxe ......... Run PXE boot (just this time); echo boot_emcflash .. Set the Symmk in flash to be the default boot;echo emcflash ....... Boot (just this time) from flash;echo boot_mlxlinux .. Set the MLX linux version to be the default boot;echo mlxlinux ....... Run MLX linux (just this time);echo set_fab_a ...... Set switch to be fabric A; echo set_fab_b ...... Set switch to be fabric B
script_status=echo Last Script Executed; echo Script Revision: 3.3.0; echo Script Built on: 7.12.13;
stdin=serial
stdout=serial
stderr=serial
reset_button=0
ver=U-Boot 2009.01 SX_PPC_M460EX SX_3.2.0330-82-EMC ppc (Feb 27 2013 - 12:13:42)
Environment size: 12502/16380 bytes
=>
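The environment is easier to reason about once the chain of "run" indirections behind bootcmd is untangled. Here is a rough shell sketch that follows that chain through a handful of variables copied from the dump. The helper names get_var and follow_run are made up for this example, and it only understands a bare "run name" value, not the compound commands U-Boot also accepts:

```shell
# A few variables copied from the printenv dump, one per line.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
bootcmd=run flash_jffs2
flash_jffs2=run emcflash
emcflash=ping $serverip; bootm ${emcload_addr}
mlxlinux=run jffs2_args boot_common_args;bootm ${kernel_addr} - ${fdt_addr}
emcload_addr=ff400000
EOF

# Print the value of variable $1 from file $2 (hypothetical helper).
get_var() {
    sed -n "s/^$1=//p" "$2" | head -n 1
}

# Follow "run <name>" indirections starting at variable $1 and print the
# name of the first variable whose value is more than a bare "run <name>".
follow_run() {
    name=$1
    hops=0
    while [ "$hops" -lt 10 ]; do
        value=$(get_var "$name" "$2")
        case $value in
            "run "*[!A-Za-z0-9_]*) break ;;   # compound command: stop here
            "run "?*) name=${value#run } ;;   # bare "run <name>": follow it
            *) break ;;
        esac
        hops=$((hops + 1))
    done
    printf '%s\n' "$name"
}

follow_run bootcmd "$env_file"
get_var "$(follow_run bootcmd "$env_file")" "$env_file"
```

Read this way, the default boot ends up in emcflash: ping the configured server, then bootm whatever sits at emcload_addr (ff400000). The stock Mellanox path is still there as mlxlinux, it just isn't the default.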
All that was just the bootloader. Where's SwitchOS?
So, yeah. Last section is poorly named. Remember how the autoboot process was stopped last time? Well, here's what we get without interrupting it...
[...]
Hit any key to stop autoboot: 0
Waiting for PHY auto negotiation to complete... done
ENET Speed is 1000 Mbps - FULL duplex connection (EMAC0)
Using ppc_4xx_eth0 device
ping failed; host 172.17.255.252 is not alive
Loading Kernel Image ... OK
ABCDEFdrv_table_install: syslog at 0x00033d84 with minor 0 (DRV_SETUP) (DRV_INIT)
drv_table_install: isrlog at 0x00033d84 with minor 1 (DRV_INIT)
drv_table_install: userinterface at 0x00004db0 with minor 0 (DRV_SETUP) (DRV_INIT)
drv_table_install: stty0 at 0x0006e804 with minor 0 (DRV_SETUP) (DRV_INIT)
drv_table_install: eth0 at 0x0006f194 with minor 0 (DRV_SETUP) (DRV_INIT)
drv_table_install: i2c0 at 0x00079d94 with minor 0 (DRV_SETUP) (DRV_INIT)
drv_table_install: i2c1 at 0x00079d94 with minor 1 (DRV_INIT)
drv_table_install: itcpip at 0x0003c05c with minor 0tcpipInit: Starting internal TCP/IP stack. (DRV_SETUP) (DRV_INIT)
kernel_main: Drivers installed, installing INIT process with stack size = 8192.
Bar [ 0 ] is the first memory bar
sk_init_main: Starting process based initialization - 8590.
02:60:48:10:ff:78
UDP socket 3 created
TCP socket 4 created
sk_init_main: Process based initialization complete - 8590.
sk_init_main: Installing task table.
task_table_install: console at 0x00015c24 stack 0x00452410/26624 : 4
task_table_install: inetd at 0x000359a0 stack 0x00470c10/8192 : 5
task_table_install: poll_cqs at 0x000aa44c stack 0x00458c10/8192 : 6
task_table_install: poll_ports at 0x000aa584 stack 0x0045ac10/8192 : 7
task_table_install: env_mon at 0x00082f94 stack 0x0045ec10/8192 : 8
task_table_install: env_bin_api at 0x000143a0 stack 0x00460c10/8192 : 9
task_table_install: incoming_fw_files at 0x00013e84 stack 0x00462c10/8192 : 10
sk_init_main: Task table installed. Starting tasks and exiting.
----------------------------- Board Info -----------------------------
* Chasis Type : STINGRAY
* Number of Ports : 18
* U-Boot Revision : 3.2.330
* Firmware Revision : 9.9.1030
* INI file Revision : 0x21010009
* SwitchOS Revision : 1.282
* SwitchOS Build Date : 2013-07-23
* SwitchOS Build Time : 15:40:30
* SwitchOS Build Path : /emc/tdowning/ppc460_release/july_23_2013
----------------------------------------------------------------------
15:33:09 12/16/2017
Switch-B(4)>
It almost looks like it's going to netboot if it can find a server to give it an image. But we don't have one and that's totally OK. The Switch-B(4)> is the CLI prompt. There's a help command that lists available commands, but it does not provide a lot of detail. Here are a few potentially useful commands:
- env show inventory -- lists out vital product data (VPD) for the replaceable components: switch chassis, power supplies, and fan boards
- env show fans -- how fast are they spinning? 0 is probably not good
- env set uid led blink -- will blink the blue LED on the left side on the switch's connector panel (useful for finding the switch you want to work on)
- env set port bad led on -- much the same, but activates an orange LED just above the blue UID indicator
- baz swportinfo 1 -- will provide port statistics and the like. Change 1 to the number of the switch port of interest
- build -- info about the software build
- board -- a little bit about the hardware
- info -- a bit about the hardware and SwitchOS
- env show alarms -- pretty much self explanatory
Putting MLNX-OS on it
As mentioned above, I desire to have a real OS on these switches. Found some leads over on https://forums.servethehome.com/ which said conversion is possible, but "PM me for more details." The following is a translation of the nice gentleman's "Mellanox Switch Conversion Guide" document into the procedure I have used.
Pre-requisiteses
(Yes. I know I misspelled that.)
- console cable for the switch (Cisco cables work fine); 9600 bps 8N1 works here
- serial comms program (C-Kermit and I get along just fine. Others may use screen. Or minicom. Or Teraterm. Or HyperTerminal. Or for people with long memories, even {COMMO} or Telix or Qmodem.)
- Willingness to root around typing stuff into a text console and fiddle with unsupported software
- A MLNX-OS binary distribution for the switch in question. Try:
- a hex editor (we'll be picking apart some binary archives at some point)
- some time
- some willingness to violate software licenses
- attention to detail: breaking the U-Boot image inside the switch is likely to result in a need for I2C tools to make things work again. This is not something I want to deal with.
Make switch not boot automatically
Apply power, interrupt autoboot, run
=> setenv autostart off
=> setenv autoload off
=> saveenv
We'll revert that by running
=> setenv autostart yes
=> setenv autoload n
=> saveenv
at some point later on. (Those are the settings as the switch arrived.)
Get bootable bits from MLNX-OS
The image-PPC_M460EX-3.6.1002.img OS images are actually ZIP archives containing a compressed tar file, image-PPC_M460EX-ppc-m460ex-20160609-202426.tgz, and some package meta-data that we do not need.
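Before unpacking, it is easy to confirm that a .img file really is a ZIP archive by looking at its first four bytes. The sketch below is only an illustration: is_zip is a made-up helper, and the unzip call in the comment assumes Info-ZIP's unzip is installed.

```shell
# Succeed if file $1 starts with the ZIP local-file-header signature
# "PK\003\004" (bytes 50 4b 03 04). is_zip is a hypothetical helper.
is_zip() {
    sig=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "504b0304" ]
}

# Possible usage against the downloaded image (assumes unzip is available):
#   is_zip image-PPC_M460EX-3.6.1002.img && unzip image-PPC_M460EX-3.6.1002.img
```

unzip does not care about the file extension, so the .img can be fed to it directly.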
From image-PPC_M460EX-ppc-m460ex-20160609-202426.tgz, extract the kernel image and device tree files:
$ gzip -dc image-PPC_M460EX-ppc-m460ex-20160609-202426.tgz | tar xvvf - ./boot/vmlinuz-uni ./boot/fdt-uni
And copy boot/vmlinuz-uni and boot/fdt-uni to a TFTP server that the switch will be able to retrieve these files from. Since my switch has an IP address of 172.17.255.120 as it was delivered, I'm simply adding a secondary IP address to the existing TFTP server instead of changing the switch's IP address.
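For the secondary-address trick to work, the address added to the TFTP server has to sit in the same subnet as the switch. A small sketch of that check, using the switch's ipaddr and netmask from the U-Boot environment and the 172.17.255.118 server address that appears in the transfer log further down (ip_to_int and same_subnet are illustrative helper names):

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if addresses $1 and $2 share a network under netmask $3.
same_subnet() {
    mask=$(ip_to_int "$3")
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

same_subnet 172.17.255.120 172.17.255.118 255.255.252.0 \
    && echo "reachable without a gateway"

# Adding the secondary address on a Linux TFTP server might look like this
# (eth0 and the /22 prefix are assumptions; needs root):
#   ip addr add 172.17.255.118/22 dev eth0
```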
Let's run Linux!
"Linux" in this section is just the kernel, not an OS -- it will panic the first time through and really not do anything useful.
On these SX6018 switches, the on-board flash has areas for 2 OS images. We're going to overwrite the first one (using U-Boot to TFTP and then flash it) and then boot it. The kernel will not have a root filesystem yet and it will not do anything useful. But it's a step toward having a working MLNX-OS.
We need both the kernel and the matching device tree file from the MLNX-OS distro. The strategy is "use TFTP to transfer kernel image to RAM, save it to flash; use TFTP to transfer the device tree, save it to flash." Here we go:
=> tftp 400000 172.17.255.118:MLNX-OS_PPC_M460EX_3.6.1002/boot/vmlinuz-uni
Using ppc_4xx_eth0 device
TFTP from server 172.17.255.118; our IP address is 172.17.255.120
Filename 'MLNX-OS_PPC_M460EX_3.6.1002/boot/vmlinuz-uni'.
Load address: 0x400000
Loading: #################################################################
#################################################################
##
done
Bytes transferred = 1927366 (1d68c6 hex)
=> protect off FF000000 FF1FFFFF
................ done
Un-Protected 16 sectors
=> erase FF000000 FF1FFFFF
................ done
Erased 16 sectors
=> cp.b ${fileaddr} FF000000 ${filesize}
Copy to Flash... done
=> tftp 800000 172.17.255.118:MLNX-OS_PPC_M460EX_3.6.1002/boot/fdt-uni
Using ppc_4xx_eth0 device
TFTP from server 172.17.255.118; our IP address is 172.17.255.120
Filename 'MLNX-OS_PPC_M460EX_3.6.1002/boot/fdt-uni'.
Load address: 0x800000
Loading: #
done
Bytes transferred = 10017 (2721 hex)
=> cp.b ${fileaddr} FF1E0000 ${filesize}
Copy to Flash... done
=> protect on FF000000 FF1FFFFF
................ done
Protected 16 sectors
=> imls
Legacy Image at FF000000:
Verifying Checksum ... OK
Legacy Image at FF400000:
Verifying Checksum ... OK
Legacy Image at FF600000:
Verifying Checksum ... OK
=>
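It is worth checking that each file fits in its flash region before copying, since a kernel that spills past FF1E0000 would be clobbered by the device tree written right after it. A sketch of that arithmetic, assuming the kernel region runs from kernel_addr (ff000000) up to fdt_addr (ff1e0000) and the device tree gets the rest of the erased range (fits is a made-up helper):

```shell
# Succeed if a file of $1 bytes fits in a flash region of $2 bytes.
fits() {
    [ $(( $1 )) -le $(( $2 )) ]
}

kernel_region=$(( 0xFF1E0000 - 0xFF000000 ))    # fdt_addr - kernel_addr
fdt_region=$(( 0xFF1FFFFF - 0xFF1E0000 + 1 ))   # up to the end of the erased range

# Sizes are the "Bytes transferred" values from the TFTP output.
fits 0x1d68c6 "$kernel_region" && echo "kernel fits"
fits 0x2721 "$fdt_region" && echo "fdt fits"
```

With this image the kernel leaves under 40 KiB of headroom in its region, so a larger kernel from a different MLNX-OS release may not fit without moving things around.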
File transfers have been finished. The Linux kernel and device tree files have been saved to flash. U-Boot has found the newly transferred image and thinks it's valid (this is what the imls was about). Let us see what happens when it is run...
=> run mlxlinux
Using mlnx460ex machine description
Linux version 3.10.27-MELLANOXuni-m460ex (@) (gcc version 4.7.2 (GCC) ) PPC_M460EX 3.6.1002 #1 2016-06-09 20:24:26
Zone ranges:
DMA [mem 0x00000000-0x2fffffff]
Normal empty
HighMem [mem 0x30000000-0x7fffffff]
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x00000000-0x7fffffff]
MMU: Allocated 1088 bytes of context maps for 255 contexts
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 522752
Kernel command line: root=/dev/mtdblock6 rootfstype=jffs2 ro reset_button=0 console=ttyS0,9600
PID hash table entries: 4096 (order: 2, 16384 bytes)
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Sorting __ex_table...
Memory: 2075976k/2097152k available (3556k kernel code, 21176k reserved, 196k data, 109k bss, 148k init)
Kernel virtual memory layout:
* 0xfffcf000..0xfffff000 : fixmap
* 0xffc00000..0xffe00000 : highmem PTEs
* 0xffa00000..0xffc00000 : consistent mem
* 0xffa00000..0xffa00000 : early ioremap
* 0xf1000000..0xffa00000 : vmalloc & ioremap
SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Preemptible hierarchical RCU implementation.
NR_IRQS:512 nr_irqs:512 16
UIC0 (32 IRQ sources) at DCR 0xc0
UIC1 (32 IRQ sources) at DCR 0xd0
UIC2 (32 IRQ sources) at DCR 0xe0
UIC3 (32 IRQ sources) at DCR 0xf0
clocksource: timebase mult[800000] shift[23] registered
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
devtmpfs: initialized
NET: Registered protocol family 16
256k L2-cache enabled
PCIE0: Checking link...
PCIE0: No device detected.
PCI host bridge /plb/pciex@d00000000 (primary) ranges:
MEM 0x0000000e00000000..0x0000000e7fffffff -> 0x0000000080000000
MEM 0x0000000f00000000..0x0000000f000fffff -> 0x0000000000000000
IO 0x0000000f80000000..0x0000000f8000ffff -> 0x0000000000000000
4xx PCI DMA offset set to 0x00000000
4xx PCI DMA window base to 0x0000000000000000
DMA window size 0x0000000080000000
PCIE0: successfully set as root-complex
PCIE1: Checking link...
PCIE1: Device detected, waiting for link...
PCIE1: link is up !
PCI host bridge /plb/pciex@d20000000 (primary) ranges:
MEM 0x0000000e80000000..0x0000000effffffff -> 0x0000000080000000
MEM 0x0000000f00100000..0x0000000f001fffff -> 0x0000000000000000
IO 0x0000000f80010000..0x0000000f8001ffff -> 0x0000000000000000
4xx PCI DMA offset set to 0x00000000
4xx PCI DMA window base to 0x0000000000000000
DMA window size 0x0000000080000000
PCIE1: successfully set as root-complex
PCI: Probing PCI hardware
PCI host bridge to bus 0000:40
pci_bus 0000:40: root bus resource [io 0xfffe0000-0xfffeffff] (bus address [0x0000-0xffff])
pci_bus 0000:40: root bus resource [mem 0xe00000000-0xe7fffffff] (bus address [0x80000000-0xffffffff])
pci_bus 0000:40: root bus resource [mem 0xf00000000-0xf000fffff] (bus address [0x00000000-0x000fffff])
pci_bus 0000:40: root bus resource [bus 40-ff]
PCI: Hiding 4xx host bridge resources 0000:40:00.0
pci 0000:40:00.0: PCI bridge to [bus 41-7f]
PCI host bridge to bus 0001:80
pci_bus 0001:80: root bus resource [io 0x0000-0xffff]
pci_bus 0001:80: root bus resource [mem 0xe80000000-0xeffffffff] (bus address [0x80000000-0xffffffff])
pci_bus 0001:80: root bus resource [mem 0xf00100000-0xf001fffff] (bus address [0x00000000-0x000fffff])
pci_bus 0001:80: root bus resource [bus 80-ff]
PCI: Hiding 4xx host bridge resources 0001:80:00.0
pci 0001:80:00.0: PCI bridge to [bus 81-bf]
pci 0000:40:00.0: disabling bridge window [io 0x0000-0xffffffffffffffff] to [bus 41-7f] (unused)
pci 0000:40:00.0: disabling bridge window [mem 0x00000000-0xffffffffffffffff pref] to [bus 41-7f] (unused)
pci 0000:40:00.0: disabling bridge window [mem 0x00000000-0xffffffffffffffff] to [bus 41-7f] (unused)
pci 0001:80:00.0: disabling bridge window [io 0x0000-0xffffffffffffffff] to [bus 81-bf] (unused)
pci 0000:40:00.0: PCI bridge to [bus 41-7f]
pci 0001:80:00.0: BAR 8: assigned [mem 0xe80000000-0xe800fffff]
pci 0001:81:00.0: BAR 0: assigned [mem 0xe80000000-0xe800fffff 64bit]
pci 0001:80:00.0: PCI bridge to [bus 81-bf]
pci 0001:80:00.0: bridge window [mem 0xe80000000-0xe800fffff]
bio: create slab <bio-0> at 0
Switching to clocksource timebase
NET: Registered protocol family 2
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP: reno registered
UDP hash table entries: 512 (order: 1, 8192 bytes)
UDP-Lite hash table entries: 512 (order: 1, 8192 bytes)
Configuring USB GPIO 16 + 19
bounce pool size: 64 pages
jffs2: version 2.2. (NAND) (SUMMARY) © 2001-2006 Red Hat, Inc.
msgmni has been set to 1494
alg: No test for stdrng (krng)
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250.0: ttyS0 at MMIO 0x4ef600300 (irq = 20) is a U6_16550A
console [ttyS0] enabled
4ef600300.serial: ttyS0 at MMIO 0x4ef600300 (irq = 20) is a 16550
brd: module loaded
4ff000000.nor_flash: Found 1 x16 devices at 0x0 in 16-bit bank. Manufacturer ID 0x000089 Chip ID 0x00881e
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Using buffer write method
Using auto-unlock on power-up/resume
cfi_cmdset_0001: Erase suspend on write enabled
6 ofpart partitions found on MTD device 4ff000000.nor_flash
Creating 6 MTD partitions on "4ff000000.nor_flash":
0x000000000000-0x0000001e0000 : "KERNEL_1"
0x0000001e0000-0x000000200000 : "FDT_1"
0x000000200000-0x0000003e0000 : "KERNEL_2"
0x0000003e0000-0x000000400000 : "FDT_2"
0x000000f80000-0x000000fa0000 : "UBOOTENV"
0x000000fa0000-0x000001000000 : "UBOOT"
NAND device: Manufacturer ID: 0xec, Chip ID: 0xd3 (Samsung NAND 1GiB 3,3V 8-bit), 1024MiB, page size: 2048, OOB size: 64
2 NAND chips detected
Scanning device for bad blocks
Bad eraseblock 898 at 0x000007040000
Bad eraseblock 11548 at 0x00005a380000
Bad eraseblock 14499 at 0x000071460000
4 ofpart partitions found on MTD device 4e0000000.ndfc.nand
Creating 4 MTD partitions on "4e0000000.ndfc.nand":
0x000000000000-0x000020000000 : "ROOT_1"
0x000020000000-0x000040000000 : "ROOT_2"
0x000040000000-0x000046400000 : "CONFIG"
0x000046400000-0x00007c000000 : "VAR"
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
PPC 4xx OCP EMAC driver, version 3.54
MAL v2 /plb/mcmal, 2 TX channels, 16 RX channels
ZMII /plb/opb/emac-zmii@ef600d00 initialized
RGMII /plb/opb/emac-rgmii@ef601500 initialized with MDIO support
TAH /plb/opb/emac-tah@ef601350 initialized
TAH /plb/opb/emac-tah@ef601450 initialized
/plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
eth0: EMAC-0 /plb/opb/ethernet@ef600e00, MAC 00:02:c9:63:ef:ee
eth0: found Generic MII PHY (0x00)
/plb/opb/emac-rgmii@ef601500: input 1 in RGMII mode
eth1: EMAC-1 /plb/opb/ethernet@ef600f00, MAC 00:02:c9:63:ef:ef
eth1: found Generic MII PHY (0x01)
i2c /dev entries driver
ibm-iic 4ef600700.i2c: using standard (100 kHz) mode
rtc-ds1307 0-0068: SET TIME!
rtc-ds1307 0-0068: rtc core: registered ds1338 as rtc0
rtc-ds1307 0-0068: 56 bytes nvram
ibm-iic 4ef600800.i2c: using standard (100 kHz) mode
TCP: cubic registered
rtc-ds1307 0-0068: setting system clock to 2017-12-17 15:10:43 UTC (1513523443)
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000000: 0xaa55 instead
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000004: 0xaa55 instead
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000008: 0xaa55 instead
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x0000000c: 0xaa55 instead
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000010: 0xaa55 instead
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000014: 0xaa55 instead
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000018: 0xaa55 instead
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x0000001c: 0xaa55 instead
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000020: 0xaa55 instead
jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000024: 0xaa55 instead
jffs2: Further such events for this erase block will not be printed
jffs2: Cowardly refusing to erase blocks on filesystem with no valid JFFS2 nodes
jffs2: empty_blocks 4094, bad_blocks 1, c->nr_blocks 4096
VFS: Cannot open root device "mtdblock6" or unknown-block(31,6): error -5
Please append a correct "root=" boot option; here are the available partitions:
1f00 1920 mtdblock0 (driver?)
1f01 128 mtdblock1 (driver?)
1f02 1920 mtdblock2 (driver?)
1f03 128 mtdblock3 (driver?)
1f04 128 mtdblock4 (driver?)
1f05 384 mtdblock5 (driver?)
1f06 524288 mtdblock6 (driver?)
1f07 524288 mtdblock7 (driver?)
1f08 102400 mtdblock8 (driver?)
1f09 880640 mtdblock9 (driver?)
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(31,6)
CPU: 0 PID: 1 Comm: swapper Not tainted 3.10.27-MELLANOXuni-m460ex PPC_M460EX
Call Trace(ef840000): name=swapper, state=0
[ef845e00] [c0005614] show_stack+0x54/0x150 (unreliable)
[ef845e40] [c02758d8] panic+0xcc/0x204
[ef845e90] [c0354d00] mount_block_root+0x274/0x284
[ef845ee0] [c0355048] prepare_namespace+0x1a0/0x1e8
[ef845ef0] [c0354930] kernel_init_freeable+0x1b0/0x1c4
[ef845f30] [c0001ac0] kernel_init+0x18/0xf8
[ef845f40] [c000afcc] ret_from_kernel_thread+0x5c/0x64
Rebooting in 180 seconds..
After a 3 minute delay, the switch resets (very noticeably -- the fans spin up to max RPM briefly).
Some interesting tidbits gleaned from this:
- It's Linux 3.10.27 plus Mellanox modifications
- 2Gibytes of RAM on board
- U-Boot eventually passed
root=/dev/mtdblock6 rootfstype=jffs2 ro reset_button=0 console=ttyS0,9600
as kernel command line arguments
- There's a real time clock which seems to be functioning
- And finally, Linux doesn't do much if it doesn't have a root filesystem -- if it can't start the first user-land process (init/upstart/systemd/whatever) it will panic.
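The partition ranges the kernel printed while creating the MTD devices can be turned into sizes with a little shell arithmetic. This is only a sketch: part_mib is a made-up helper name, and the offsets are copied from the boot log above.

```shell
#!/bin/sh
# part_mib START END: size in MiB of a partition, given the hex start and end
# offsets from the "Creating N MTD partitions" lines in the boot log.
# (part_mib is a hypothetical helper, not something shipped on the switch.)
part_mib() {
    echo $(( ($2 - $1) / 1048576 ))
}

part_mib 0x000000000000 0x000020000000   # ROOT_1 -> 512
part_mib 0x000020000000 0x000040000000   # ROOT_2 -> 512
part_mib 0x000040000000 0x000046400000   # CONFIG -> 100
part_mib 0x000046400000 0x00007c000000   # VAR    -> 860
```

Those numbers line up with the 524288, 524288, 102400, and 880640 Kibyte block counts the kernel listed for mtdblock6 through mtdblock9 in the panic message.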
User-mode stuffs (making a switch bootstrap OS taking bits from MLNX-OS)
Above we've noted that the kernel is being told to look for a JFFS2 image. We'll need to build one now. The JFFS2 creation utility, mkfs.jffs2
takes a directory containing the desired files as input and creates a JFFS2 image as output. (Much like creating an ISO9660 image.) On Debian, mkfs.jffs2
is contained in the mtd-utils
package.
Remember that MLNX-OS image we downloaded from Mellanox's web site? And how we pulled a kernel and flattened device tree from it? Well now we're going to build something for the kernel to run.
Bits of shell script here documenting the contents of the JFFS2 directory
# TAKE NOTE:
# $jffs2root is the destination directory -- the jffs2 image file will be created from its contents
# $mlnxosroot is the unpacked MLNX-OS distribution -- some files from here will be copied into ${jffs2root}
#
# Some of these items will need to be run with elevated privileges -- make use of sudo as needed.

cd ${jffs2root}
mkdir -p -m 775 bin dev etc/init.d etc/rc.d lib sbin usr var/run var/lib
mkdir -p -m 0 proc sys
mkdir -p -m 1777 tmp var/tmp
mkdir -p -m 700 root
cd usr
ln -s ../../bin ./bin # makes scp and sftp work a little easier
cd ..
cp -p ${mlnxosroot}/sbin/busybox bin/

# busybox decides what it should act as based on its argv[0], so all these symlinks to busybox let us
# use a single binary to do many different things
cd ${jffs2root}/bin
for bb_applet in \
  [ [[ arp arping ash awk base64 basename bbconfig blkid blockdev brctl \
  cal cat chgrp chmod chown chroot chrt chvt cksum clear cmp comm cp cttyhack cut \
  date dc dd deallocvt depmod devmem df diff dirname dmesg dnsdomainname dos2unix du dumpkmap \
  echo ed egrep eject env expand expr \
  false fbsplash fdformat fdisk fgconsole fgrep find findfs flock fold free freeramdisk fsync fuser \
  getopt getty grep groups gunzip gzip halt hd hdparm head hexdump hostid hostname hwclock \
  id ifconfig ifdown ifenslave ifup init insmod install ionice iostat ip ipaddr ipcalc ipcrm ipcs iplink iproute iprule \
  kbd_mode kill killall killall5 klogd \
  last less linux32 linux64 linuxrc ln loadfont loadkmap logger login logname logread losetup ls lsmod lsof lspci lsusb \
  makedevs md5sum mdev mesg mkdir mkfifo mknod mkswap mktemp modinfo modprobe more mount mountpoint mpstat mv \
  nameif netstat nice nohup nslookup ntpd od openvt \
  patch pgrep pidof ping ping6 pivot_root pkill pmap poweroff printenv printf ps pstree pwd pwdx \
  rdate rdev readahead readlink realpath reboot renice reset resize rev rm rmdir rmmod route run-parts runlevel \
  script sed seq setarch setconsole setfont setkeycodes setlogcons setserial setsid sh \
  sha1sum sha256sum sha512sum showkey sleep sort split stat strings stty su sulogin sum \
  swapoff swapon switch_root sync sysctl syslogd \
  tac tail tee telnet test tftp time timeout top touch tr traceroute traceroute6 true tty \
  udhcpc udhcpc6 umount uname unexpand uniq unix2dos uptime users usleep uudecode uuencode \
  vconfig vi volname wall watch wc wget which who whoami xargs yes zcat zcip; do
  rm -f ./${bb_applet}   # -f so a first run doesn't complain about missing links
  ln -s busybox ${bb_applet}
done

# Linux kernel execs /sbin/init as process 1
cd ${jffs2root}/sbin
ln -s ../bin/busybox getty
ln -s ../bin/busybox init

# I prefer GNU bash to the Busybox /bin/sh
cd ${jffs2root}/bin
cp -p ${mlnxosroot}/bin/bash .

# bash also needs libtinfo and libdl
cd ${jffs2root}/lib
cp -p ${mlnxosroot}/lib/libdl-2.17.so .
cp -p ${mlnxosroot}/lib/libtinfo.so.5.9 .
ln -s libdl-2.17.so libdl.so.2
ln -s libtinfo.so.5.9 libtinfo.so.5

# and a very minimal set of terminfo files
cd ../etc
cp -pr ${mlnxosroot}/etc/terminfo .

# These utilities will be useful in putting a JFFS2 image directly onto the flash storage
cp -p ${mlnxosroot}/usr/sbin/flash* ${jffs2root}/bin
cp -p ${mlnxosroot}/usr/sbin/nand* ${jffs2root}/bin

# And some SSH client goodness would be nice, too
cd ${jffs2root}
mkdir -p -m 755 etc etc/ssh   # etc already exists, so -p keeps mkdir quiet
cd ${jffs2root}/bin
cp -p ${mlnxosroot}/usr/bin/ssh .
cp -p ${mlnxosroot}/usr/bin/scp .
cp -p ${mlnxosroot}/usr/bin/sftp .
cd ${jffs2root}/lib
cp -p ${mlnxosroot}/usr/lib/libcrypto.so.1.0.1e .
cp -p ${mlnxosroot}/usr/lib/libcrypt-2.17.so .
cp -p ${mlnxosroot}/lib/libdl-2.17.so .
cp -p ${mlnxosroot}/usr/lib/libz.so.1.2.7 .
cp -p ${mlnxosroot}/lib/libresolv-2.17.so .
cp -p ${mlnxosroot}/lib/librt-2.17.so .
ln -s libcrypto.so.1.0.1e libcrypto.so.10
ln -s libcrypt-2.17.so libcrypt.so.1
ln -s libz.so.1.2.7 libz.so.1
ln -s libresolv-2.17.so libresolv.so.2
ln -s librt-2.17.so librt.so.1
ln -sf libdl-2.17.so libdl.so.2   # already linked above for bash, hence -f

# SSH also needs working NSS libraries
cp -p ${mlnxosroot}/lib/libnss_compat-2.17.so .
cp -p ${mlnxosroot}/lib/libnss_dns-2.17.so .
cp -p ${mlnxosroot}/lib/libnss_files-2.17.so .
ln -s libnss_compat-2.17.so libnss_compat.so.2
ln -s libnss_dns-2.17.so libnss_dns.so.2
ln -s libnss_files-2.17.so libnss_files.so.2

# kernel modules so that we can DHCP and other things...
cd ${mlnxosroot}/lib/modules/3.10.27-MELLANOXuni-m460ex
sudo find kernel -depth -print0 | sudo cpio -pdmv0a ${jffs2root}/lib/modules/3.10.27-MELLANOXuni-m460ex/
sudo cp -p modules.* ${jffs2root}/lib/modules/3.10.27-MELLANOXuni-m460ex/

# GNU tar is here instead of the busybox version
cd ${jffs2root}
cp -p ${mlnxosroot}/bin/tar bin/
cp -p ${mlnxosroot}/usr/sbin/flash_eraseall sbin/

# make an /etc/init.d/rcS that gets /proc and /sys mounted
cp /dev/null etc/init.d/rcS
(echo '#!/bin/sh'; echo; echo 'mount -t proc none /proc'; echo 'mount -t sysfs none /sys') | tee etc/init.d/rcS
chmod 755 etc/init.d/rcS
ln -s ../init.d/rcS etc/rc.d/rc.sysinit

# make an inittab file -- this will get us a working terminal
cp /dev/null etc/inittab
echo '::sysinit:/etc/init.d/rcS' >> etc/inittab
echo '#::askfirst:/bin/sh' >> etc/inittab
echo 'ttyS0:2345:respawn:/sbin/getty -L 9600 ttyS0 vt102' >> etc/inittab
echo '#::respawn:cttyhack sh -l' >> etc/inittab
chmod 644 etc/inittab

# /etc/passwd listing for 'root'. The password is a classic UNIX (DES based) hash of "changeme",
# a common default root password on Sun gear.
cp /dev/null etc/passwd
echo 'root:Xxxgg2TS4senE:0:0::/root:/bin/sh' > etc/passwd
chmod 644 etc/passwd

# /etc/nsswitch.conf tells libc where to look up things like users
cp /dev/null etc/nsswitch.conf
echo passwd: files >> etc/nsswitch.conf
echo group: files >> etc/nsswitch.conf
chmod 644 etc/nsswitch.conf

# Set up the shell's environment
echo "PS1='switch-bootstrap:\w\\\$\ '" > etc/profile

# Copy and make symlinks for shared libraries (ldconfig doesn't want to help here. :( )
cd ${mlnxosroot}/lib
cp -p ld-2.17.so libc-2.17.so libcrypt-2.17.so libm-2.17.so libpthread-2.17.so librt-2.17.so ${jffs2root}/lib/
cd ${jffs2root}/lib
ln -s ld-2.17.so ld.so.1
ln -s ld-2.17.so ld-linux-so.2
ln -s libc-2.17.so libc.so.6
ln -sf libcrypt-2.17.so libcrypt.so.1   # already linked above for ssh, hence -f
ln -s libm-2.17.so libm.so.6
ln -s libpthread-2.17.so libpthread.so.0
ln -sf librt-2.17.so librt.so.1         # already linked above for ssh, hence -f

# device files (we don't need no steeenking udev). These will definitely need root privileges to create.
cd ${jffs2root}/dev
mknod -m 622 console c 5 1
mknod -m 666 null c 1 3
mknod -m 444 random c 1 8
mknod -m 644 ttyS0 c 4 64
mknod -m 444 urandom c 1 9
mknod -m 666 zero c 1 5
And now, let's create the actual JFFS2 image file.
cd ${jffs2root}/..
sudo /usr/sbin/mkfs.jffs2 --verbose --eraseblock=0x20000 --pagesize=2048 --pad \
  --no-cleanmarkers --big-endian --squash-uids --root=jffs2-root \
  --output=bootstrap-mlnx-os-image-ppc-m460ex-20160609-202426.jffs2
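With --pad and --eraseblock=0x20000, the image should come out as a whole number of 128 Kibyte erase blocks. A quick sanity check on the build host might look like the sketch below; jffs2_blocks is a hypothetical helper name and the default erase block size is simply the value passed to mkfs.jffs2 above.

```shell
#!/bin/sh
# jffs2_blocks FILE [ERASEBLOCK]: print how many erase blocks FILE occupies.
# Fails (non-zero status) when the size is not a whole number of erase blocks.
# (Hypothetical helper; ERASEBLOCK defaults to 0x20000 = 131072 bytes.)
jffs2_blocks() {
    file=$1
    eraseblock=${2:-131072}
    size=$(wc -c < "$file") || return 1
    if [ $(( size % eraseblock )) -ne 0 ]; then
        echo "$file: $size bytes is not a multiple of $eraseblock" >&2
        return 1
    fi
    echo $(( size / eraseblock ))
}
```

Run it as jffs2_blocks bootstrap-mlnx-os-image-ppc-m460ex-20160609-202426.jffs2. A 2359296 byte image, for example, works out to 18 blocks.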
Copy it to the TFTP server (172.17.255.118, remember?). Then, on the switch, see about pulling it down, flashing it, and trying to get the MLNX-OS kernel to run it. The U-Boot conversation with the switch follows:
U-Boot 2009.01 SX_PPC_M460EX SX_3.2.0330-82-EMC ppc (Feb 27 2013 - 12:13:42)
CPU: AMCC PowerPC 460EX Rev. B at 1000 MHz (PLB=166, OPB=83, EBC=83 MHz)
Security/Kasumi support
Bootstrap Option H - Boot ROM Location I2C (Addr 0x52)
Internal PCI arbiter disabled
32 kB I-Cache 32 kB D-Cache
Board: Mellanox PPC460EX Board
FDEF: No
I2C: ready
DRAM: 2 GB (ECC enabled, 333 MHz, CL3)
FLASH: 16 MB
NAND: 1024 MiB
PCI: Bus Dev VenId DevId Class Int
PCIE0: link is not up.
PCIE1: successfully set as root-complex
01 00 15b3 c738 0c06 00
Net: ppc_4xx_eth0, ppc_4xx_eth1
Hit any key to stop autoboot: 0
=> setenv ipaddr 172.16.10.80
=> setenv gatewayip 172.16.10.3
=> tftp 400000 172.17.0.16:mellanox-sx6018/bootstrap-mlnx-os-image-ppc-m460ex-20160609-202426.jffs2
Using ppc_4xx_eth0 device
TFTP from server 172.17.255.118; our IP address is 172.17.255.120
Filename 'MLNX-OS_PPC_M460EX_3.6.1002/bootstrap-mlnx-os-image-ppc-m460ex-20160609-202426.jffs2'.
Load address: 0x400000
Loading: #################################################################
 #################################################################
 ###############################
done
Bytes transferred = 2359296 (240000 hex)
=> nand erase clean 0 1FFFFFFF
NAND erase: device 0 offset 0x0, size 0x1fffffff
Skipping bad block at 0x07040000
Erasing at 0x1ffe0000 -- 100% complete. Cleanmarker written at 0x1ffe0000.
OK
=> nand write ${fileaddr} 0 ${filesize}
NAND write: device 0 offset 0x0, size 0x240000
2359296 bytes written: OK
=>
There was a single bad block of NAND reported on this switch. Small numbers of bad blocks in the NAND area are OK. The U-Boot tftp
command filled out the fileaddr and filesize variables for us. Note that we cleared 512Mibytes of the flash storage with the nand erase ...
command, but the file we transferred was only about 2.25Mibytes in size. Next up are 2 minor changes to the kernel command lines:
=> printenv image_kernel_args jffs2_args
## Error: "image_kernel_args" not defined
jffs2_args=setenv bootargs root=${rootdev} rootfstype=jffs2 ro reset_button=${reset_button} ${image_kernel_args} ${extra_args}
=> setenv rootdev /dev/mtdblock6
=> setenv image_kernel_args loglevel=2
=> setenv jffs2_args "setenv bootargs root=${rootdev} rootfstype=jffs2 rw reset_button=${reset_button} ${image_kernel_args} ${extra_args}"
=>
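To see what the kernel command line should look like once those variables are substituted, here is a small host-side sketch that mimics the jffs2_args template. build_bootargs is a made-up name, the reset_button value of 0 is taken from the earlier boot log, and anything else the boot script may append (the console= setting, for instance) is deliberately left out.

```shell
#!/bin/sh
# build_bootargs ROOTDEV RESET_BUTTON [EXTRA...]: print a kernel command line
# shaped like the modified jffs2_args template (note "rw" instead of "ro").
# (Hypothetical helper for illustration; U-Boot does the real substitution.)
build_bootargs() {
    rootdev=$1
    reset_button=$2
    shift 2
    args="root=$rootdev rootfstype=jffs2 rw reset_button=$reset_button"
    for extra in "$@"; do
        args="$args $extra"
    done
    printf '%s\n' "$args"
}

build_bootargs /dev/mtdblock6 0 loglevel=2
# prints: root=/dev/mtdblock6 rootfstype=jffs2 rw reset_button=0 loglevel=2
```

Comparing that against the arguments the running kernel reports in /proc/cmdline is a cheap way to confirm the U-Boot edits took.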
And, to boot it up:
=> run mlxlinux
(none) login: root
Password: changeme
login[859]: root login on 'ttyS0'
switch-bootstrap:~#
And there's a root shell prompt in a very basic Linux OS running on the switch. The jiggering with getty in the inittab lets us have a fully configured terminal to play around with. Job control and having Ctrl-C are both very nice things.
Putting MLNX-OS on the switch
There is now a very basic Linux OS installed and bootable (with a little manual jiggering). Next up is getting MLNX-OS copied to one of the flash areas and some tweaking done.
We're going to put the MLNX-OS root filesystem in the "ROOT_2" flash partition -- cat /proc/mtd
will list these out for us. We'll also be taking over the CONFIG and VAR partitions. Note that the flash layout information comes from the flattened device tree structure we pulled from the MLNX-OS distribution a long time ago.
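If you'd rather not count partitions by hand, the name-to-device mapping can be pulled straight out of /proc/mtd. A minimal sketch, assuming the usual four-column /proc/mtd layout (dev, size, erasesize, quoted name); mtd_by_name is a hypothetical helper name.

```shell
#!/bin/sh
# mtd_by_name NAME [FILE]: print the mtdN device whose partition name is NAME.
# FILE defaults to /proc/mtd; pass a saved copy to try this off the switch.
# Exits non-zero when no partition has that name.
mtd_by_name() {
    awk -v want="\"$1\"" '
        $4 == want { sub(/:$/, "", $1); print $1; found = 1 }
        END { exit !found }
    ' "${2:-/proc/mtd}"
}
```

On the switch, mtd_by_name ROOT_2 should print the device to hand to flash_eraseall and nandwrite, and the matching block device for mount is the same number with an mtdblock prefix.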
We're going to need some network
There are (at least) 2 options here. The first is to set things by hand:
switch-bootstrap:/# ip link set up dev eth0
switch-bootstrap:/# ip addr add 172.16.10.80/24 dev eth0
switch-bootstrap:/# ip route add default via 172.16.10.3
Alternatively, you can do some DHCP:
switch-bootstrap:/# modprobe af_packet
switch-bootstrap:/# ip link set up dev eth0
switch-bootstrap:/# udhcpc -i eth0
DHCP client started on eth0
Sending discover...
Sending select for 172.16.10.80...
Lease of 172.16.10.80 obtained, lease time 3600
switch-bootstrap:/#
Cleaning out the flash
Erase the MTD devices:
switch-bootstrap:/# flash_eraseall -j /dev/mtd7
Erasing 128 Kibyte @ 1ffe0000 -- 99 % complete. Cleanmarker written at 1ffe0000.
switch-bootstrap:/# flash_eraseall -j /dev/mtd8
Erasing 128 Kibyte @ 63e0000 -- 99 % complete. Cleanmarker written at 63e0000.
switch-bootstrap:/# flash_eraseall -j /dev/mtd9
Erasing 128 Kibyte @ 13f60000 -- 37 % complete. Cleanmarker written at 13f60000.
Skipping bad block at 0x13f80000
Erasing 128 Kibyte @ 2b040000 -- 80 % complete. Cleanmarker written at 2b040000.
Skipping bad block at 0x2b060000
Erasing 128 Kibyte @ 35be0000 -- 99 % complete. Cleanmarker written at 35be0000.
switch-bootstrap:/#
Small numbers of bad blocks are expected and nothing to worry about.
Make some directories where we can mount the soon-to-be-created filesystems:
switch-bootstrap:/# mkdir -m 755 /mnt
switch-bootstrap:/# mkdir -m 0 /mnt/root2
And mount the flash there. (JFFS2 magic: No mkfs step is needed. Which is a lie. flash_eraseall
took care of that for us.)
switch-bootstrap:/# mount -t jffs2 /dev/mtdblock7 /mnt/root2
switch-bootstrap:/# mkdir -m 0 /mnt/root2/config /mnt/root2/var
switch-bootstrap:/# mount -t jffs2 /dev/mtdblock8 /mnt/root2/config
switch-bootstrap:/# mount -t jffs2 /dev/mtdblock9 /mnt/root2/var
switch-bootstrap:/# busybox df -m /dev/mtdblock[789]
Filesystem      1M-blocks  Used  Available  Use%  Mounted on
/dev/mtdblock7        512    11        501    2%  /mnt/root2
/dev/mtdblock8        100     2         98    2%  /mnt/root2/config
/dev/mtdblock9        860    18        842    2%  /mnt/root2/var
switch-bootstrap:/#
MLNX-OS preparation
Way up above, we downloaded a MLNX-OS distribution from Mellanox's web server. The kernel and device tree and busybox were all extracted from it. We want to put the whole shebang on the switch so we can do all the cool things.
Extract a working version of the OS image like so:
$ wget http://www.mellanox.com/downloads/Software/image-PPC_M460EX-3.6.1002.img
[...]
$ mkdir Mlnx-OS-3.6.1002
$ cd Mlnx-OS-3.6.1002
$ unzip -p ../image-PPC_M460EX-3.6.1002.img image-PPC_M460EX-ppc-m460ex-20160609-202426.tgz | gzip -dc | sudo tar xvvpf - --numeric-owner
[...]
$
tar
is run as root so that device nodes and permissions are restored correctly at extract time. --numeric-owner
tells tar not to translate the Mlnx-OS user and group names to the host OS's equivalents.
Let's save some space!
Skip this for now
26% or so. Observe:
$ du -sm .
886 .
$ sudo find . -depth -type f -perm -111 -print | xargs -n1 sudo strip --strip-unneeded --preserve-dates --strip-debug
[...]
$ sudo cp -pr ../Mlnx-OS-3.6.1002.orig/lib/modules/* lib/modules/
$ du -sm .
840 .
$
It's too big for the 512Mibyte flash partition, but we'll take care of that later on when we create the JFFS2 image for it.
MLNX-OS hacks
Wherein we'll change a few things in the MLNX-OS image before putting it onto the switch...
/etc/fstab change
Fix up the fstab entries for /
, /var
, and /config
to just use the /dev/mtdblockN
device names instead of filesystem labels. Additionally, we don't have swap and don't care about the bootloader files... Also also, we're using JFFS2 instead of ext3 for the filesystem type:
/dev/mtdblock7   /             jffs2    defaults,noatime   1 1
# LABEL=BOOT_1   /boot         ext3     defaults,noatime   1 2
# LABEL=BOOTMGR  /bootmgr      ext3     defaults,noatime   1 2
/dev/mtdblock8   /config       jffs2    defaults,noatime   1 2
/dev/mtdblock9   /var          jffs2    defaults,noatime   1 2
# LABEL=SWAP_1   swap          swap     defaults,noatime   0 0
tmpfs            /dev/shm      tmpfs    defaults           0 0
devpts           /dev/pts      devpts   gid=5,mode=620     0 0
sysfs            /sys          sysfs    defaults           0 0
proc             /proc         proc     defaults           0 0
/dev/cdrom       /mnt/cdrom    iso9660  noauto,ro          0 0
/dev/fd0         /mnt/floppy   auto     noauto             0 0
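After editing, it's worth double checking that the three flash-backed entries really are the only active jffs2 mounts. A rough sketch follows; jffs2_mounts is a hypothetical helper name and it only looks at the first three fstab columns.

```shell
#!/bin/sh
# jffs2_mounts FILE: print "device mountpoint" for every non-commented
# fstab entry whose filesystem type (third column) is jffs2.
jffs2_mounts() {
    awk '$1 !~ /^#/ && $3 == "jffs2" { print $1, $2 }' "$1"
}
```

Running jffs2_mounts etc/fstab from the top of the unpacked image should list /dev/mtdblock7, /dev/mtdblock8, and /dev/mtdblock9 against /, /config, and /var.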
VPI enablement
The VPI functionality is governed by a trio of programs that live in /opt/tms/bin/. Putting the patched versions of chad
, hwd
, and ibd
there will make sure that's working as we want it to.
$ sudo install -o root -g root -m 755 --backup --verbose ../customized-binaries/{chad,hwd,ibd} opt/tms/bin/
'../customized-binaries/chad' -> 'opt/tms/bin/chad' (backup: 'opt/tms/bin/chad~')
'../customized-binaries/hwd' -> 'opt/tms/bin/hwd' (backup: 'opt/tms/bin/hwd~')
'../customized-binaries/ibd' -> 'opt/tms/bin/ibd' (backup: 'opt/tms/bin/ibd~')
$
Ethernet name mangling
Small tweaks here to get the eth0 and eth1 interfaces named mgmt0 and mgmt1 as MLNX-OS sees them. This could be done with the switch's configuration database somehow, but we haven't got that reverse-engineered yet. So for now, let us be expedient instead of perfect.
Create an /etc/mactab similar to the following. Your switch's Ethernet interfaces' MAC addresses are available by running ip link list
from the bootstrap OS.
mgmt0 00:02:c9:64:16:fc
mgmt1 00:02:c9:64:16:fd
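A typo in a MAC address here means the interface silently keeps its old name, so a little validation doesn't hurt. This is only a sketch: mactab_line is a made-up helper, and the addresses are the example ones from above, not your switch's.

```shell
#!/bin/sh
# mactab_line NAME MAC: print one /etc/mactab entry ("NAME MAC"), refusing
# anything that isn't six colon-separated hex octets.
# (Hypothetical helper; substitute the MACs reported by ip link list.)
mactab_line() {
    if ! printf '%s\n' "$2" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "bad MAC address: $2" >&2
        return 1
    fi
    printf '%s %s\n' "$1" "$2"
}

mactab_line mgmt0 00:02:c9:64:16:fc
mactab_line mgmt1 00:02:c9:64:16:fd
```

Redirecting the output of those two calls into etc/mactab in the unpacked image gives the file shown above.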
And some updates to the rename_ifs init script are needed, too. Apply the following patch to it:
--- etc/rc.d/init.d/rename_ifs.mlnx-os.orig 2016-06-09 11:43:05.000000000 -0600
+++ etc/rc.d/init.d/rename_ifs 2019-01-28 10:30:01.228780822 -0700
@@ -379,7 +379,8 @@
 start() {
 echo $"Running renaming interfaces"
- do_rename_ifs
+ # do_rename_ifs
+ ${NAMEIF}
 [ $RETVAL -eq 0 ] && touch /var/lock/subsys/rename_ifs
 return $RETVAL
customer_rootflop.sh needs changes, too
Apply this diff to it:
--- etc/customer_rootflop.sh.mlnx-os.orig 2016-09-28 12:16:17.000000000 -0600
+++ etc/customer_rootflop.sh 2019-02-15 14:24:55.097140577 -0700
@@ -1561,7 +1561,8 @@
 oldifs=$IFS
 IFS=`echo \n`
- system_profile=`echo $out | grep FEATURE_EN_1: | awk {'print $NF'}`
+ # system_profile=`echo $out | grep FEATURE_EN_1: | awk {'print $NF'}`
+ system_profile="3"
 IFS=$oldifs
 mlx_check_err 0 "$system_profile" "cant parse system profile"
@@ -1569,7 +1570,8 @@
 oldifs=$IFS
 IFS=`echo \n`
- num_swids=`echo $out | grep FEATURE_EN_10: | awk {'print $NF'}`
+ # num_swids=`echo $out | grep FEATURE_EN_10: | awk {'print $NF'}`
+ num_swids="0"
 IFS=$oldifs
 mlx_check_err 0 "$num_swids" "cant parse number of swids"
Root logins are good!
The distribution's /etc/shadow file is a symlink to one maintained by a proprietary tool. Break the symlink and set root's password to the same as admin's. (admin
)
$ sudo rm etc/shadow # it's a symlink
$ sudo tee etc/shadow << '__EOF__' # quoted delimiter so the shell leaves the $6$... hashes alone
root:$6$NFLgdAQn$eZXt2gnpaJsxf3Hy5OwUoX.0yAw6QBVtyvZL48YmEDGtNI6zijqqyBnqeC10DmWb.WghOBjgAOtbivx9C5ZUL/:10000:0:99999:7:::
admin:$6$NFLgdAQn$eZXt2gnpaJsxf3Hy5OwUoX.0yAw6QBVtyvZL48YmEDGtNI6zijqqyBnqeC10DmWb.WghOBjgAOtbivx9C5ZUL/:10000:0:99999:7:::
apache:*:10000:0:99999:7:::
avahi:!!:10000:0:99999:7:::
dbus:!!:10000:0:99999:7:::
haldaemon:!!:10000:0:99999:7:::
monitor:$6$NlsGJ1Mh$ZAJmr0o4/8ZsG5r9L/W0PA9u3dPC6WL4/DDkkpDPnyQbqeCLoRqyH4X35HHD2AxGQORKCs58bB/FjnPhunril0:10000:0:99999:7:::
nfsnobody:!!:10000:0:99999:7:::
nobody:*:10000:0:99999:7:::
ntp:*:10000:0:99999:7:::
pcap:*:10000:0:99999:7:::
qemu:!!:10000:0:99999:7:::
rpc:!!:10000:0:99999:7:::
rpcuser:!!:10000:0:99999:7:::
sshd:*:10000:0:99999:7:::
statsd:*:10000:0:99999:7:::
tcpdump:!!:10000:0:99999:7:::
vcsa:!!:10000:0:99999:7:::
xmladmin:$6$nHVLuh/.$nkTB77KylkvyyjnHlfKiLzEJvzOCM2PWYLHuyV/grWi417KfCmZ0C2maEua8amzfe8P/Np3M32dbSEnrVmlsD0:10000:0:99999:7:::
xmluser:$6$Z9Nazq9n$fPXUf.qAIDvisF6cAyXYje1OueJwtJMTjYcnhVxYASxL8jpcOZG3G4dXBxfON3BNB.8lDWaNtSqAKN23RRX6z1:10000:0:99999:7:::
__EOF__
$ sudo chmod 600 etc/shadow
$ sudo chown 0:0 etc/shadow
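One thing worth checking when feeding password hashes through a here-document: if the delimiter is unquoted, the shell treats things like $6 and $NFLgdAQn as variables and expands them (usually to nothing), which mangles the hash. Quoting the delimiter turns that off. A tiny demonstration, using a made-up placeholder rather than a real hash:

```shell
#!/bin/sh
# write_entry: emit one shadow-style line through a here-document whose
# delimiter is quoted, so "$6", "$SALTSALT" and friends are NOT expanded.
# The "hash" below is a placeholder for illustration, not a real password hash.
write_entry() {
    cat << '__EOF__'
demo:$6$SALTSALT$NotARealHash:10000:0:99999:7:::
__EOF__
}

write_entry
# prints: demo:$6$SALTSALT$NotARealHash:10000:0:99999:7:::
```

After writing etc/shadow, grepping it for the $6$ prefix on the root and admin lines is a quick way to confirm the hashes survived intact.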
Ensure our nice, unlocked bootloader doesn't get replaced
Apply this patch to sbin/aiset.sh
--- sbin/aiset.sh~ 2016-09-25 07:38:19.000000000 -0600
+++ sbin/aiset.sh 2019-02-23 05:43:06.339255148 -0700
@@ -1,4 +1,5 @@
 #!/bin/sh
+exit 0
 #
 # Filename: $Source: /windy/home/scm/CVS_TMS/src/base_os/common/script_files/aiset.sh,v $
Make layout_settings.sh empty
$ sudo cp /dev/null etc/layout_settings.sh $
Customize your image_layout.sh
This file was provided by a nice person across the ocean.
$ sudo install -o root -g root -m 755 --backup --verbose ../customized-binaries/image_layout.sh etc/
Two more empty files
$ sudo cp /dev/null etc/.firstboot
$ sudo cp /dev/null var/opt/tms/.firstmfg
$
Break out config and var filesystems
$ sudo mv config ../Mlnx-OS-3.6.1002-config
$ sudo mv var ../Mlnx-OS-3.6.1002-var
$ sudo mkdir -m 0 config var
These will each be going onto their own JFFS2 filesystems to be flashed individually to the switch.
Put MLNX-OS on the switch
Create JFFS2 images
One for each of the flash partitions we'll be using: /, /config, and /var.
$ sudo /usr/sbin/mkfs.jffs2 --verbose --eraseblock=0x20000 --pagesize=2048 --pad --no-cleanmarkers --big-endian --root=Mlnx-OS-3.6.1002 -m size -X zlib --output=mlnx-os-image-ppc-m460ex-3.6.1002-20160609-202426-root.jffs2
$ sudo /usr/sbin/mkfs.jffs2 --verbose --eraseblock=0x20000 --pagesize=2048 --pad --no-cleanmarkers --big-endian --root=Mlnx-OS-3.6.1002-config -m size -X zlib --output=mlnx-os-image-ppc-m460ex-3.6.1002-20160609-202426-config.jffs2
$ sudo /usr/sbin/mkfs.jffs2 --verbose --eraseblock=0x20000 --pagesize=2048 --pad --no-cleanmarkers --big-endian --root=Mlnx-OS-3.6.1002-var -m size -X zlib --output=mlnx-os-image-ppc-m460ex-3.6.1002-20160609-202426-var.jffs2
Remember how the unpacked image was ~850Mibytes? Well, our JFFS2 image for / is now down to ~370Mibytes:
$ ls -larth *1002*jffs2
-rw-r--r-- 1 root root 467M Feb 25 10:48 mlnx-os-image-ppc-m460ex-3.6.1002-20160609-202426-root.jffs2
-rw-r--r-- 1 root root 128K Feb 25 10:48 mlnx-os-image-ppc-m460ex-3.6.1002-20160609-202426-config.jffs2
-rw-r--r-- 1 root root 128K Feb 25 10:48 mlnx-os-image-ppc-m460ex-3.6.1002-20160609-202426-var.jffs2
$
Copy JFFS2 images to the switch
Back on the switch now...
switch-bootstrap:/# cd /tmp
switch-bootstrap:/# scp -S /bin/ssh user@172.17.0.16:proj/mellanox-sx6018/mlnx-os-image-ppc-m460ex-3.6.8010*jffs2 .
user@172.17.0.16's password: ****************
mlnx-os-image-ppc-m460ex-3.6.8010.20180820-180416-config.jffs2 100% 128KB 128.0KB/s 00:00
mlnx-os-image-ppc-m460ex-3.6.8010.20180820-180416-root.jffs2 100% 370MB 755.2KB/s 08:21
mlnx-os-image-ppc-m460ex-3.6.8010.20180820-180416-var.jffs2 100% 128KB 128.0KB/s 00:00
switch-bootstrap:/tmp#
And write them to the flash
switch-bootstrap:/tmp# nandwrite -q -p /dev/mtd7 /tmp/mlnx-os-image-ppc-m460ex-3.6.8010.20180820-180416-root.jffs2
switch-bootstrap:/tmp# nandwrite -q -p /dev/mtd9 /tmp/mlnx-os-image-ppc-m460ex-3.6.8010.20180820-180416-var.jffs2
Also write the config image if this is your first time...
switch-bootstrap:/tmp# nandwrite -q -p /dev/mtd8 /tmp/mlnx-os-image-ppc-m460ex-3.6.8010.20180820-180416-config.jffs2
Use switch's TFTP client to suck it down
First we gotta get our switch's IP layer going.
# ip link set up dev eth0
# ip addr add 172.16.10.81/24 dev eth0
# ip route add default via 172.16.10.3
#
Suck down and unpack the tarball
# cd /mnt/root2/var
# tftp -g -r mellanox-sx6018/Mlnx-OS-3.6.1002.tar.gz 172.17.0.16
# cd ..
# gzip -dc var/Mlnx-OS-3.6.1002.tar.gz | tar xvvpf -
[...]
#
Switch configuration database entries
Run these on the switch before rebooting.
# cd /mnt/root2/opt/tms/bin/
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/cluster/config/enable bool false
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/cluster/config/expected_nodes uint16 0
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/cluster/config/id string ""
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/cluster/config/interface string mgmt0
# LD_LIBRARY_PATH=/mnt/root2/lib:/mnt/root2/usr/lib ../../../usr/bin/openssl rand -hex 24
WARNING: can't open config file: /etc/pki/tls/openssl.cnf
aa1115b42438bc122ffc4a3c346abd2c41889ade78461c2b
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/cluster/config/shared-secret string aa1115b42438bc122ffc4a3c346abd2c41889ade78461c2b
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/net/interface/config/mgmt0/addr/ipv4/dhcp bool true
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/system/hostid string MT1311X05279
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/system/hostname string mellanox-sx6018-1
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/system/hwname string M460EX
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/system/layout string MFL1
# ./mddbreq -c /mnt/root2/config/mfg/mfdb set modify - /mfg/mfdb/system/model string ppc
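That's a lot of repeated typing, and a typo in the database path is easy to miss. A small wrapper can cut the repetition and let you preview the commands first. This is a sketch only: mfdb_set, MFDB, MDDBREQ, and DRY_RUN are names invented here, and the mddbreq arguments are copied from the invocations above rather than from any documentation.

```shell
#!/bin/sh
# Defaults match the paths used above; override via the environment if needed.
MFDB=${MFDB:-/mnt/root2/config/mfg/mfdb}
MDDBREQ=${MDDBREQ:-./mddbreq}

# mfdb_set NODE TYPE VALUE: set one manufacturing database node.
# With DRY_RUN=1 the command is only printed, not executed.
mfdb_set() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s -c %s set modify - %s %s %s\n' "$MDDBREQ" "$MFDB" "$1" "$2" "$3"
    else
        "$MDDBREQ" -c "$MFDB" set modify - "$1" "$2" "$3"
    fi
}

# Preview two of the settings without touching the database.
DRY_RUN=1
mfdb_set /mfg/mfdb/system/hwname string M460EX
mfdb_set /mfg/mfdb/system/layout string MFL1
```

Once the printed commands look right, unset DRY_RUN and run the same calls from /mnt/root2/opt/tms/bin/ on the switch.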
Reboot into MLNX-OS
Unmount the JFFS2 filesystems, reboot, tell the kernel to run the stuff in /dev/mtdblock7 instead of /dev/mtdblock6. Profit! If things are correct here, the switch will DHCP an address on its mgmt0
interface. Still running on the switch...
# cd /
# umount /mnt/root2/var
# umount /mnt/root2/config
# umount /mnt/root2
# sync
# reboot -f
[...]
U-Boot 2009.01 SX_PPC_M460EX SX_3.2.0330-82-EMC ppc (Feb 27 2013 - 12:13:42)
[...]
Hit any key to stop autoboot: 0
=> setenv rootdev /dev/mtdblock7
=> setenv image_kernel_args loglevel=2
=> setenv jffs2_args "setenv bootargs root=${rootdev} rootfstype=jffs2 rw reset_button=${reset_button} ${image_kernel_args} ${extra_args}"
=> run mlxlinux
Update switch ASIC firmware
The EMC switch is an unmanaged switch. Meaning it has no subnet manager functionality. MLNX-OS needs more smarts in the SwitchX ASIC's brain.
Boot the switch into the EMC OS
We need this to get the SwitchX ASIC up and running. So that a client system can connect to it. So that we can put new firmware onto the ASIC.
Pre-reqs
- Get a machine with
mstflint
installed. For Debian systems, do sudo apt-get install mstflint
. We also need some more low-level Infiniband tools to make the flashing process work, so run sudo apt-get install ibutils infiniband-diags opensm
while we're at it.
Back up the existing SwitchX ASIC firmware
Backups are good!
Let's see how our connection to the switch is looking...
server-with-IB-card$ sudo ibstat
CA 'mlx4_0'
  CA type: MT4099
  Number of ports: 2
  Firmware version: 2.40.5030
  Hardware version: 0
  Node GUID: 0x0002c903003e2180
  System image GUID: 0x0002c903003e2183
  Port 1:
    State: Active
    Physical state: LinkUp
    Rate: 56
    Base lid: 1
    LMC: 0
    SM lid: 1
    Capability mask: 0x0251486a
    Port GUID: 0x0002c903003e2181
    Link layer: InfiniBand
  Port 2:
    State: Down
    Physical state: Polling
    Rate: 10
    Base lid: 0
    LMC: 0
    SM lid: 0
    Capability mask: 0x0251486a
    Port GUID: 0x0002c903003e2182
    Link layer: InfiniBand
server-with-IB-card$
Port 1 of our card has a 56Gbps connection to the switch. That's promising. :)
The opensm
subnet manager was started as soon as it was installed (that's a Debian thing). So let's see if we can see the switch on the network:
server-with-IB-card$ sudo ibswitches
Switch : 0x0002c90300e404e0 ports 18 "SwitchX - Mellanox Technologies" base port 0 lid 2 lmc 0
server-with-IB-card$
Note that the switch has been given Lid 2.
Query firmware on the switch ASIC. We address it by Lid, which isn't assigned without a subnet manager:
server-with-IB-card$ sudo mstflint -d lid-0x2 q full
Image type: FS2
FW Version: 9.9.1260
FW Release Date: 5.6.2014
MIC Version: 1.5.0
Device ID: 51000
Description: Node Sys image
GUIDs: 0002c90300e404e0 0002c90300e404e0
Description: Base Switch
MACs: 0002c9e404e0 0002c9e40540
VSD: n/a
PSID: EMC1240110020
server-with-IB-card$
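The lid-0x2 device string is just the decimal Lid from ibswitches rendered in hex. If your subnet manager hands out a different Lid, something like this sketch will build the string for you. lid_devs is a hypothetical helper, and it assumes the ibswitches line format shown above ("... lid N lmc N").

```shell
#!/bin/sh
# lid_devs: read ibswitches output on stdin and print one mstflint-style
# "lid-0x.." device string per switch found.
lid_devs() {
    sed -n 's/^Switch.* lid \([0-9][0-9]*\) lmc .*/\1/p' |
    while read -r lid; do
        printf 'lid-0x%x\n' "$lid"
    done
}

# Sample line copied from the ibswitches output above.
printf '%s\n' 'Switch : 0x0002c90300e404e0 ports 18 "SwitchX - Mellanox Technologies" base port 0 lid 2 lmc 0' | lid_devs
# prints: lid-0x2
```

On the real machine, sudo ibswitches | lid_devs would feed it live data.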
Back up the SwitchX ASIC firmware image to somewhere safe:
server-with-IB-card$ sudo mstflint -d lid-0x2 ri EMC1240110020.fw
server-with-IB-card$ mstflint -i EMC1240110020.fw q full
Image type: FS2
FW Version: 9.9.1260
FW Release Date: 5.6.2014
MIC Version: 1.5.0
Device ID: 51000
Description: Node Sys image
GUIDs: 0002c90300e404e0 0002c90300e404e0
Description: Base Switch
MACs: 0002c9e404e0 0002c9e40540
VSD: n/a
PSID: EMC1240110020
server-with-IB-card$
Burn the SwitchX firmware that MLNX-OS needs:
server-with-IB-card$ sudo mstflint --allow_psid_change -i MT_1240212020.fw -d lid-0x2 b
Current FW version on flash: 9.9.1260
New FW version: 9.3.8170
Note: The new FW version is older than the current FW version on flash.
Do you want to continue ? (y/n) [n] : y
You are about to replace current PSID on flash - "EMC1240110020" with a different PSID - "MT_1240212020".
Note: It is highly recommended not to change the PSID.
Do you want to continue ? (y/n) [n] : y
Burning FS2 FW image without signatures - OK
Restoring signature - OK
server-with-IB-card$
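The "older than the current FW version" warning is expected here: 9.3.8170 sorts below 9.9.1260 when the dotted fields are compared as numbers. The sketch below does that comparison in plain shell. ver_lt is a made-up helper, and field-by-field numeric comparison is an assumption about how the versions are ordered, not something taken from the mstflint source.

```shell
#!/bin/sh
# ver_lt A B: succeed (status 0) if dotted version A is older than B,
# comparing each dot-separated field as a number, left to right.
ver_lt() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa=${a%%.*}
        fb=${b%%.*}
        [ "${fa:-0}" -lt "${fb:-0}" ] && return 0
        [ "${fa:-0}" -gt "${fb:-0}" ] && return 1
        # Drop the field just compared; empty string once none remain.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 1   # versions are equal
}

if ver_lt 9.3.8170 9.9.1260; then
    echo "9.3.8170 is older than 9.9.1260"
fi
```

So answering y to the first prompt is the intended path; the downgrade is deliberate.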
One last reboot (I promise!)
Now tell the switch to start MLNX-OS and watch it go! Interrupt the autoboot sequence and tell it to run mlxlinux
. MLNX-OS should start. The Infiniband port should come to life. The bits will flow!
Switch configuration
That's a whole other subject. Go see Mlnx-OS switch configuration