Saturday, October 9, 2010

IBM p550 at home with Debian and Infiniband


It's time for yet another blog post on getting once-expensive machines running at home.

Introduction

I'm working with an IBM pSeries machine called a p550 (specifically, a model 9113-550 in IBM speak). It was built in 2004, had a list price of some 10s of kilo-bucks new, has 4 x 1.5GHz POWER5, 8GB RAM, and runs AIX up through the most recent releases of AIX 7.

It has a built-in hypervisor and what IBM calls "LPAR" support, a mode of virtualization that gives you "Logical PARtitions" of the memory and CPUs in the machine, with a granularity of 1/10th of a CPU. LPAR support requires a desktop machine that IBM calls an HMC, or "Hardware Management Console", which breaks out all of the logical consoles on the machine, and allows you to configure resource allocation and things like virtual ethernet switches and virtual SCSI adapters. In addition, a piece of software for the machine called VIOS, or "Virtual I/O Server", is required for LPAR mode if you want to share hardware adapters (e.g., ethernet, SCSI, or Fibre Channel adapters) between OSes. Since I have neither of those, I am just running the machine in "bare metal" mode, with only one OS instance.

For I/O, the system has a built-in SCSI RAID, gigabit ethernet, a Service Processor which controls functions like power on the machine, 5 internal hot-plug PCI-X slots, and an external link that allows for more I/O trays with disk and PCI-X cards to be added. I have installed a Mellanox InfiniHost Infiniband card to hook up to my Infiniband fabric.

Making the Service Processor work for you

In addition to the serial console port, the system has a pair of ethernet ports, which are designed to connect to a system HMC, but which also allow https-based access to the service processor menus. By default, the service processor will try to get an address via DHCP, or you can configure it through the serial port. The service processor requires you to log in to do anything. I believe that the default username/password combination is admin/admin. That's what we had it set to on the machines at work.

To set the IP address, you need to navigate through the menus:
5. Network Services
1. Network Configuration
1. Configure interface Eth0
Then, choose either static or dynamic, and enter information as needed.

In order to get this to work for me, I had to use Firefox, and enable an SSL option, because while it uses https, it uses a somewhat insecure method of doing SSL that is disabled by default. To enable this, put "about:config" in the address bar, and change the option "security.ssl3.rsa_null_md5" to "true". Once you do that, you can get to the web version of the service processor menus (ASMI in IBM-speak) at https://1.2.3.4 (replacing 1.2.3.4 with the IP address you set above).

One additional thing you will probably want to set up is "Serial Port Snoop" under System Service Aids -> Serial Port Snoop. Setting a "Snoop String" will allow you to enter a string through the serial console to force a reboot of the machine if it locks up, or if you do something wrong while booting and the console isn't set to the right place.

Installing Debian

I net-booted the installer. To do this, set up the host in dhcpd.conf with an entry like this:

host p550 {
    hardware ethernet 00:02:55:df:d5:dd;
    fixed-address p550.blah;
    next-server storage.blah;
    filename "/tftpboot/debian-squeeze-ppc64-vmlinuz-chrp.initrd";
}

Boot the machine into OpenFirmware (hit "8" at the firmware "IBM IBM IBM IBM ..." screen), and net-boot from there:

0> boot net console=hvsi0

If you don't boot with the right args from OpenFirmware, you won't get a working console when you boot into the installer. That's where the "serial port snoop" option from the service processor comes in handy.

Once you get to the end of the installer, you will need to do some magic to get the bootloader (yaboot) installed. Hopefully, the Debian people will get some of this sorted out before the release of Squeeze. Tell the installer that you want a shell, then do this:

# mount --bind /dev /target/dev
# chroot /target
# mount -t proc proc proc
# yabootconfig
# ybin

Upgrading firmware

Debian doesn't include the update_flash binary in its powerpc-utils package. Download the latest binary release in RPM format.

Convert that to a .deb with Alien (apt-get install alien if you don't have it):

# alien powerpc-utils-1.2.3-0.ppc.rpm

then

# apt-get remove powerpc-ibm-utils powerpc-utils
# dpkg -i powerpc-utils_1.2.3-1_powerpc.deb

Now, you can download a new flash image from IBM. Once you get it, use alien to convert and unpack the rpm, and do "update_flash ./tmp/fwupdate/01SF240_403_382", where 01SF240_403_382 is the flash image name from the RPM you downloaded. When you reboot the system, Linux will update the system flash just before rebooting.

Infiniband and beyond

I had some problems initially getting Infiniband set up and going. I'm using a Topspin SDR Infiniband adapter, which is basically a stock Mellanox InfiniHost. It seems that the hypervisor on the machine wasn't allocating all of the resources that the card was asking for.

After some discussion on the linuxppc-dev mailing list, it was pointed out that there are certain slots in the machine which the system calls "super slots", and for which the firmware is willing to allocate more resources than a typical PCI-X card requests. This Redbook (PDF) on IBM's redbook site details Infiniband usage on pSeries systems; Section 3.4.3 indicates which slots you may install an Infiniband adapter into on certain machines. On a p550, these are slots C2 and C5. I had plugged my IB adapter into slot C1, which is why I was having problems.

After getting it into one of the right slots, it was just a matter of getting the right drivers loaded on the host OS. In order to use IP over Infiniband, you'll want the ib_ipoib module. To use RDMA and the Verbs interface, you'll want the ib_umad and ib_uverbs modules to be loaded. At this point, it basically acts like a typical Linux system with Infiniband, just with more I/O bandwidth than you can get out of a typical PCI-X based system.
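As a minimal sketch of the module loading described above, the helper below prints one modprobe command per module. The function name is hypothetical and only there for illustration; it deliberately prints the commands rather than running them, so nothing is loaded until you pass the output to a root shell.

```shell
#!/bin/sh
# Hypothetical helper: print the modprobe commands for the modules named
# in the text. ib_ipoib provides IP over Infiniband; ib_umad and ib_uverbs
# provide the userspace interfaces.
ib_modprobe_commands() {
    for mod in ib_ipoib ib_umad ib_uverbs; do
        echo "modprobe $mod"
    done
}

# Review the output first, then run it as root with:
#   ib_modprobe_commands | sh
ib_modprobe_commands
```

Printing first makes it easy to see exactly what will be loaded before committing to it on a machine where a bad driver load means a trip back to the serial console.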

What next?

Setting up an HMC, and playing around with virtualization on the machine sounds like it could be a good time.

Saturday, October 2, 2010

Running an Altix 4700 supercomputer at home

Introduction

At the beginning of the year, work decommissioned the SGI Altix 4700 system that we put into production around January 2007. It sat around unused, and we had little luck finding a buyer for the system - it seems that no one is really commercially interested in running Linux on big Itanium systems anymore.

What's an Altix 4700?

Briefly, an SGI Altix 4700 is a large multi-processor SSI (single-system image) supercomputer, which uses Intel Itanium 2 (in my case, 1.6GHz, dual-core Montecito) processors, and memory on blades, which are interconnected using a "ccNUMA" architecture. This stands for "cache-coherent Non-Uniform Memory Access" - basically, a method of making large SMP-like machines by gluing processors with local memory together with a system interconnect.

With NUMA, unlike SMP, there is memory that is closer to (and thus faster from) each CPU. Like SMP, however, the system is contained in one single address space (unlike, say, a cluster which is connected using Ethernet or Infiniband). It thus runs a single OS image, and looks to the user like one large SMP system.

System Specs

The system that I have is contained within one rack, and has 4 "bricks", each with 8 processor blades, each blade containing 2 dual-core 1.6GHz Montecito and 16GB of RAM, plus one system I/O blade with disks, PCI-X slots, Gigabit ethernet, USB, etc, and assorted NUMA routing blades and system controllers.

That totals 128 system cores, and 512GB RAM. The theoretical peak GFLOP rating of the system as configured is approx 820GF. Of course, 128 CPU and 256 DIMMs draw a bit of power...
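For anyone who wants to check those totals, here is a rough sketch of the arithmetic. The brick, blade, socket, core, clock and RAM figures come from the spec list above; the figure of 4 double-precision floating point operations per core per cycle is my assumption (it is not stated in this post), and the function name is made up for the example.

```shell
#!/bin/sh
# Tally cores, RAM and theoretical peak from the per-blade figures above.
# ASSUMPTION: 4 double-precision FLOPs per core per cycle; this number is
# not taken from the post itself.
altix_totals() {
    awk 'BEGIN {
        bricks = 4; blades_per_brick = 8; sockets = 2; cores_per_socket = 2
        ghz = 1.6; gb_per_blade = 16; flops_per_cycle = 4
        cores = bricks * blades_per_brick * sockets * cores_per_socket
        ram_gb = bricks * blades_per_brick * gb_per_blade
        printf "cores=%d ram_gb=%d peak_gflops=%.1f\n",
               cores, ram_gb, cores * ghz * flops_per_cycle
    }'
}

altix_totals
```

Under that assumption the result lands at 819.2, which matches the "approx 820GF" figure.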

Powering/cooling a supercomputer

Running as a full system, the computer draws 9kW of power, and requires 2 x 200-240V, 30A power circuits. That's a lot of power. I pay about $0.072/kWh, so running the system for one hour costs me about $0.65.
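The running cost is just draw times price times hours. A small sketch, with a hypothetical function name, using the 9kW and $0.072/kWh figures from above:

```shell
#!/bin/sh
# Hypothetical helper: cost in dollars of running a load for some hours.
# $1 = draw in kW, $2 = price in dollars per kWh, $3 = hours
run_cost() {
    awk -v kw="$1" -v rate="$2" -v hours="$3" \
        'BEGIN { printf "%.2f\n", kw * rate * hours }'
}

run_cost 9 0.072 1     # one hour: 0.65
run_cost 9 0.072 24    # a full day: 15.55
```

This ignores anything spent on fans or cooling, so treat it as a floor rather than the real bill.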

One issue with running a system that draws 9kW is that you get 9kW of heat output. As a coworker of mine has said, at work we run heaters, that produce computation as a side-effect. The easiest and most cost-effective solution is to open up some windows, and turn on some fans. With an outside temperature of about 60F, I can set up a few fans, open some windows, and keep the temperature inside below 80F.

One way to make the machine a bit friendlier and less power-hungry is to run less than a full system. By pulling out the blades that you don't want to run, you can cut down the power usage by a proportional amount. For testing purposes, I have run the system with either 1/2 or 1/8 of the blades running to reduce my power usage.
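To put rough numbers on that, here is a sketch that scales the 9kW full-system figure by the fraction of processor blades installed. It assumes the draw scales linearly with the 32 processor blades and ignores the fixed draw of the I/O blade, routers and fans, so the real numbers will be somewhat higher. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: estimate draw in kW with only some blades installed.
# ASSUMPTION: power scales linearly with the number of processor blades.
# $1 = full-system draw in kW, $2 = blades installed, $3 = blades when full
partial_draw_kw() {
    awk -v full="$1" -v n="$2" -v total="$3" \
        'BEGIN { printf "%.3f\n", full * n / total }'
}

partial_draw_kw 9 16 32    # half of the blades: 4.500
partial_draw_kw 9 4 32     # one eighth of the blades: 1.125
```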

One thing I've noticed is that EFI state information (its equivalent of a PC's "CMOS" configuration memory) is stored and updated only on one system blade. So, you really want to make sure that you have blade "0" (the bottom left blade in the chassis marked "001c01") installed, or booting will become much more difficult.

Installing Debian

At work, we ran the machine with SuSE. Due to licensing issues, the fact that SuSE sucks to administer, and that I prefer to run Debian on things, I got to installing Debian. The machine runs EFI, Intel's "next-generation BIOS" that is used on Itanium (IA-64) systems, and some x86 (PC-like) systems such as Apple's Intel-based systems. The boot process is pretty close to PXE booting, and Debian seems to have pretty good IA-64 support. The install went smoothly - I ended up installing "Squeeze" - the next version of Debian that will be released.

Kernel changes

In general, the Debian kernel just works. However, it only has support for up to 64 CPUs (cores) built in. I downloaded the latest kernel sources from ftp.kernel.org, primed the configuration with the Debian kernel config, adjusted the max number of CPUs, recompiled, and rebooted the system into the new kernel.
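The "adjusted the max number of CPUs" step can be sketched as a one-line edit of the kernel .config. This assumes the limit is controlled by the CONFIG_NR_CPUS option and that the Debian config was copied from /boot; the helper name is hypothetical, and you could equally change the value through make menuconfig.

```shell
#!/bin/sh
# Hypothetical helper: raise the CPU limit in a kernel .config file.
# ASSUMPTION: the limit is the CONFIG_NR_CPUS option.
# $1 = path to the .config, $2 = new maximum number of CPUs
set_nr_cpus() {
    sed -i "s/^CONFIG_NR_CPUS=.*/CONFIG_NR_CPUS=$2/" "$1"
}

# Typical use from the top of the kernel source tree (not run here):
#   cp /boot/config-$(uname -r) .config
#   set_nr_cpus .config 128
#   make oldconfig
#   make -j 64
```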

I should note that being able to do a make -j 64 does a lot to speed up a kernel compile... :)

Running HPL

HPL is the standard benchmark to test the effective speed of supercomputers, and is used for the rankings on the "Top500" list at http://top500.org.

HPL is also contained within the "hpcc" benchmark collection, which is how I ran the benchmark. At first, I tried the Debian package for hpcc. I got some fairly poor results, because it doesn't use an optimized "BLAS" math library. To speed things up, I went to TACC's web site and downloaded GotoBLAS2, compiled that with the added gcc option "-mtune=itanium2", copied it to /opt, and built hpcc from its source. The instructions here are useful to build these two software packages.

What next?

The system's CPUs aren't all that fast compared to modern CPUs. For example, the rating in Gigaflops of all 128 cores is about 4 x the rating of a 3.2GHz IBM/Sony Cell BE CPU. One place that the system does have an advantage, though, is its 512GB of DDR2 memory. Someone could easily plug an Infiniband, 10 Gigabit Ethernet, or Fibre Channel card into the machine, and turn it into a pretty snappy solid-state drive, accessible over iSCSI, FCoE, Infiniband SDP, or straight FC.

The next item that I'm going to work on is getting some FPGA blades from another Altix system working in the 4700, and testing out writing code to use the FPGAs to speed up compute-intensive tasks. An example of what FPGAs are typically used for is searching a genomics database for a particular DNA sequence. Basically, any algorithm that is applied to a stream of data can be a good candidate for putting into an FPGA.

Tuesday, August 10, 2010

The HTC Evo and a VT125

This follows on from my previous blog post about booting ancient OSes on emulators for ancient computers on your Android phone.

Since I still have an extensive collection of vintage DEC hardware, I decided to extend what I had been working on by connecting some vintage hardware up to the emulated system.

The first thing I had at hand was a DEC VT125 terminal - a close relative of the VT100, which includes some added graphics support. A quick power-up verified that it works. Now, to get it to talk to my phone's emulated VAX.

Now, I didn't have any good way of connecting a serial port directly to my phone, but SIMH does support a telnet connection to the emulator's console by doing a command like this (2301 is the TCP port to accept connections on):

sim> set console telnet=2301

Now, to create a telnet session from my VT125, I grabbed a Xyplex MAXserver 1640 from my basement. These are similar to a vintage DECserver, but support telnet (and many other things) in addition to the usual DEC-specific LAT protocol. Basically, I can hook a terminal up to this, and use it to telnet to a host. It also works the other way around, so that I can hook a system's serial port up to it (such as a console port), use telnet to connect to the MAXserver, and connect to the serial console on the physical machine. This latter setup is something we do with systems at work, and is far more common than connecting serial terminals up to network-attached hosts, like I am trying to do.

For the purposes of this post, I won't go into detail on how to set up a MAXserver, but I basically placed a boot image on a tftp server, told the MAXserver to boot from that image, gave it an IP address, and told it to reset itself to default settings.

To connect the MAXserver to the VT125, I used an RJ45 serial cable (a "roll over" cable) and a DB25 to RJ45 adapter, which has the same pinout as a Cisco RJ45 to DB25 DTE serial cable. Ethernet then connects the MAXserver to my home network.


Next, I set my phone to connect over WiFi to my home network, and noted its IP address. I then turned on my VT125, booted the MAXserver, and started up the VAX emulator on my phone. From there, I told the MAXserver to connect to my emulated VAX console:

Xyplex> connect 172.27.3.150:2301

Now, just boot the emulated VAX, and enjoy!


I have more pictures up on my flickr account.

Saturday, August 7, 2010

Compiling SIMH emulators for Android


In order to make this process easier on my phone, I used a rooted firmware. It will take some more effort to get this to work as a packaged application.

Setting up the development environment

To get started, you'll need a working native C compiler for Android. After a lot of trial and error, I ended up discovering what I had to do. I used an amd64 architecture version of Debian GNU/Linux; Ubuntu works the same way. Using a Linux host to compile the code, if not essential, will make your life a lot easier. Follow the official instructions to download the source to Android using git, and then build it.

Along with doing that, I tested this on the Android emulator, which is a part of what you just downloaded and built, under out/host/linux-x86/bin. Put that directory in your path, and run the "android" command, create a virtual platform, and boot it. From the command line, if you built a virtual device named "Android21", for example, you'll want to run "emulator -avd Android21 -shell" so that you can get a shell on the virtual device. To copy files over to the image, the easiest method that I've found is to shut down the emulator, mount the virtual sdcard image (for these examples, I'm using "Android21" as the virtual device name):

$ sudo mount -o loop ~/.android/avd/Android21.avd/sdcard.img /mnt
$ sudo cp whatever /mnt
$ sudo umount /mnt

In addition, you will want the "agcc" script to make it easier to compile things. Download it from here. I modified my copy to use gcc-4.4.0 instead of gcc-4.2.1: change all references to "4.2.1" in the agcc script to "4.4.0", then add both the location of arm-eabi-gcc and the location of the agcc script to your path. arm-eabi-gcc can be found under the prebuilt/linux-x86/toolchain/arm-eabi-4.4.0/bin path of where you built the android sources.
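The version change in agcc is a plain search and replace, which can be sketched with sed. The helper name is hypothetical, and ANDROID_SRC in the usage comment is a placeholder for wherever you checked out and built the Android sources.

```shell
#!/bin/sh
# Hypothetical helper: point a copy of agcc at a different toolchain version.
# $1 = path to the agcc script, $2 = old version, $3 = new version
retarget_agcc() {
    # Escape the dots so "4.2.1" only matches a literal 4.2.1
    old=$(printf '%s' "$2" | sed 's/\./\\./g')
    sed -i "s/$old/$3/g" "$1"
}

# Typical use (not run here; ANDROID_SRC is a placeholder):
#   retarget_agcc ./agcc 4.2.1 4.4.0
#   PATH="$PATH:$ANDROID_SRC/prebuilt/linux-x86/toolchain/arm-eabi-4.4.0/bin"
#   PATH="$PATH:$(pwd)"
#   export PATH
```

Keep an unmodified copy of agcc around in case you need to go back to the older toolchain.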

Fixing the Android SDK and SIMH makefile

The current version of the Android libc (bionic) headers has a problem compiling when bionic/libc/kernel/arch-arm/asm/byteorder.h is included. In order to make the file compile, comment out these lines, lines 22-27 in my copy:

/*#ifndef __thumb__
* if (!__builtin_constant_p(x)) {
*
* asm ("eor\t%0, %1, %1, ror #16" : "=r" (t) : "r" (x));
* } else
*#endif */

Once you do that, you will need to download the SIMH sources. Unpack them, and modify the makefile in the top directory to make the following changes:

On line 12, remove -lrt from OS_CCDEFS, so that it reads:
OS_CCDEFS = -lm -D_GNU_SOURCE

Change all references of "gcc" to "agcc".
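Both makefile changes can be scripted. The sketch below uses GNU sed (for -i and the \b word boundary) and assumes -lrt appears on the OS_CCDEFS line with a space in front of it; the helper name is hypothetical. It matches on the variable name rather than on line 12, since the line number may differ between SIMH releases.

```shell
#!/bin/sh
# Hypothetical helper: apply the two makefile edits described above.
# Requires GNU sed. $1 = path to the SIMH makefile
patch_simh_makefile() {
    # 1. Drop -lrt from the OS_CCDEFS definition.
    # 2. Replace whole-word "gcc" with "agcc" ("agcc" itself is left alone,
    #    so running this twice is harmless).
    sed -i -e '/OS_CCDEFS *=/s/ -lrt//' -e 's/\bgcc\b/agcc/g' "$1"
}

# Typical use from the top of the unpacked SIMH tree (not run here):
#   cp makefile makefile.orig
#   patch_simh_makefile makefile
```

It is worth running diff against the saved original afterwards to confirm that only the lines you expected were touched.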

Only some of the simulators actually compile, and I haven't tried to compile network support for any of it. In order to get network support, you would need to compile libpcap as well. For now, that is left as an exercise for the reader. :)

After those changes, I just did make vax vax780 to build the MicroVAX 3900 and VAX-11/780 simulators, which can be copied over to your phone or emulator. You will probably get a bunch of warnings about the use of variable-size enums versus 32-bit enums. They seem to be harmless, and I'm pretty sure that you can ignore those warnings.

If you're too lazy to compile it

If you don't feel like spending the time to set up a development environment, and compile SIMH by yourself, I have a pre-compiled version for Android available. I used SIMH v3.8-1, which is the newest release as of this posting. You can get a pre-compiled copy of the Android emulator to test on from the pre-compiled Android SDK. After you download and unpack that, you will need to put the tools directory inside the sdk into your path.

Preparing your phone and copying things over

You will need a rooted phone to make this work easily. Rooting your phone is left as an exercise for the reader. Once you have root permissions, you will need to enable USB debugging on your phone. On my Evo, it's under Settings -> Applications -> Development -> USB debugging.

You will need an application that will work as a terminal emulator on your phone. You can use "adb shell" from the Android SDK, which will give you a shell from your phone on your computer. To run completely hosted from the phone, use something like ConnectBot.

From this point you can copy files necessary for the emulator to your phone. I copied the necessary files all to my phone's SD card:

$ adb shell mkdir /sdcard/simh
$ adb push BIN/vax /sdcard/simh
$ adb push VAX/ka655x.bin /sdcard/simh
...

Now, use adb shell to do a few things on the phone itself. Pay attention to what your phone is saying, as the first time you try to "su" on your phone, it may pop up a dialog asking if this is ok. Whether or not you see that will depend on exactly how you rooted your phone. If "adb shell" gives you a "#" prompt straight away, you don't need to use su, as you're already root.

$ adb shell
$ su -
# mkdir /data/simh
# cat /sdcard/simh/vax > /data/simh/vax
# chmod 755 /data/simh/vax

It is necessary to put any executables on the internal storage (eg in /data/simh like I did), because you cannot directly execute binaries from the sdcard, at least in Android 2.1.

Running a SIMH emulator

Once you've done that, as root on the phone, do something like:

# cd /sdcard/simh
# /data/simh/vax

And then just use SIMH as you normally would.

ConnectBot allows for multiple simultaneous sessions, so you could do "set console telnet=2300" inside the emulator, and then open another telnet session in ConnectBot to 127.0.0.1:2300 to connect to a separate console. ConnectBot simulates "screen" as a terminal emulator, which seems to do an adequate VT100 emulation for most things. The most I've tested it so far is running /usr/games/worms from 4.3BSD, after "setenv TERM vt100". If you have your device on a WiFi network, you could even use another machine to telnet into SIMH and be the console, or another emulated serial terminal on the system.

That's it! If you don't know what to do from here, take a look around the SIMH site. You can run things like ancient versions of UNIX, OpenVMS through HP's hobbyist program, NetBSD, or play with the other emulators and other OSes. Where you go from here is up to you.

EDIT Aug 8, 2010:

I forgot to add the pictures that I have taken of this. Check them out on flickr. I also now added the picture to the top.

Friday, August 6, 2010

Back again

After a bit of a hiatus from posts, I'm back again.

I've finished a big hurdle in the start of downsizing and creating focus in my computer and interesting technology collection, and after finally buying a house, I have completely finished moving out of a ~2500 sq ft warehouse that I was using to store my collection. At its peak, it was stacked high with equipment, and paths to navigate through the space were sometimes non-existent.

I have since resolved to limit the size of my collection, and thus the number of projects that I want to do. As a direct result of my massive work in downsizing (with critical help from friends for some larger items and a push to finish up towards the end of July - but otherwise mostly done by myself over January through July of 2010), I now have time to work on projects instead of just spending all of my time moving things back and forth.

I replaced my old iPhone 1.0 with an HTC Evo, running Android, back in June when they came out, and have been exceedingly happy with it. As a direct result of how easy it is to hack and develop on, I have started some projects involving running computer simulators on it. Currently, I have the set of VAX emulators (MicroVAX 3900 and VAX-11/780) from Bob Supnik's SIMH emulator collection running both 4.3BSD and OpenVMS, with a minimal amount of work.

I have posted pictures of 4.3BSD running both on the Android emulator and on my phone on my Flickr account.

My next post will describe what I had to do to get SIMH to compile and run on my phone, and after playing with that, my next target is the Hercules IBM Mainframe emulator.