Wednesday, April 13, 2011

IBM Mainframes at home




Introduction

I recently seized the opportunity to acquire a good-condition IBM zSeries 890 computer, which I knew had been taken good care of (including a careful de-installation).  This is a model that was sold from April of 2004 up until June of 2008.  It is a newer model in the series of computers that IBM began with the IBM System/360 back in 1964.  The instruction set and system features have seen changes and additions in that timeframe, but in the end, most user programs written for the S/360 Model 30 in 1964 still run without any changes on the systems being sold today, such as the IBM zEnterprise 196 (though the changes have required updates to system-level software, such as operating systems).

Major landmarks are the S/360 introduction in 1964 (with a 24-bit address space), virtual memory becoming standard with the S/370 in 1972, multiple-CPU/SMP systems in 1973, LPARs (logical partitions) in 1980, the 31-bit address space of S/370-XA in 1981, S/390 with ESCON (10 and later 17 MByte/s fiber-based peripheral attachment), CMOS-based CPUs with the S/390 G3 in the 1990s, and the 64-bit architecture in 2000 with the IBM z900.  More information about all of these is available via Google.

System Details

The system is in a wide, heavy rack, about 30" wide and 76" tall, and about 1500lbs, according to IBM.  This is a typical weight and size for an IBM rack of equipment (similar to older IBM S/390 and RS/6000 SP systems).  Inside the rack, starting at the top, are the rack-level power supply, the processor and memory cabinet (CEC), and, at the bottom, the I/O card enclosure.  On a set of fold-out arms on the front side of the machine are the Service Element (SE) laptops, which act as a sort of service processor: they manage loading firmware into the system, turning power on and off, configuring the hypervisor for LPARs, accessing the system consoles, and other low-level system tasks.  They also manage features such as Capacity Upgrade on Demand (CUoD) and Capacity Backup, limit the hardware to what you have actually paid for, and are serial-number locked to the particular machine they're connected to.

My machine is a 2086-A04 (the hardware model common to all z890 systems), configured as a 2086-140, with two dual-port fiber Gigabit Ethernet adapters, two dual-port 2Gb Fiberchannel/FICON adapters, and two 15-port ESCON adapters.  In addition, the SE laptops connect to the outside world using a pair of 10/100 PCMCIA ethernet adapters.  Earlier systems used Token Ring instead... I'm happy to not have to deal with Token Ring for this anymore.

Central Electronics Complex - CPU and Memory

The -140 configuration means that there is 1 CP (Central Processor), running at speed level 4 (on a scale of 1 to 7), which can run any z890-compatible OS.  It also has one IFL (Integrated Facility for Linux): a processor core that runs at full speed (level 7), is microcode-locked to running only Linux, and doesn't count towards the speed rating of the machine.

IBM mainframes are given a speed rating, originally in MIPS and now in MSU, which is used to price the software that runs on them.  Mine is rated 110 MIPS and 17 MSU. By comparison, a full-speed CPU, which is what my IFL is, is rated 366 MIPS and 56 MSU.  The z890 came in 28 different speed increments (1-4 CPUs x 1-7 speed ratings), from 26 MIPS / 4 MSU to 1365 MIPS / 208 MSU, so you can closely match the amount of system speed you require, which helps keep your software costs down.

Processor book, with memory removed

The z890 is actually nearly the same as a z990 system, except that it has only one processor book (out of the 4 that can fit into a one-rack z990 system), and only scales to 4 user-accessible cores (there is one additional core, called a "System Assist Processor" or SAP, which is dedicated to doing system I/O handling).  The z990, by comparison, can scale up to 56 total processor cores.

System memory module - 8GB

The system has 8GB of RAM, on a custom module that plugs into the processor "book".  Memory is available in 8GB increments up to 32GB.  Since there is only one RAM slot, the 24GB option is actually implemented as a 32GB module with 8GB "turned off" by IBM.  As you will learn after working with these systems for a bit, IBM has an annoying habit of charging you to turn on parts of the system which you already own, but haven't paid to unlock yet.

STI cables and connectors

Each processor book (and the z890 has only one) has 8 "STI" (Self-Timed Interface) links, which connect the CPU and memory to I/O adapters.  In the z890, these links run at 2GBps (16Gbps).  Seven of them can be used for I/O cards in a fully configured system, and the remaining one (or more) can be used to network (couple) systems together to build a more redundant configuration.  With the appropriate hardware, you can run these coupling links (though not at the full 2GBps) up to 100km, creating what IBM calls a "geographically-diverse" system, which can serve as a disaster-recovery link.

Peripherals

Two things that I appreciate about this system compared to past systems are the ethernet on its SE (which also lets me boot the system from an FTP server on its network), and SCSI over Fiberchannel support, which lets me use standard FC raid arrays as storage for a Linux system.

IBM 3174 Terminal Controllers and 3483 Terminal

In addition, the FICON adapters can act as a system channel (like ESCON or Bus & Tag) to appropriate peripherals, at speeds of up to about 200MBps.  The system also has ESCON, a more traditional channel that connects to peripherals at up to 17MBps.  Parallel Channel (aka S/370 Channel, or Bus & Tag) peripherals can be hooked up using a protocol converter such as the IBM 9034.  This will let me connect my 3480 cartridge-style channel-attached tape drives, 3174-21L 3270-style terminal controllers, and eventually 3420-8 vacuum-column 9-track tape drives to the system.

IBM 9034 Escon to Bus & Tag Converter

To Be Continued...

Future posts will document installation and setup of Debian GNU/Linux, and peripherals.

Saturday, October 9, 2010

IBM p550 at home with Debian and Infiniband


It's time for yet another blog post on getting once-expensive machines running at home.

Introduction

I'm working with an IBM pSeries machine called a p550 (specifically, a model 9113-550 in IBM-speak). It was built in 2004, had a list price of some tens of kilobucks new, has four 1.5GHz POWER5 processors and 8GB of RAM, and runs AIX up through the most recent releases of AIX 7.

It has a built-in hypervisor and what IBM calls "LPAR" support: a virtualization mode that gives you "Logical PARtitions" of the memory and CPUs in the machine, with a granularity of 1/10th of a CPU. LPAR support requires a desktop machine that IBM calls an HMC, or "Hardware Management Console", which breaks out all of the logical consoles on the machine and lets you configure resource allocation and things like virtual ethernet switches and virtual SCSI adapters. In addition, a piece of software for the machine called VIOS, or "Virtual I/O Server", is required in LPAR mode if you want to share hardware adapters (e.g., ethernet, SCSI, or Fiberchannel adapters) between OSes. Since I have neither of those, I am just running the machine in "bare metal" mode, with only one OS instance.

For I/O, the system has a built-in SCSI RAID controller, gigabit ethernet, a Service Processor which controls functions like power on the machine, 5 internal hot-plug PCI-X slots, and an external link that allows more I/O trays with disks and PCI-X cards to be added. I have installed a Mellanox InfiniHost Infiniband card, to hook up to my Infiniband fabric.

Making the Service Processor work for you

In addition to the serial console port, the system has a pair of ethernet ports, which are designed to connect to a system HMC, but which also allow https-based access to the service processor menus. By default, it will try to get an address via dhcp, or you can configure it through the serial port. The Service processor requires you to log in to do anything. I believe that the default username/password combination is admin/admin. That's what we had it set to on the machines at work.

To set the IP address, you need to navigate through the menus:
5. Network Services
1. Network Configuration
1. Configure interface Eth0
Then, choose either static or dynamic, and enter information as needed.

In order to get this to work for me, I had to use Firefox and enable an SSL option, because while the interface uses https, it uses a somewhat insecure method of doing SSL that is disabled by default. To enable it, put "about:config" in the address bar and change the option "security.ssl3.rsa_null_md5" to "true". Once you do that, you can get to the web version of the service processor menus (ASMI in IBM-speak) at https://1.2.3.4 (replacing 1.2.3.4 with the IP address you set above).

One additional thing you will probably want to set up is "Serial Port Snoop", under System Service Aids -> Serial Port Snoop. Setting a "Snoop String" will allow you to enter a string through the serial console to force-reboot the machine if it locks up, or if you do something wrong while booting and the console isn't set to the right place.

Installing Debian

I net-booted the installer. To do this, set up the host in dhcpd.conf with an entry like this:

host p550 {
    hardware ethernet 00:02:55:df:d5:dd;
    fixed-address p550.blah;
    next-server storage.blah;
    filename "/tftpboot/debian-squeeze-ppc64-vmlinuz-chrp.initrd";
}
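
Before rebooting the p550, it's worth confirming from another machine that the image is actually fetchable over TFTP; something like this, using the hostname and path from the dhcpd.conf entry above:

$ tftp storage.blah
tftp> get /tftpboot/debian-squeeze-ppc64-vmlinuz-chrp.initrd
tftp> quit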

Boot the machine into OpenFirmware (hit "8" at the firmware "IBM IBM IBM IBM ..." screen), and net-boot from there:

0> boot net console=hvsi0

If you don't boot with the right args from openfirmware, you won't get a working console when you boot into the installer. That's where the "serial port snoop" option from the service processor comes in handy.

Once you get to the end of the installer, you will need to do some magic to get the bootloader (yaboot) installed. Hopefully, the Debian people will get some of this sorted out before the release of Squeeze. Tell the installer that you want a shell, then do this:

# mount --bind /dev /target/dev
# chroot /target
# mount -t proc proc /proc
# yabootconfig
# ybin

Upgrading firmware

Debian doesn't include the update_flash binary in its powerpc-utils package, so download the latest binary release in RPM format.

Convert that to a .deb with alien (apt-get install alien if you don't have it):

# alien powerpc-utils-1.2.3-0.ppc.rpm

then

# apt-get remove powerpc-ibm-utils powerpc-utils
# dpkg -i powerpc-utils_1.2.3-1_powerpc.deb

Now, you can download a new flash image from IBM. Once you get it, use alien to convert and unpack the rpm, and do "update_flash ./tmp/fwupdate/01SF240_403_382", where 01SF240_403_382 is the flash image name from the RPM you downloaded. When you reboot the system, Linux will update the system flash just before rebooting.
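
For the record, the unpack-and-flash step boils down to something like this; rpm2cpio works just as well as alien here, since the RPM only needs to be unpacked rather than installed, and the firmware file names below are illustrative (yours will match the level you downloaded):

# rpm2cpio 01SF240_403_382.rpm | cpio -idmv
# update_flash ./tmp/fwupdate/01SF240_403_382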

Infiniband and beyond

I initially had some problems getting Infiniband set up and going. I'm using a Topspin SDR Infiniband adapter, which is basically a stock Mellanox InfiniHost. It seems that the hypervisor on the machine wasn't allocating all of the resources that the card was asking for.

After some discussion on the linuxppc-dev mailing list, it was pointed out that there are certain slots in the machine which the system calls "super slots", to which the firmware is willing to allocate more resources than a typical PCI-X card would request. This Redbook (PDF) on IBM's redbook site details Infiniband usage on pSeries systems; Section 3.4.3 indicates which slots you may install an Infiniband adapter into on particular machines. On a p550, these are slots C2 and C5. I had plugged my IB adapter into slot C1, which is why I was having problems.

After getting the card into the right slot, it was just a matter of getting the right drivers loaded in the host OS. In order to use IP over Infiniband, you'll want the ib_ipoib module. To use RDMA and the Verbs interface, you'll want the ib_umad and ib_uverbs modules loaded. At that point, it basically acts like a typical Linux system with Infiniband, just with more I/O bandwidth than you can get out of a typical PCI-X based system.
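
For the record, a minimal bring-up on Debian looks something like the following sketch; the HCA driver name (ib_mthca, for these InfiniHost-based cards) and the IP address are just what applies to my setup:

# modprobe ib_mthca
# modprobe ib_ipoib
# modprobe ib_umad
# modprobe ib_uverbs
# ifconfig ib0 10.0.0.2 netmask 255.255.255.0 up

Adding those module names to /etc/modules makes them load again at boot.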

What next?

Setting up an HMC, and playing around with virtualization on the machine sounds like it could be a good time.

Saturday, October 2, 2010

Running an Altix 4700 supercomputer at home

Introduction

At the beginning of the year, work decommissioned the SGI Altix 4700 system that we put into production around January 2007. It sat around unused, and we had little luck finding a buyer for the system - it seems that no one is really commercially interested in running Linux on big Itanium systems anymore.

What's an Altix 4700?

Briefly, an SGI Altix 4700 is a large multi-processor SSI (single-system image) supercomputer, which uses Intel Itanium 2 (in my case, 1.6GHz, dual-core Montecito) processors, and memory on blades, which are interconnected using a "ccNUMA" architecture. This stands for "cache-coherent Non-Uniform Memory Access" - basically, a method of making large SMP-like machines by gluing processors with local memory together with a system interconnect.

With NUMA, unlike SMP, there is memory that is closer to (and thus faster from) each CPU. Like SMP, however, the system is contained in one single address space (unlike, say, a cluster which is connected using Ethernet or Infiniband). It thus runs a single OS image, and looks to the user like one large SMP system.

System Specs

The system that I have is contained within one rack and has 4 "bricks", each with 8 processor blades, each blade containing two dual-core 1.6GHz Montecito processors and 16GB of RAM, plus one system I/O blade with disks, PCI-X slots, Gigabit ethernet, USB, etc., and assorted NUMA routing blades and system controllers.

That totals 128 system cores and 512GB of RAM. The theoretical peak GFLOP rating of the system as configured is approximately 820GF. Of course, 128 CPU cores and 256 DIMMs draw a bit of power...
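
(That 820GF figure is just the Montecito's two FMA units, four floating-point operations per clock per core, multiplied out: 1.6GHz x 4 flops x 128 cores = 819.2 GFLOPS.)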

Powering/cooling a supercomputer

Running as a full system, the computer draws 9kW of power, and requires 2 x 200-240V, 30A power circuits. That's a lot of power. I pay about $0.072/kWh, so running the system for one hour costs me about $0.65.

One issue with running a system that draws 9kW is that you get 9kW of heat output. As a coworker of mine has said, at work we run heaters that produce computation as a side effect. The easiest and most cost-effective solution is to open up some windows and turn on some fans. With an outside temperature of about 60F, I can set up a few fans, open some windows, and keep the temperature inside below 80F.

It is possible to deal with this problem and make the machine a bit friendlier and less power-hungry to run -- you can run less than a full system. By pulling out the blades that you don't want to run, you cut the power usage by a proportional amount. For testing purposes, I have run the system with either 1/2 or 1/8 of the blades installed to reduce my power usage.

One thing I've noticed is that EFI state information (its equivalent of a PC's "CMOS" configuration memory) is stored and updated only on one system blade. So, you really want to make sure that you have blade "0" (the bottom left blade in the chassis marked "001c01") installed, or booting will become much more difficult.

Installing Debian

At work, we ran the machine with SuSE. Due to licensing issues, the fact that SuSE sucks to administer, and the fact that I prefer to run Debian on things, I got to installing Debian. The machine runs EFI, Intel's "next-generation BIOS" that is used on Itanium (IA-64) systems and some x86 (PC-like) systems such as Apple's Intel-based machines. The boot process is pretty close to PXE booting, and Debian seems to have pretty good IA-64 support. The install went smoothly - I ended up installing "Squeeze", the next version of Debian that will be released.

Kernel changes

In general, the Debian kernel just works. However, it only has support for up to 64 CPUs (cores) built in. I downloaded the latest kernel sources from ftp.kernel.org, primed the configuration with the Debian kernel config, adjusted the maximum number of CPUs, recompiled, and rebooted the system into the new kernel.
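
The whole thing was roughly the following sketch; the config filename depends on which Debian kernel you have installed, so treat it as an example:

$ cp /boot/config-$(uname -r) .config
$ make oldconfig
$ make menuconfig        # raise CONFIG_NR_CPUS from 64 to 128
$ make -j 64
# make modules_install

After that it's just a matter of installing the new kernel image and pointing the bootloader (elilo, on Debian ia64) at it before rebooting.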

I should note that being able to do a make -j 64 does a lot to speed up a kernel compile... :)

Running HPL

HPL is the standard benchmark used to test the effective speed of supercomputers, and is the basis for the rankings on the "Top500" list at http://top500.org.

HPL is also contained within the "hpcc" benchmark collection, which is how I ran the benchmark. At first, I tried the Debian package for hpcc, and got some fairly poor results, because it doesn't use an optimized "BLAS" math library. To speed things up, I went to TACC's web site and downloaded GotoBLAS2, compiled that with the added gcc option "-mtune=itanium2", copied it to /opt, and built hpcc from source. The instructions here are useful for building these two software packages.
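
For what it's worth, building hpcc against GotoBLAS2 mostly comes down to creating a Make.<arch> file that points at the library; a rough sketch, where the hpcc version, the arch name, and the GotoBLAS2 path are just my choices rather than anything canonical:

$ cd hpcc-1.4.0
$ cp hpl/setup/Make.Linux_PII_CBLAS Make.ia64
$ # edit Make.ia64: set ARCH = ia64, LAdir = /opt/GotoBLAS2, LAlib = $(LAdir)/libgoto2.a
$ make arch=ia64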

What next?

The system's CPUs aren't all that fast compared to modern CPUs. For example, the rating in Gigaflops of all 128 cores is about 4 x the rating of a 3.2GHz IBM/Sony Cell BE CPU. One place where the system does have an advantage, though, is its 512GB of DDR2 memory. Someone could easily plug an Infiniband, 10 Gigabit Ethernet, or Fiberchannel card into the machine and turn it into a pretty snappy solid-state drive, accessible over iSCSI, FCoE, Infiniband SDP, or straight FC.

The next item that I'm going to work on is getting some FPGA blades from another Altix system working in the 4700, and testing out writing code that uses the FPGAs to speed up compute-intensive tasks. An example of what FPGAs are typically used for is searching a genomics database for a particular DNA sequence. Basically, any algorithm that is applied to a stream of data can be a good candidate for putting into an FPGA.

Tuesday, August 10, 2010

The HTC Evo and a VT125

This follows on from my previous blog post about booting ancient OSes on emulators for ancient computers on your Android phone.

Since I still have an extensive collection of vintage DEC hardware, I decided to extend what I had been working on by connecting some vintage hardware up to the emulated system.

The first thing I had at hand was a DEC VT125 terminal - a close relative of the VT100, which adds some graphics support. A quick power-up verified that it worked. Now, to get it to talk to my phone's emulated VAX.

Now, I didn't have any good way of connecting a serial port directly to my phone, but SIMH does support a telnet connection to the emulator's console by doing a command like this (2301 is the TCP port to accept connections on):

sim> set console telnet=2301

Now, to create a telnet session from my VT125, I grabbed a Xyplex MAXserver 1640 from my basement. These are similar to a vintage DECserver, but support telnet (and many other things) in addition to the usual DEC-specific LAT protocol. Basically, I can hook a terminal up to this and use it to telnet to a host. It also works the other way around, so that I can hook a system's serial port (such as a console port) up to it, telnet to the MAXserver, and connect to the serial console on the physical machine. This latter setup is something we do with systems at work, and is very commonly done, as opposed to connecting serial terminals up to network-attached hosts, like I am trying to do.

For the purposes of this post, I won't go into detail on how to set up a MAXserver, but I basically placed a boot image on a tftp server, told the MAXserver to boot from that image, gave it an IP address, and told it to reset itself to default settings.

To connect the MAXserver to the VT125, I used a RJ45 serial cable (a "roll over" cable) and DB25 to RJ45 adapter, which has the same pinout as a Cisco RJ45 to DB25 DTE serial cable. Ethernet then connects the MAXserver to my home network.


Next, I set my phone to connect over WiFi to my home network, and noted its IP address. I then turned on my VT125, booted the MAXserver, and started up the emulated VAX on my phone. From there, I told the MAXserver to connect to my emulated VAX console:

Xyplex> connect 172.27.3.150:2301

Now, just boot the emulated VAX, and enjoy!


I have more pictures up on my flickr account.

Saturday, August 7, 2010

Compiling SIMH emulators for Android


In order to make this process easier on my phone, I used a rooted firmware. It will take some more effort to get this to work as a packaged application.

Setting up the development environment

To get started, you'll need a working native C compiler for Android. After a lot of trial and error, I discovered what I had to do. I used an amd64 version of Debian GNU/Linux; Ubuntu works the same way. Using a Linux host to compile the code, while maybe not essential, will make your life a lot easier. Follow the official instructions to download the source to Android using git, and then build it.

Along with doing that, I tested this on the Android emulator, which is part of what you just downloaded and built, under out/host/linux-x86/bin. Put that directory in your path, run the "android" command, create a virtual device, and boot it. From the command line, if you built a virtual device named "Android21", for example, you'll want to run "emulator -avd Android21 -shell" so that you can get a shell on the virtual device. To copy files over to the image, the easiest method that I've found is to shut down the emulator and mount the virtual sdcard image (for these examples, I'm using "Android21" as the virtual device name):

$ sudo mount -o loop ~/.android/avd/Android21.avd/sdcard.img /mnt
$ sudo cp whatever /mnt
$ sudo umount /mnt

In addition, you will want the "agcc" script to make it easier to compile things. Download it from here. I modified my copy to use gcc-4.4.0 instead of gcc-4.2.1: change all references to "4.2.1" in the agcc script to "4.4.0". You will also need to add the location of arm-eabi-gcc to your path; it can be found under the prebuilt/linux-x86/toolchain/arm-eabi-4.4.0/bin path of where you built the Android sources.
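
Before trying to build SIMH, it's worth a quick sanity check that agcc produces binaries the emulator (or phone) will actually run; something like this, where hello.c is a trivial printf program and /data/local/tmp is just a convenient writable, executable location:

$ agcc -o hello hello.c
$ adb push hello /data/local/tmp
$ adb shell /data/local/tmp/hello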

Fixing the Android SDK and SIMH makefile

The current version of the Android libc (bionic) headers has a problem compiling when bionic/libc/kernel/arch-arm/asm/byteorder.h is included. In order to make the file compile, comment out these lines (lines 22-27 in my copy):

/*#ifndef __thumb__
* if (!__builtin_constant_p(x)) {
*
* asm ("eor\t%0, %1, %1, ror #16" : "=r" (t) : "r" (x));
* } else
*#endif */

Once you do that, you will need to download the SIMH sources. Unpack them, and modify the makefile in the top directory to make the following changes:

On line 12, remove -lrt from OS_CCDEFS, so that it reads:
OS_CCDEFS = -lm -D_GNU_SOURCE

Change all references of "gcc" to "agcc".

Only some of the simulators actually compile, and I haven't tried to compile network support for any of them. In order to get network support, you would need to compile libpcap as well. For now, that is left as an exercise for the reader. :)

After those changes, I just did "make vax vax780" to build the MicroVAX 3900 and VAX-11/780 simulators, which can then be copied over to your phone or emulator. You will probably get a bunch of warnings about the use of variable-size enums versus 32-bit enums; they seem to be harmless and can safely be ignored.

If you're too lazy to compile it

If you don't feel like spending the time to set up a development environment and compile SIMH yourself, I have a pre-compiled version for Android available. I used SIMH v3.8-1, which is the newest release as of this posting. You can get a pre-compiled copy of the Android emulator to test on from the pre-built Android SDK. After you download and unpack that, you will need to put the tools directory inside the SDK into your path.

Preparing your phone and copying things over

You will need a rooted phone to make this work easily. Rooting your phone is left as an exercise for the reader. Once you have root permissions, you will need to enable USB debugging on your phone. On my Evo, it's under Settings -> Applications -> Development -> USB debugging.

You will need an application that will work as a terminal emulator on your phone. You can use "adb shell" from the Android SDK, which will give you a shell from your phone on your computer. To run completely hosted from the phone, use something like ConnectBot.

From this point you can copy files necessary for the emulator to your phone. I copied the necessary files all to my phone's SD card:

$ adb shell mkdir /sdcard/simh
$ adb push BIN/vax /sdcard/simh
$ adb push VAX/ka655x.bin /sdcard/simh
...

Now, use adb shell to do a few things on the phone itself. Pay attention to what your phone is saying, as the first time you try to "su" on your phone, it may pop up a dialog asking if this is ok. Whether or not you see that will depend on exactly how you rooted your phone. If "adb shell" gives you a "#" prompt straight away, you don't need to use su, as you're already root.

$ adb shell
$ su -
# mkdir /data/simh
# cat /sdcard/simh/vax > /data/simh/vax
# chmod 755 /data/simh/vax

It is necessary to put any executables on the internal storage (e.g., in /data/simh like I did), because you cannot directly execute binaries from the sdcard (it's mounted noexec), at least in Android 2.1.

Running a SIMH emulator

Once you've done that, as root on the phone, do something like:

# cd /sdcard/simh
# /data/simh/vax

And then just use SIMH as you normally would.

ConnectBot allows for multiple simultaneous sessions, so you could do "set console telnet=2300" inside the emulator, and then open another telnet session in ConnectBot to 127.0.0.1:2300 to connect to a separate console. ConnectBot presents itself as a "screen" terminal, which seems to do an adequate VT100 emulation for most things. The most I've tested it with so far is running /usr/games/worms from 4.3BSD, after "setenv TERM vt100". If you have your device on a WiFi network, you could even use another machine to telnet into SIMH and act as the console, or as another emulated serial terminal on the system.

That's it! If you don't know what to do from here, take a look around the SIMH site; you can run things like ancient versions of UNIX, OpenVMS through HP's hobbyist program, or NetBSD, or play with the other emulators and OSes. Where you go from here is up to you.

EDIT Aug 8, 2010:

I forgot to add the pictures that I've taken of this. Check them out on flickr. I've also now added a picture at the top.

Friday, August 6, 2010

Back again

After a bit of a hiatus from posts, I'm back again.

I've cleared a big hurdle in the start of downsizing and focusing my collection of computers and interesting technology: after finally buying a house, I have completely finished moving out of a ~2500 sq ft warehouse that I was using to store the collection. At its peak, it was stacked high with equipment, and paths to navigate through the space were sometimes non-existent.

I have since resolved to limit the size of my collection, and thus the number of projects that I want to take on. As a direct result of the massive downsizing effort (with critical help from friends on some larger items and a push to finish up towards the end of July, but otherwise mostly done by myself from January through July of 2010), I now have time to work on projects instead of spending all of my time moving things back and forth.

I replaced my old iPhone 1.0 with an HTC Evo running Android back in June when they came out, and have been exceedingly happy with it. Because it is so easy to hack and develop on, I have started some projects involving running computer simulators on it. Currently, I have the pair of VAX emulators (MicroVAX 3900 and VAX-11/780) from Bob Supnik's SIMH emulator collection running both 4.3BSD and OpenVMS, with a minimal amount of work.

I have posted pictures of 4.3BSD running on both the Android emulator and my phone on my Flickr account.

My next post will describe what I had to do to get Simh to compile and run on my phone, and after playing with that, my next target is the Hercules IBM Mainframe emulator.

Saturday, February 28, 2009

New blog

I've created a new blog to keep track of the progress we're making at work on our new Coates cluster that we're putting in this spring. I know I haven't posted here in a while, but I expect to spend more time posting to that one (and I want to keep the content separate from my personal blog anyways).