Facility


Computer Cluster: HAGAR

Hagar the Horrible is, of course, only a cartoon character; drawn by Chris Browne, he first appeared in 1972. His popularity soon grew, and the strip now appears in more than 1,900 newspapers in 58 countries around the world and has been translated into 13 languages. In keeping with the Scandinavian feel that the name Beowulf implies, and with my love of cartoons, it was decided that our do-it-yourself supercomputer should share its name with one of my favorite cartoon characters.

The following description of our system, together with a brief overview of the installation process we used, will hopefully prove useful to anybody who is considering buying or building their own Beowulf cluster. More details of the issues involved in designing and building these types of clusters can be found online or in Sterling et al.'s book How to Build a Beowulf.

[Photo: lab.png]

THE HARDWARE

Hagar currently consists of three Beowulf computer clusters.

The first cluster is part of the NSF Science and Technology Center for Environmentally Friendly Solvents and Processes and consists of 17 Compaq AlphaServer DS10s and 33 Dell Precision 330 computers. Each Alpha node contains a 466 MHz 21264 processor with 256 MB of memory and a 9 GB hard drive; the nodes communicate over a 100 Mbps Fast Ethernet local connection using a Nortel Networks BayStack 450-24T switch. The Precision 330 nodes contain a 1.7 GHz Pentium 4 processor, 1 GB of memory, and a 20 GB hard drive each, and communicate over a 100 Mbps Fast Ethernet local connection using a 3Com SuperStack 3 switch. The new cluster to be purchased as major equipment in the proposed grant will replace this 8-year-old cluster.

The second cluster consists of 41 HP d530 computers. Each HP node contains a 3.0 GHz Pentium 4 processor, 1 GB of memory, and a 40 GB hard drive, and communicates over a 1000 Mbps Gigabit Ethernet local connection using two NetGear GS524T switches.

The third cluster consists of 10 Dell PowerEdge 1950 III nodes. Each PowerEdge node contains two 3.16 GHz quad-core Xeon processors (8 cores per node), 16 GB of memory, and a 250 GB hard drive, and communicates over a 1000 Mbps Gigabit Ethernet local connection using a Dell switch.

Two storage units are used to store and back up data: a Dell PV 220S (700 GB) and a Dell MD1000 (4 TB).

SOFTWARE INSTALLATION

The operating system we use is Red Hat Linux 7.1; for more information and free downloads, go to www.redhat.com. Each internal node was configured identically, with the same partition sizes, a bare minimum of software (i.e. system and network software only), and the same configuration files (except, of course, for the parts that refer to that machine's identity). Unfortunately, the configuration of each machine had to be done manually, since the cloning methods described elsewhere require a boot disk, and the kernel for the Alpha system is larger than 1.4 MB and therefore cannot be copied onto a floppy disk. The master node was set up differently, with the full Linux installation being used. User accounts are stored only on this machine, and third-party software is installed only on this machine as well.
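
As an illustration of what the machine-specific parts of the configuration amount to on a Red Hat system, the per-node differences are essentially the hostname and IP address; a sketch of the two relevant files is shown below (the hostname and addresses are made up for the example and are not our actual values):

# /etc/sysconfig/network -- hostname differs per node
NETWORKING=yes
HOSTNAME=a001.hagar.unc.edu

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- IP address differs per node
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.1.101
NETMASK=255.255.255.0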

In order to allow users to log in to the internal nodes (necessary for them to use MPI and the queuing system), all the nodes, with their aliases, are listed in the /etc/hosts.equiv and /etc/shosts.equiv files, thus eliminating the need for users to create .rhosts and .shosts files (a sketch of these files is given after the PAM listing below). The /etc/pam.d/rlogin file was also modified to make sure that a secure tty is not required when logging in to the internal nodes. The resulting file ends up looking like:

auth optional /lib/security/pam_securetty.so
auth sufficient /lib/security/pam_rhosts_auth.so
auth required /lib/security/pam_pwdb.so shadow nullok
auth required /lib/security/pam_nologin.so
account required /lib/security/pam_pwdb.so
password required /lib/security/pam_cracklib.so
password required /lib/security/pam_pwdb.so nullok use_authtok md5 shadow
session required /lib/security/pam_pwdb.so
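
For completeness, a minimal sketch of the /etc/hosts.equiv and /etc/shosts.equiv contents is shown here; each internal node is listed both by its full name and by its short alias (the names follow the aNNN scheme used later in this section, but the exact entries are illustrative):

a000.hagar.unc.edu
a000
a001.hagar.unc.edu
a001
......
a016.hagar.unc.edu
a016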

Software is shared over our internal network by exporting the scratch partition to all other nodes. To achieve this, the line

/scratch a???.hagar.unc.edu(rw,no_root_squash)

is added to the /etc/exports file.

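Whenever /etc/exports is changed, the NFS server has to be told to re-read it; on our Red Hat systems this can be done with something along the lines of the following (shown as a sketch rather than a transcript of our procedure):

exportfs -ra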

To set up the automounting service (autofs), the /etc/auto.master file has the line

/scratch/nodes /etc/auto.beowulf --timeout 600

added. The file /etc/auto.beowulf has to be created and contains the entries

a000 -fstype=nfs a000:/scratch
a001 -fstype=nfs a001:/scratch
a002 -fstype=nfs a002:/scratch
......
a016 -fstype=nfs a016:/scratch

User accounts (held on the master node) are also mounted this way with the line

home -fstype=nfs a000:/home

also being added to the /etc/auto.beowulf file. Symbolic links were then set up on the internal nodes, such that the /home path points to the /scratch/nodes/home path, i.e.

ln -s /scratch/nodes/home /home
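
Once the map files are in place, the automounter on each internal node has to be enabled and (re)started so that it picks up the new maps; on Red Hat this can be done with something like the following (a sketch, not a transcript of our exact commands):

/sbin/chkconfig autofs on
/etc/rc.d/init.d/autofs restart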

On the master node a large /usr partition was created, and the directory /usr/local, where third-party software is installed, is automounted by the internal nodes. Thus this software only has to be installed and maintained on one machine.
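
The /usr/local automount works in the same way as the /home entry above; an illustrative (not verbatim) version would be an extra line in /etc/auto.beowulf together with a matching symbolic link on each internal node, with /usr/local also being exported from the master node in /etc/exports:

local -fstype=nfs a000:/usr/local

ln -s /scratch/nodes/local /usr/local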

The version of MPI employed is LAM 6.3.2, which was obtained from www.mpi.nd.edu/lam/; the DQS queuing system is also employed and can be obtained from www.scri.fsu.edu/~pasko/dqs.html. It is also recommended that the Compaq Fortran and C compilers be obtained (details can be found at www.compaq.com), since we observed a three-fold speed-up compared with the GNU compilers.
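
As a quick illustration of how jobs are built and run with LAM (the program name and process count here are made up for the example), a typical interactive session looks something like:

lamboot -v lamhosts              # start the LAM daemons on the nodes listed in the lamhosts file
mpicc -O2 -o myprog myprog.c     # compile with the LAM wrapper compiler (hcc on older LAM versions)
mpirun -np 8 ./myprog            # run the program on 8 processes
wipe -v lamhosts                 # shut the LAM daemons down again

Production jobs are, of course, submitted through the DQS queuing system rather than run interactively in this way.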

The current state of jobs and queues can be viewed online (access from lab computers only).