This chapter describes how to build a Beowulf cluster using Debian GNU/Linux and FAI. It was written for FAI version 2.x for Debian woody and has not been updated since. The example configuration files were removed from the fai packages after FAI 2.8.4.
For more information about the Beowulf concept look at http://www.beowulf.org.
The example Beowulf cluster consists of one master node and 25 clients. All cases were mounted in one large rack, together with a keyboard and a monitor that are connected to the master server most of the time. Because the monitor and keyboard cables are very long, they can also be connected to any node when something has to be changed in the BIOS or when you are looking for errors on a node that does not boot. Power supply is another topic you have to think about. Don't connect many nodes to one power cord and one outlet; distribute them among several breakout boxes and outlets. Heat is a further concern: a dozen nodes in a small room can create too much heat, so you will need an air conditioner. Also check whether the power supplies of the nodes go to stand-by mode after a power failure or whether all nodes turn on simultaneously.
All computers in this example are connected to a Fast Ethernet switch. The master node (or master server) is called nucleus. It has two network cards: one for the connection to the external Internet and one for the connection to the internal cluster network. From the external Internet it is reached as nucleus, but the cluster nodes access the master node under the name atom00, which is the name of the second network interface. The master server is also the install server for the computing nodes. A local Debian mirror will be installed on its local hard disk. The home directories of all user accounts are also located on the master server and are exported via NFS to all computing nodes. NIS will be used to distribute account, host, and printer information to all nodes.
All client nodes atom01 to atom25 are connected via the switch to the second interface card of the master node. They can only connect to the other nodes or the master and can't communicate with any host outside their cluster network. So, all services (NTP, DNS, NIS, NFS, …) must be available on the master server. I chose the class C network address 192.168.42.0 for building the local Beowulf cluster network. You can replace the subnet 42 with any other number you like. If you have more than 253 computing nodes, choose a class A network address (10.X.X.X).
While preparing the installation, you have to boot the first install client many times, until there's no fault in your configuration scripts. Therefore you should have physical access to the master server and one client node. Connect both computers to a switch box, so one keyboard and monitor can be shared between them.
The master server will be installed by hand if it is your first computer installed with Debian. If you already have a host running Debian, you can also install the master server via FAI. Create a partition on /files/scratch/debmirror for the local Debian mirror with more than 22 GB space available.
Add the following lines for the second network card to /etc/network/interfaces:
# Beowulf cluster connection
auto eth1
iface eth1 inet static
    address 192.168.42.250
    netmask 255.255.255.0
    broadcast 192.168.42.255
Add the IP addresses for the client nodes. The FAI package has an example for this /etc/hosts file:
# create these entries with the Perl one liner
# perl -e 'for (1..25) {printf "192.168.42.%s atom%02s\n",$_,$_;}'
# Beowulf nodes
# atom00 is the master server
192.168.42.250 atom00
192.168.42.1 atom01
192.168.42.2 atom02
You can give the internal Beowulf network a name by adding this line to /etc/networks:
beowcluster 192.168.42.0
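The /etc/hosts entries for the nodes can also be generated with a small shell function instead of the Perl one-liner. The following is only a sketch: the function name gen_hosts and its two arguments are illustrative and not part of FAI, and the defaults assume the 192.168.42.0 network and the atomNN naming used in this chapter.

```shell
#!/bin/sh
# Sketch: print /etc/hosts entries for the cluster network.
# gen_hosts and its arguments are illustrative names, not part of FAI.
gen_hosts() {
    count=${1:-25}            # number of client nodes
    subnet=${2:-192.168.42}   # first three octets of the cluster network
    # atom00 is the master server's name on the internal network
    echo "${subnet}.250 atom00"
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s.%d atom%02d\n' "$subnet" "$i" "$i"
        i=$((i + 1))
    done
}

# Print the entries for 25 nodes; review them before adding to /etc/hosts
gen_hosts 25
```

Redirect the output to a temporary file and check it before appending it to /etc/hosts on the master server.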
Activate the second network interface with: /etc/init.d/networking start.
Add a normal user account tom which is the person who edits
the configuration space and manages the local Debian mirror:
# adduser tom
# addgroup linuxadmin
This user should also be in the group linuxadmin.
# adduser tom linuxadmin
First, set the NIS domain name by creating the file
/etc/defaultdomain and calling domainname(8). To initialize the
master server as a NIS server, call /usr/lib/yp/ypinit -m. Also edit
/etc/default/nis so the host becomes a NIS master server. Then
copy the file netgroup from the examples directory to /etc and
edit other files there. Adjust access to the NIS service.
/etc/ypserv.securenets:
# Always allow access for localhost
255.0.0.0 127.0.0.0
# This line gives access to the Beowulf cluster
255.255.255.0 192.168.42.0
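Each line in ypserv.securenets is a netmask followed by a network, and a client is accepted when its address ANDed with the netmask equals the network. The sketch below only illustrates that matching rule in plain shell arithmetic; the helper names ip_to_int and securenets_allows are made up for this example and are not part of the NIS tools.

```shell
#!/bin/sh
# Sketch of the netmask/network matching rule (helper names are hypothetical).

# Convert a dotted quad such as 192.168.42.7 to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# securenets_allows ADDRESS NETMASK NETWORK
# Succeeds if (ADDRESS AND NETMASK) equals NETWORK.
securenets_allows() {
    addr=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    net=$(ip_to_int "$3")
    [ $(( $addr & $mask )) -eq "$net" ]
}

# A cluster node matches the second line of the file shown above
if securenets_allows 192.168.42.7 255.255.255.0 192.168.42.0; then
    echo "192.168.42.7 is allowed"
fi
```

A host outside the cluster network, for example 192.168.43.7, fails the same check and would be refused.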
Rebuild the NIS maps:
master# cd /var/yp; make
You will find much more information about NIS in the NIS-HOWTO document.
Now the user tom can create a local Debian mirror on
/files/scratch/debmirror using mkdebmirror. You can add the option
--debug to see which files are received. This will need about 22 GB
disk space for Debian 3.0 (aka woody). Export this directory to the
netgroup @faiclients read only. Here's the line for /etc/exports:
/files/scratch/debmirror @faiclients(ro)
Add the following packages to the install server:
nucleus:/# apt-get install ntp tftpd-hpa dhcp3-server \
    nfs-kernel-server etherwake fai
nucleus:/# tasksel -q -n install dns-server
nucleus:/# apt-get dselect-upgrade
Configure NTP so that the master server will have the correct system time.
It's very important to use the internal network name atom00 for the
master server (not the external name nucleus) in
/etc/dhcp3/dhcpd.conf and make-fai-nfsroot.conf. Replace the
strings FAISERVER with atom00 and uncomment the following line in
make-fai-nfsroot.conf so the Beowulf nodes can use the name for
connecting to their master server.
NFSROOT_ETC_HOSTS="192.168.42.250 atom00"
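The replacement of FAISERVER can be done with sed instead of by hand. This is a minimal sketch: the function name set_faiserver is invented for this example, and the demonstration line is made up, not taken from the real configuration files. The function writes to standard output, so the original file is left untouched.

```shell
#!/bin/sh
# Sketch: replace every occurrence of FAISERVER with a host name.
# set_faiserver is a hypothetical helper, not part of FAI.
# Usage: set_faiserver NAME [FILE...]   (reads stdin if no FILE is given)
set_faiserver() {
    name=$1
    shift
    sed "s/FAISERVER/${name}/g" "$@"
}

# Demonstration on a made-up line
echo 'server-name "FAISERVER";' | set_faiserver atom00
```

For a real file, write the result to a new file first, for example set_faiserver atom00 /etc/dhcp3/dhcpd.conf > /tmp/dhcpd.conf.new, compare it with the original, and only then move it into place.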
Set up the install server daemon as described in [pxeboot]. If you have many cluster nodes (more than about 10) and use rsh in fai.conf, raise the number of connections per minute allowed for some services in inetd.conf:
shell stream tcp nowait.300 root /usr/sbin/tcpd /usr/sbin/in.rshd
login stream tcp nowait.300 root /usr/sbin/tcpd /usr/sbin/in.rlogind
The user tom should have permission to create the
symlinks for booting via network card, so change the group and add
some utilities.
# chgrp -R linuxadmin /srv/tftp/fai; chmod -R g+rwx /srv/tftp/fai
# cp /usr/share/doc/fai-doc/examples/utils/* /usr/local/bin
Now, the user tom sets the boot image for the first Beowulf
node.
$ fai-chboot -IFv atom01
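Once the first node installs cleanly, the same fai-chboot call can be repeated for the remaining nodes. The loop below is a sketch under the naming scheme of this chapter (atom01 to atom25); the function chboot_all and the RUN variable are illustrative, not part of FAI. By default it only prints the commands, so nothing is changed until you run it with RUN set to an empty value.

```shell
#!/bin/sh
# Sketch: call fai-chboot for a range of nodes.
# RUN is put in front of each command: "echo" (the default) only prints
# the command, an empty value executes it.
RUN=${RUN-echo}

# chboot_all FIRST LAST, e.g. chboot_all 1 25 for atom01 to atom25
chboot_all() {
    i=$1
    last=$2
    while [ "$i" -le "$last" ]; do
        $RUN fai-chboot -IFv "$(printf 'atom%02d' "$i")"
        i=$((i + 1))
    done
}

# Dry run: show what would be executed for all 25 nodes
chboot_all 1 25
```

Save the block as a script and start it with RUN= in the environment to execute the commands instead of printing them.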
Now boot the first client node for the first time. Then start to adjust the configuration for your client nodes.
The following tools are useful for a Beowulf cluster:
all_hosts
gets the list of all hosts that are up. You can also use the dsh(1) command
(dancer's shell, or distributed shell).
rup(1) briefly shows the CPU load of every host.
These are some common tools for a cluster environment:
rgang is a tool which executes commands
on, or distributes files to, many nodes. It uses an algorithm to build a
tree-like structure to allow the distribution processing time to scale
very well to 1000 or more nodes (available at
http://fermitools.fnal.gov/abstracts/rgang/abstract.html).
jmon(1) which installs a simple daemon on every cluster
node.
Wake on LAN is a very nice feature to power on a computer without having physical access to it. By sending a special Ethernet packet to the network card, the computer will be turned on. The following things have to be done to use the Wake on LAN (WOL) feature.
To wake up a computer use the command etherwake(8).
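To power on several nodes, etherwake can be called once per MAC address. The following sketch reads "hostname MAC" pairs from standard input; the function wake_nodes, the RUN variable, and the MAC addresses are all made up for illustration. As written it only prints the etherwake commands.

```shell
#!/bin/sh
# Sketch: wake every node listed as "hostname MAC" on standard input.
# RUN is put in front of each command: "echo" (the default) only prints
# the command, an empty value executes it.
RUN=${RUN-echo}

wake_nodes() {
    while read -r host mac; do
        # Skip empty lines and comment lines
        case $host in
            ''|'#'*) continue ;;
        esac
        $RUN etherwake "$mac"
    done
}

# The MAC addresses below are placeholders, not real hardware addresses
wake_nodes <<'EOF'
# host  MAC
atom01 00:11:22:33:44:01
atom02 00:11:22:33:44:02
EOF
```

Keeping the host name next to each MAC address makes the list easier to maintain, even though only the MAC address is passed to etherwake.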