Computational physics raises several concerns when a complex problem must be solved both correctly and efficiently. One of them is how to speed up the simulation. One solution is to use or buy an expensive parallel supercomputer, but for most research groups this is not practical. Thanks to advances in PC technology, we can build a parallel computing cluster entirely from commodity parts, in both hardware and software, within an affordable budget.
The original PC cluster project, also called the Beowulf project, was started at NASA's Center of Excellence in Space Data and Information Sciences (CESDIS) in early 1994. A Beowulf cluster usually consists of one master (server) node and one or more client nodes connected via Ethernet. The master node controls the whole cluster and serves files to the client nodes. It is also the cluster's console and its gateway to the outside Internet.
The advantages of a Beowulf-like cluster are:
Two Pentium III 1 GHz CPUs, 512 MB RAM, three 3Com 3c905c Ethernet cards, one 30 GB hard disk, one floppy drive, one VGA card, one monitor
Two Pentium III 1 GHz CPUs, 512 MB RAM, two 3Com 3c905c Ethernet cards, one 30 GB hard disk, one floppy drive, one VGA card, one monitor
One Pentium III 1 GHz CPU, 512 MB RAM, two 3Com 3c905c Ethernet cards, no hard disk, one floppy drive, one VGA card (for debugging)
D-Link DES-1016R, D-Link DFE-916DX
The server node needs a full installation of the Red Hat Linux OS.
The server must prepare a simplified kernel image for the client nodes to download.
The server must prepare a minimum root filesystem for the client nodes to mount.
Each client node must have a network boot disk so that it can boot from its floppy drive and then request the kernel image over the network.
The client node finds the kernel image on the server, downloads it, uncompresses it, and executes it.
The client kernel then mounts its root filesystem as an NFS-root filesystem.
Based on the above analysis, we need to understand the Linux kernel, the root filesystem, and the boot process.
The Linux kernel is a multi-process, multi-user system. It contains several components, such as process management, memory management, filesystems, device control, and networking. It responds to user requests by allocating CPU, RAM, I/O devices, and networking resources in a fair way. In short, the Linux kernel is a big chunk of executable code in charge of handling all such requests. For the system to be functional, the first step is to download and execute the kernel.
The size of the kernel depends entirely on the application. For example, if you don't need PCMCIA, you don't need to include it in the kernel. In general, a bigger kernel provides more services but uses more memory and CPU time, which slows the system down. The server kernel is fairly big because we ask it to do many things. The client kernel is comparatively small because it simply executes the programs assigned by the server.
To install the kernel source, unpack it under /usr/src:
cd /usr/src
gzip -cd linux-2.2.XX.tar.gz | tar xvf -
The -cd options tell gzip to decompress the file and send the output to stdout, where tar reads it. Replace "XX" with the version number of the latest kernel.
Make sure no stale .o files and dependencies are lying around:
cd /usr/src/linux
make mrproper
Run make config to configure the basic kernel. make config needs bash to work: it will search for bash in $BASH, /bin/bash and /bin/sh (in that order), so one of those must be correct for it to work. For further information about the kernel configuration, see Documentation/Configure.help. Do not skip this step even if you are only upgrading one minor version.
Alternative configuration commands are:
make menuconfig : Text-based menus.
make xconfig : X Window based configuration tool.
make oldconfig : Default all questions based on the contents of your existing ./.config file.
Run make dep to set up all the dependencies correctly.
Run make zImage or make bzImage to create a compressed kernel image.
If you configured any parts of the kernel as modules, you will have to run make modules followed by make modules_install. Read Documentation/modules.txt for more information.
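The build steps above can be collected into one small script. This is only a sketch: the helper names run and build_kernel and the DRY_RUN and KERNEL_SRC variables are introduced here for illustration, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: the 2.2-series kernel build steps in the order described above.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}
KERNEL_SRC=${KERNEL_SRC:-/usr/src/linux}   # assumed location of the unpacked source

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_kernel() {
    run cd "$KERNEL_SRC" &&
    run make mrproper &&
    run make config &&
    run make dep &&
    run make bzImage &&
    # The two module steps are only needed if parts of the kernel were
    # configured as modules.
    run make modules &&
    run make modules_install
}

build_kernel
```

Each step is chained with &&, so a failing step stops the sequence when the commands are actually executed.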
Besides the kernel, you also need a root filesystem to host programs, configuration files, and data. Creating the root filesystem involves selecting the files necessary for the system to run.
A root filesystem must contain everything needed to support a full Linux system. To be able to do this, the disk must include the minimum requirements for a Linux system:
The basic directory structure: /dev, /proc, /bin, /etc, /lib, /usr, /tmp
A basic set of utilities: sh, ls, cp, mv, etc.
A minimum set of configuration files: rc, inittab, fstab, etc.
Devices: /dev/hd*, /dev/tty*, /dev/fd0, etc.
In order to build such a root filesystem, you need a spare device that is large enough to hold all the files before compression. There are several choices: here we choose ramdisk.
Use a ramdisk (DEVICE=/dev/ram0). In this case, memory is used to simulate a disk drive. To learn how to use a ramdisk, see How to Use a Ramdisk for Linux.
Prepare the DEVICE with:
dd if=/dev/zero of=/dev/ram0 bs=1k count=4096
This command zeros out the device. Zeroing the device is critical because the filesystem will be compressed later, so all unused portions should be filled with zeros to achieve maximum compression.
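The effect of zeroing on compression can be checked with a small experiment. The sketch below uses ordinary temporary files rather than /dev/ram0, so it needs no root access; the file names are made up for the demonstration, and /dev/urandom stands in for a device that still holds old, non-zero data.

```shell
#!/bin/sh
# Compare how well a zero-filled image and a random-filled image compress.
workdir=$(mktemp -d)

# Two 4 MB images, the same size as the ramdisk prepared above.
dd if=/dev/zero    of="$workdir/zero.img"   bs=1k count=4096 2>/dev/null
dd if=/dev/urandom of="$workdir/random.img" bs=1k count=4096 2>/dev/null

# Compressed sizes in bytes.
zero_size=$(( $(gzip -9 -c "$workdir/zero.img" | wc -c) ))
random_size=$(( $(gzip -9 -c "$workdir/random.img" | wc -c) ))

echo "zero-filled image:   $zero_size bytes after gzip -9"
echo "random-filled image: $random_size bytes after gzip -9"

rm -rf "$workdir"
```

The zero-filled image shrinks to a few kilobytes, while the random-filled one stays at roughly its original size, which is why unused blocks should be zero before the filesystem is compressed.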
Next, create the filesystem.
mke2fs -m 0 -N 2000 /dev/ram0
Next, make a mounting point and mount the device.
mkdir -p /tmp/ramdisk
mount -t ext2 /dev/ram0 /tmp/ramdisk
Here is a reasonable minimum set of directories for your root filesystem.
/dev -- Device files, required to perform I/O
/proc -- Directory stub required by the proc filesystem
/etc -- System configuration files
/sbin -- Critical system binaries
/bin -- Essential binaries considered part of the system
/mnt -- A mount point for maintenance on other disks
/usr -- Additional utilities and applications
First, create the directories listed above.
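cd /tmp/ramdisk
mkdir dev proc etc sbin bin mnt usr usr/lib
To populate /dev, copy the device files you need from the server:
cp -dpR /dev/fd[01]* /tmp/ramdisk/dev
cp -dpR /dev/tty[0-6] /tmp/ramdisk/dev
or create them with mknod (run inside /tmp/ramdisk/dev), for example:
mknod console c 5 1
For the detailed contents of the root filesystem, see ramdisk.tar.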
Finally, after you set up all the libraries you need, run ldconfig to remake /etc/ld.so.cache on the root filesystem. The cache tells the loader where to find the libraries. You can do this with
ldconfig -r /tmp/ramdisk
When you have finished constructing the root filesystem, unmount it, copy it to a file and compress it:
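umount /tmp/ramdisk
dd if=/dev/ram0 bs=1k | gzip -v9 > rootfs.gz
To place the compressed root filesystem on a floppy after the kernel image, write it at the offset KERNEL_BLOCK, the number of 1 KB blocks occupied by the kernel:
dd if=rootfs.gz of=/dev/fd0 bs=1k seek=KERNEL_BLOCK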
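A value for KERNEL_BLOCK can be derived from the size of the kernel image. The sketch below assumes the kernel was written to the start of the floppy with bs=1k, so the offset is its size rounded up to whole 1 KB blocks. The function name kernel_blocks is hypothetical, and a 500000-byte dummy file stands in for a real bzImage.

```shell
#!/bin/sh
# Print the size of a file in 1 KB blocks, rounded up.
kernel_blocks() {
    # $1: path to the kernel image
    size=$(wc -c < "$1")
    echo $(( (size + 1023) / 1024 ))
}

# Stand-in for a real kernel image: a dummy file of 500000 bytes.
image=$(mktemp)
trap 'rm -f "$image"' EXIT
head -c 500000 /dev/zero > "$image"

KERNEL_BLOCK=$(kernel_blocks "$image")
echo "seek=$KERNEL_BLOCK"    # 500000 bytes round up to 489 blocks
```

Rounding up matters: truncating instead would let the root filesystem overwrite the last partial block of the kernel.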
All PC systems start the boot process by executing code in ROM (specifically, the BIOS) to load the sector at sector 0, cylinder 0 of the boot drive. The boot drive is usually the first floppy drive (/dev/fd0) or the first hard disk (/dev/hda). The BIOS then tries to execute this sector. On most bootable disks, sector 0, cylinder 0 contains either code from a boot loader such as LILO, which locates the kernel, loads it, and executes it, or the start of an operating system kernel such as Linux.
When the kernel is completely loaded, it initializes device drivers and its internal data structures. Once it is completely initialized, it consults a special location in its image called the ramdisk word. This word tells it how and where to find its root filesystem. A root filesystem is simply a filesystem that will be mounted as '/'. The kernel has to be told where to look for the root filesystem; if it cannot find a loadable image there, it halts.
In some boot situations, often when booting from a diskette, the root filesystem is loaded into a ramdisk, which is RAM accessed by the system as if it were a disk. The kernel can also load a compressed filesystem from the floppy and uncompress it onto the ramdisk, allowing many more files to be squeezed onto the diskette.
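For floppy-based boots, the ramdisk word can be computed with simple bit arithmetic. The sketch below assumes the layout documented for 2.2-era kernels in the Linux Bootdisk HOWTO: bits 0-10 hold the offset of the root filesystem in 1 KB blocks, bit 14 requests loading into a ramdisk, and bit 15 asks the kernel to prompt for a second diskette. The function name ramdisk_word is made up for this example.

```shell
#!/bin/sh
# Compute the ramdisk word from a block offset and an optional prompt flag.
ramdisk_word() {
    # $1: offset of the root filesystem in 1 KB blocks (must fit in 11 bits)
    # $2: 1 to prompt before loading the root filesystem, 0 otherwise (default 0)
    offset=$1
    prompt=${2:-0}
    if [ "$offset" -lt 0 ] || [ "$offset" -gt 2047 ]; then
        echo "offset must be between 0 and 2047" >&2
        return 1
    fi
    echo $(( offset | (1 << 14) | (prompt << 15) ))
}

ramdisk_word 489      # 489 + 16384 = 16873
ramdisk_word 489 1    # 489 + 16384 + 32768 = 49641
```

On systems of that era the resulting value was typically written into the kernel image with rdev -r. A client that mounts its root over NFS, as in this cluster, does not depend on this word.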
Once the root filesystem is loaded and mounted, you see a message like:
VFS: Mounted root (ext2 filesystem) readonly.
Once the system has loaded a root filesystem successfully, it tries to execute the init program (in /bin or /sbin). init reads its configuration file /etc/inittab, looks for a line designated sysinit (for example, one naming /etc/rc.d/rc.sysinit), and executes the named script. This script is a set of shell commands that set up basic system services, such as running fsck on hard disks, loading necessary kernel modules, initializing swapping, initializing the network, and mounting the disks mentioned in /etc/fstab.
The script often invokes various other scripts to do modular initialization. For example, in the common SysVinit structure, the directory /etc/rc.d contains a complex structure of subdirectories whose files specify how to enable and shut down most system services. However, on a bootdisk the sysinit script is often very simple.
When the sysinit script finishes, control returns to init, which then enters the default runlevel, specified in /etc/inittab with the initdefault keyword.
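To make the role of the sysinit and initdefault entries concrete, the sketch below extracts both from an inittab. The sample file is a hypothetical minimal inittab written to a temporary location. It relies only on the standard id:runlevels:action:process field layout of SysVinit.

```shell
#!/bin/sh
# Extract the sysinit script and the default runlevel from an inittab.
inittab=$(mktemp)
trap 'rm -f "$inittab"' EXIT

# Hypothetical minimal inittab, as might be found on a boot disk.
cat > "$inittab" <<'EOF'
# id:runlevels:action:process
id:3:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
1:2345:respawn:/sbin/mingetty tty1
EOF

# Field 3 is the action; field 2 holds the runlevels, field 4 the process.
default_runlevel=$(awk -F: '!/^#/ && $3 == "initdefault" { print $2 }' "$inittab")
sysinit_script=$(awk -F: '!/^#/ && $3 == "sysinit" { print $4 }' "$inittab")

echo "sysinit script:   $sysinit_script"
echo "default runlevel: $default_runlevel"
```

Pointing the two awk commands at /etc/inittab on the server or in a client root filesystem shows which script init will run first and which runlevel it will enter afterwards.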
Next, you must prepare a network boot disk for the client computers as follows:
Download etherboot-4.0 and etherboot-4.7.24 from http://www.slug.org.au/etherboot. Take the file floppyload.bin from etherboot-4.0/bin and the file 3c905c-tpo.lzrom from etherboot-4.7.24/src/bin32, then enter the following command as superuser to make a network boot floppy:
# cat floppyload.bin 3c905c-tpo.lzrom > /dev/fd0
Note: to get 3c905c-tpo.lzrom, you must first run make in etherboot-4.7.24/src. For details, see the INSTALL instructions.
1. Prepare the /etc/dhcpd.conf file, for example:
-----------------------------------------------------------------------------
# Sample configuration file for ISCD dhcpd
#
# Don't forget to set run_dhcpd=1 in /etc/init.d/dhcpd
# once you adjusted this file and copied it to /etc/dhcpd.conf.
#
default-lease-time 21600;
max-lease-time 21600;

option subnet-mask 255.255.255.0;
option broadcast-address 192.168.0.255;

shared-network WORKSTATIONS {
    subnet 192.168.0.0 netmask 255.255.255.0 {
    }
}

group {
    use-host-decl-names on;
    option log-servers 192.168.0.254;

    host pc1 {
        hardware ethernet 00:01:02:92:70:69;
        fixed-address 192.168.0.1;
        filename "/tftpboot/pc1/vmlinuz.3c905nomodPc1";
    }
    host pc2 {
        hardware ethernet 00:01:02:91:43:0F;
        fixed-address 192.168.0.2;
        filename "/tftpboot/pc2/vmlinuz.3c905nomodPc2";
    }
    host pc3 {
        hardware ethernet 00:01:02:92:70:18;
        fixed-address 192.168.0.3;
        filename "/tftpboot/pc3/vmlinuz.3c905nomodPc3";
    }
    host pc4 {
        hardware ethernet 00:01:02:91:43:45;
        fixed-address 192.168.0.4;
        filename "/tftpboot/pc4/vmlinuz.3c905nomodPc4";
    }
}
-----------------------------------------------------------------------------
2. Edit the /etc/rc.d/init.d/dhcpd script. Find the line
daemon /usr/sbin/dhcpd
and change it to
daemon /usr/sbin/dhcpd eth1
(if the client network is on eth0, leave the line unchanged).
3. Check that the file /var/state/dhcp/dhcpd.leases exists. If it does not, create it:
touch /var/state/dhcp/dhcpd.leases
4. Add a soft link in /etc/rc.d/rc3.d:
ln -s ../init.d/dhcpd S65dhcpd
Now you can test the DHCP server by putting the netboot floppy in a client PC and turning the power on.
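Because the host entries in /etc/dhcpd.conf differ only in node number and MAC address, they can be generated instead of typed by hand. The following is a sketch, not part of the original setup: the function name dhcp_host_entry is hypothetical, and the naming scheme (pcN, 192.168.0.N, /tftpboot/pcN/...) is simply copied from the sample configuration.

```shell
#!/bin/sh
# Print one dhcpd.conf host entry per cluster node.
dhcp_host_entry() {
    # $1: node number, $2: MAC address of the client's boot NIC
    cat <<EOF
    host pc$1 {
        hardware ethernet $2;
        fixed-address 192.168.0.$1;
        filename "/tftpboot/pc$1/vmlinuz.3c905nomodPc$1";
    }
EOF
}

# Node table: "<number> <MAC>" per line (values taken from the sample file).
while read -r n mac; do
    dhcp_host_entry "$n" "$mac"
done <<'EOF'
1 00:01:02:92:70:69
2 00:01:02:91:43:0F
3 00:01:02:92:70:18
4 00:01:02:91:43:45
EOF
```

The output is meant to be pasted inside the group { ... } block; adding a node then only requires one more line in the table.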
1. Check /etc/services to make sure the following line exists:
tftp 69/udp
2. Check /etc/inetd.conf to make sure the following line is not commented out:
tftp dgram udp wait root /usr/sbin/tcpd in.tftpd
3. Restart inetd so that it reads the new configuration files.
4. The tftp daemon is invoked by inetd, so make sure /etc/hosts.allow contains the following line:
ALL: 192.168.0.
or, more specifically, the following lines:
#bootpd: 0.0.0.0 (for bootpd, uncomment this line)
in.tftpd: 192.168.0.
portmap: 192.168.0.
5. Add the host names to /etc/hosts. They must be consistent with the contents of /etc/dhcpd.conf:
-------------------------------------------------------------------------
192.168.0.1 pc1
192.168.0.2 pc2
192.168.0.3 pc3
192.168.0.4 pc4
-------------------------------------------------------------------------
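The consistency requirement between /etc/hosts and /etc/dhcpd.conf can be checked mechanically. The sketch below is one possible approach, assuming each host name and its fixed-address appear on separate lines of dhcpd.conf. The function name check_hosts is hypothetical, and two small sample files (one with a deliberate mismatch) stand in for the real ones.

```shell
#!/bin/sh
# Report hosts whose address in dhcpd.conf differs from the hosts file.
conf=$(mktemp)
hosts=$(mktemp)
trap 'rm -f "$conf" "$hosts"' EXIT

cat > "$conf" <<'EOF'
host pc1 {
    hardware ethernet 00:01:02:92:70:69;
    fixed-address 192.168.0.1;
}
host pc2 {
    hardware ethernet 00:01:02:91:43:0F;
    fixed-address 192.168.0.2;
}
EOF

# Sample hosts file; pc2 is wrong on purpose.
cat > "$hosts" <<'EOF'
192.168.0.1 pc1
192.168.0.3 pc2
EOF

check_hosts() {
    # $1: dhcpd.conf, $2: hosts file (assumed non-empty); one line per mismatch
    awk '
        NR == FNR { if ($1 !~ /^#/ && NF >= 2) ip[$2] = $1; next }
        $1 == "host" { name = $2 }
        $1 == "fixed-address" {
            addr = $2
            sub(/;$/, "", addr)
            if (!(name in ip))
                printf "%s: no entry in hosts\n", name
            else if (ip[name] != addr)
                printf "%s: dhcpd.conf has %s, hosts has %s\n", name, addr, ip[name]
        }
    ' "$2" "$1"
}

check_hosts "$conf" "$hosts"
```

An empty result means every fixed-address in dhcpd.conf has a matching line in the hosts file.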
mkdir -p /tftpboot/pc1
mkdir -p /tftpboot/pc2
mkdir -p /tftpboot/pc3
mkdir -p /tftpboot/pc4
* No module support (for simplicity).
* Support for your specific network card, for example, 3Com 3c905c.
* RAM disk support.
* BOOTP support.
* /proc filesystem support.
* NFS filesystem support.
* Root file system on NFS.
./mknbi-linux --rootdir=/tftpboot/pc$1/pc$1root /usr/src/linux/arch/i386/boot/bzImage > /tftpboot/pc$1/vmlinuz.3c905nomodPc$1
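In the command above, $1 is the node number passed to the script. To cover every node, the same command line can be produced in a loop. The sketch below only prints the commands for review; the function name make_image_cmd and the NODES and KERNEL variables are introduced here for illustration.

```shell
#!/bin/sh
# Print the mknbi-linux command line for each client node.
NODES="1 2 3 4"
KERNEL=/usr/src/linux/arch/i386/boot/bzImage

make_image_cmd() {
    # $1: node number
    echo "./mknbi-linux --rootdir=/tftpboot/pc$1/pc$1root $KERNEL > /tftpboot/pc$1/vmlinuz.3c905nomodPc$1"
}

for n in $NODES; do
    make_image_cmd "$n"
done
```

Once the printed lines look right, they can be piped to sh from the directory that contains mknbi-linux.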
Copy the server root filesystem to /tftpboot/pc1 and delete any unnecessary files or packages to reduce the size of the client root filesystem. Then modify the network setup, NFS setup, and other settings in the /tftpboot/pc1/etc directory.
Contains many HOWTOs on various aspects of the Linux OS. Further details of this article can be found here.
The Beowulf Project official site.
Boot a kernel image over an Ethernet network.