4.1 Installing Linux

If Linux isn't built into your cluster software, the first step is to decide what distribution and version of Linux you want.

4.1.1 Selecting a Distribution

This decision will depend on what clustering software you want to use. It doesn't matter what the "best" distribution of Linux (Red Hat, Debian, SUSE, Mandrake, etc.) or version (7.3, 8.0, 9.0, etc.) is in some philosophical sense if the clustering software you want to use isn't available for that choice. This book uses the Red Hat distribution because the clustering software being discussed was known to work with that distribution. This is not an endorsement of Red Hat; it was just a pragmatic decision.

Keep in mind that your users typically won't be logging onto the compute nodes to develop programs, etc., so the version of Linux used there should be largely irrelevant to the users. While users will be logging onto the head node, this is not a general-purpose server. They won't be reading email, writing memos, or playing games on this system (hopefully). Consequently, many of the reasons someone might prefer a particular distribution are irrelevant.

This same pragmatism should extend to selecting the version as well as the distribution you use. In practice, this may mean using an older version of Linux. There are basically three issues involved in using an older version-compatibility with newer hardware; bug fixes, patches, and continued support; and compatibility with clustering software.

If you are using recycled hardware, using an older version shouldn't be a problem since drivers should be readily available for your older equipment. If you are using new equipment, however, you may run into problems with older Linux releases. The best solution, of course, is to avoid this problem by planning ahead if you are buying new hardware. This is something you should be able to work around by putting together a single test system before buying the bulk of the equipment.

With older versions, many of the problems are known. For bugs, this is good news since someone else is likely to have already developed a fix or workaround. With security holes, this is bad news since exploits are probably well circulated. With an older version, you'll need to review and install all appropriate security patches. If you can isolate your cluster, this will be less of an issue.

Unfortunately, at some point you can expect support for older systems to be discontinued. However, a system will not stop working just because it isn't supported. While not desirable, this is also something you can live with.

The final and key issue is software compatibility. Keep in mind that it takes time to develop software for use with a new release, particularly if you are customizing the kernel. As a result, the clustering software you want to use may not be available for the latest version of your favorite Linux distribution. In general, software distributed as libraries (e.g., MPI) are more forgiving than software requiring kernel patches (e.g., openMosix) or software that builds kernel modules (e.g., PVFS). These latter categories, by their very nature, must be system specific. Remember that using clustering software is the raison d'être for your cluster. If you can't run it, you are out of business. Unless you are willing to port the software or compromise your standards, you may be forced to use an older version of Linux. While you may want the latest and greatest version of your favorite flavor of Linux, you need to get over it.

If at all feasible, it is best to start your cluster installation with a clean install of Linux. Of course, if you are adding clustering software to existing systems, this may not be feasible, particularly if the machines are not dedicated to the cluster. If that is the case, you'll need to tread lightly. You'll almost certainly need to make changes to these systems, changes that may not go as smoothly as you'd like. Begin by backing up and carefully documenting these systems.

4.1.2 Downloading Linux

With most flavors of Linux, there are several ways you can do the installation. Typically you can install from a set of CD-ROMs, from a hard disk partition, or over a network using NFS, FTP, or HTTP. The decision will depend in part on the hardware you have available, but for initial experimentation it is probably easiest to use CD-ROMs. Buying a boxed set can be a real convenience, particularly if it comes with a printed set of manuals. But if you are using an older version of Linux, finding a set of CD-ROMs to buy can be difficult. Fortunately, you should have no trouble finding what you need on the Internet.

Downloading is the cheapest and easiest way to go if you have a fast Internet connection and a CD-ROM burner. Typically, you download ISO images-disk images for CD-ROMs. These are basically single-file archives of everything on a CD-ROM. Since ISO images are frequently over 600 MB each and since you'll need several of them, downloading can take hours even if you have a fast connection and days if you're using a slow modem.

If you decide to go this route, follow the installation directions from your download site. These should help clarify exactly what you need and don't need and explain any other special considerations. For example, for Red Hat Linux the place to start is http://www.redhat.com/apps/download/. This will give you a link to a set of directions with links to download sites. Don't overlook the mirror sites; your download may go faster with them than with Red Hat's official download site.

For Red Hat Linux 9.0, there are seven disks. (Earlier versions of Red Hat have fewer disks.) Three of these are the installation disks and are essential. Three disks contain the source files for the packages. It is very unlikely you'll ever need these. If you do, you can download them later. The last disk is a documentation disk. You'd be foolish to skip this disk. Since the files only fill a small part of a CD, the ISO image is relatively small and the download doesn't take very long.

It is a good idea to check the MD5SUM for each ISO you download. Run the md5sum program and compare the results to published checksums.

[root@cs sloanjd]# md5sum FC2-i386-rescuecd.iso

22f4bfca5baefe89f0e04166e738639f  FC2-i386-rescuecd.iso

This will ensure both that the disk image hasn't been tampered with and that your download wasn't corrupted.

Once you have downloaded the ISO images, you'll need to burn your CD-ROMs. If you downloaded the ISO images to a Windows computer, you could use something like Roxio Easy Creator.^[1] If you already have a running Linux system, you might use X-CD-Roast.

^[1] There is an appealing irony to using Windows to download Linux.

Once you have the CD-ROMs, you can do an installation by following the appropriate directions for your software and system. Usually, this means booting to the first CD-ROM, which, in turn, runs an installation script. If you can't boot from the CD-ROM, you'll need to create a boot floppy using the directions supplied with the software. For Red Hat Linux, see the README file on the first installation disk.

4.1.3 What to Install?

What you install will depend on how you plan to use the machine. Is this a dedicated cluster? If so, users probably won't log onto individual machines, so you can get by with installing the minimal software required to run applications on each compute node. Is it a cluster of workstations that will be used in other ways? If that is the case, be sure to install X and any other appropriate applications. Will you be writing code? Don't forget the software development package and editors. Will you be recompiling the kernel? If so, you'll need the kernel sources.^[2] If you are building kernel modules, you'll need the kernel header files. (In particular, these are needed if you install PVFS. PVFS is described in Chapter 12.) A custom installation will give you the most control over what is installed, i.e., the greatest opportunity to install software that you don't need and omit that which you do need.

^[2] In general, you should avoid recompiling the kernel unless it is absolutely necessary. While you may be able to eke out some modest performance gains, they are rarely worth the effort.

Keep in mind that you can go back and add software. You aren't trapped by what you include at this point. At this stage, the important thing is to remember what you actually did. Take careful notes and create a checklist as you proceed. The quickest way to get started is to take a minimalist approach and add anything you need later, but some people find it very annoying to have to go back and add software. If you have the extra disk space (2 GB or so), then you may want to copy all the packages to a directory on your server. Not having to mount disks and search for packages greatly simplifies adding packages as needed. You only need to do this with one system and it really doesn't take that long. Once you have worked out the details, you can create a Kickstart configuration file to automate all this. Kickstart is described in more detail in Chapter 8.

Table of Contents