2.4 Cluster Kits

If installing all of this software sounds daunting, don't panic. There are a couple of options you can consider. For permanent clusters there are, for lack of a better name, cluster kits, software packages that automate the installation process. A cluster kit provides all the software you are likely to need in a single distribution.

Cluster kits tend to be very complete. For example, the OSCAR distribution contains both PVM and two versions of MPI. If some software isn't included, you can probably get by without it. Another option, described in the next section, is a CD-ROM-based cluster.

Cluster kits are designed to be turnkey solutions. Short of purchasing a prebuilt, preinstalled proprietary cluster, a cluster kit is the simplest approach to setting up a full cluster. Configuration parameters are largely preset by people who are familiar with the software and how the different pieces may interact. Once you have installed the kit, you have a functioning cluster. You can focus on using the software rather than installing it. Support groups and mailing lists are generally available.

Some kits have a Linux distribution included in the package (e.g., Rocks), while others are installed on top of an existing Linux installation (e.g., OSCAR). Even if Linux must be installed first, most of the configuration and the installation of needed packages will be done for you.

There are two problems with using cluster kits. First, cluster kits do so much for you that you can lose touch with your cluster, particularly if everything is new to you. Initially, you may not understand how the cluster is configured, what customizations have been made or are possible, or even what has been installed. Even making minor changes after installing a kit can create problems if you don't understand what you have. Ironically, the more these kits do for you, the worse this problem may be. With a kit, you may get software you don't want to deal with, software your users may expect you to maintain and support. And when something goes wrong, as it will, you may be at a loss about how to deal with it.

A second problem is that, in making everything work together, kit builders occasionally have to do things a little differently. So when you look at the original documentation for the individual components in a kit, you may find that the software hasn't been installed as described. When you learn more about the software, you'll come to understand and appreciate why the changes were made. But in the short term, these changes can add to the confusion.

So while a cluster kit can get you up and running quickly, you will still need to learn the details of the individual software. You should follow up the installation with a thorough study of how the individual pieces in the kit work. For most beginners, the single advantage of being able to get a cluster up and running quickly probably outweighs all of the disadvantages.

While other cluster kits are available, the three most common kits for Linux clusters are NPACI Rocks, OSCAR, and Scyld Beowulf.^[1] While Scyld Beowulf is a commercial product available from Penguin Computing, an earlier, unsupported version is available for a very nominal cost from http://www.linuxcentral.com/. Donald Becker, one of the original Beowulf developers, founded Scyld Computing, which was subsequently acquired by Penguin Computing. Scyld is built on top of Red Hat Linux and includes an enhanced kernel, tools, and utilities. While Scyld Beowulf is a solid system, you face the choice of using an expensive commercial product or a somewhat dated, unsupported product. Furthermore, variants of both Rocks and OSCAR are available. For example, BioBrew (http://bioinformatics.org/biobrew/) is a Rocks-based system that contains a number of packages for analyzing bioinformatics information. For these reasons, either Rocks or OSCAR is arguably a better choice than Scyld Beowulf.

^[1] For grid computing, which is outside the scope of this book, the Globus Toolkit is a likely choice.

NPACI (National Partnership for Advanced Computational Infrastructure) Rocks is a collection of open source software for creating a cluster built on top of Red Hat Linux. Rocks takes a cookie-cutter approach. To install Rocks, begin by downloading a set of ISO images from http://rocks.npaci.edu/Rocks/ and use them to create installation CD-ROMs. Next, boot to the first CD-ROM and answer a few questions as the cluster is built. Both Linux and the clustering software are installed. (This is a mixed blessing: it simplifies the installation, but you won't have any control over how Linux is installed.) The installation should go very quickly. In fact, part of Rocks' management strategy is that, if you have problems with a node, the best solution is to reinstall the node rather than try to diagnose and fix the problem. Depending on hardware, it may be possible to reinstall a node in under 10 minutes. When a Rocks installation goes as expected, you can be up and running in a very short amount of time. However, because the installation of the cluster software is tied to the installation of the operating system, if the installation fails, you can be left staring at a dead system with little idea of what to do. Fortunately, this rarely happens.

OSCAR, from the Open Cluster Group, uses a different installation strategy. With OSCAR, you first install Linux (but only on the head node) and then install OSCAR; the two installations are separate. This makes the installation more involved, but it gives you more control over the configuration of your system, and it is somewhat easier (that's easier, not easy) to recover when you encounter installation problems. And because the OSCAR installation is separate from the Linux installation, you are not tied to a single Linux distribution.

Rocks uses a variant of Red Hat's Anaconda and Kickstart programs to install the compute nodes. Thus, Rocks is able to probe the system to see what hardware is present. To be included in Rocks, software must be available as an RPM and configuration must be entirely automatic. As a result, with Rocks it is very straightforward to set up a cluster using heterogeneous hardware. OSCAR, in contrast, uses a system image cloning strategy to distribute the disk image to the compute nodes. With OSCAR it is best to use the same hardware throughout your cluster. Rocks requires systems with hard disks. Although not discussed in this book, OSCAR's thin client model is designed for diskless systems.

Both Rocks and OSCAR include a variety of software and build complete clusters. In fact, most of the core software is the same for both OSCAR and Rocks. However, there are a few packages that are available for one but not the other. For example, Condor is readily available for Rocks while LAM/MPI is included in OSCAR.

Clearly, Rocks and OSCAR take orthogonal approaches to building clusters. Cluster kits are difficult to build. OSCAR scales well over Linux distributions. Rocks scales well with heterogeneous hardware. No one approach is better in every situation.

Rocks and OSCAR are at the core of this book. The installation, configuration, and use of OSCAR are described in detail in Chapter 6. The installation, configuration, and use of Rocks are described in Chapter 7. Rocks and OSCAR heavily influenced the selection of the individual tools described in this book. Most of the software described in this book is included in Rocks and OSCAR or is compatible with them. However, to keep the discussions of different software clean, the book includes separate chapters for the various software packages included in Rocks and OSCAR.

This book also describes many of the customizations made by these kits. At the end of many of the chapters, there is a brief section for Rocks and OSCAR users summarizing the differences between the default, standalone installation of the software and how these kits install it. With luck, this book therefore addresses both of the potential difficulties you might encounter with a cluster kit: learning the details of the software and discovering the differences that cluster kits introduce.

Putting aside other constraints such as the need for diskless systems or heterogeneous hardware, if all goes well, a novice can probably build a Rocks cluster a little faster than an OSCAR cluster. But if you want greater control over how your cluster is configured, you may be happier with OSCAR in the long run. Typically, OSCAR provides better documentation, although Rocks documentation has been improving. You shouldn't go far wrong with either.
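The trade-offs discussed in this section can be summarized as a small decision sketch. The Python below is only an illustration of the reasoning, not a tool shipped with either kit; the names Requirements and suggest_kit and the order in which the criteria are checked are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical description of what a site needs from its cluster."""
    diskless_nodes: bool = False          # compute nodes have no hard disks
    other_distribution: bool = False      # must run a Linux other than the one Rocks installs
    heterogeneous_hardware: bool = False  # compute nodes differ from one another
    wants_fine_control: bool = False      # administrator wants control over configuration


def suggest_kit(req):
    """Return (kit_name, reasons) following the rules of thumb in this section.

    Constraints that effectively rule out Rocks are checked first; the
    remaining preferences are checked in an order chosen for this sketch.
    """
    reasons = []
    if req.diskless_nodes:
        reasons.append("Rocks requires hard disks; OSCAR's thin client "
                       "model is designed for diskless systems")
    if req.other_distribution:
        reasons.append("OSCAR is installed separately from Linux, so it "
                       "is not tied to a single distribution")
    if reasons:
        return "OSCAR", reasons

    if req.heterogeneous_hardware:
        return "Rocks", ["Rocks probes each node's hardware at install "
                         "time; OSCAR clones a single disk image"]
    if req.wants_fine_control:
        return "OSCAR", ["installing Linux and OSCAR separately gives "
                         "more control over configuration"]
    return "Rocks", ["if all goes well, a novice can probably build a "
                     "Rocks cluster a little faster"]


if __name__ == "__main__":
    kit, why = suggest_kit(Requirements(heterogeneous_hardware=True))
    print(kit)
    for reason in why:
        print(" -", reason)
```

A real decision will weigh more factors than these four flags, such as available documentation and local experience, so treat the result as a starting point rather than a verdict.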
