9.3 LAM/MPI

The Local Area Multicomputer/Message Passing Interface (LAM/MPI) was originally developed by the Ohio Supercomputer Center. It is now maintained by the Open Systems Laboratory at Indiana University. As previously noted, LAM/MPI (or LAM for short) is both an MPI library and an execution environment. LAM was designed around an extensible component framework known as the System Services Interface (SSI), which is one of its major strengths, although the details of SSI are beyond the scope of this book. LAM works well in a wide variety of environments and supports several methods of interprocess communication using TCP/IP. It will run on most Unix machines (but not Windows). New releases are tested with both Red Hat and Mandrake Linux.

Documentation can be downloaded from the LAM site, http://www.lam-mpi.org/. There are also tutorials, a FAQ, and archived mailing lists. This chapter provides an overview of the installation process and a description of how to use LAM. For more up-to-date and detailed information, you should consult the LAM/MPI Installation Guide and the LAM/MPI User's Guide.

9.3.1 Installing LAM/MPI

You have two basic choices when installing LAM. You can download and install a Red Hat package, or you can download the source and compile it. The package approach is very quick, easy to automate, and uses somewhat less space. If you have a small cluster and are manually installing the software, it will be a lot easier to use packages. Installing from the source will allow you to customize the installation, that is, to select which features are enabled and determine where the software is installed. It is probably a bad idea to mix installations since you could easily end up with different versions of the software, something you'll definitely want to avoid.

Installing from a package works just as you'd expect: download the package from http://www.lam-mpi.org/ and install it with rpm, as you would any Red Hat package.

The files will be installed under the /usr directory. The space used is minimal. You can use the laminfo command to see the details of the installation, including the compiler bindings and which modules are installed.

If you need more control over the installation, you'll want to do a manual install: fetch the source, compile, install, and configure. The manual installation is only slightly more involved. However, it does take considerably longer, something to keep in mind if you'll be repeating the installation on each machine in your cluster. But if you are building an image, this is a one-time task. The installation requires a POSIX-compliant operating system; an appropriate compiler (e.g., the GNU 2.95 compiler suite); utilities such as sed, grep, and awk; and a modern make. You should have no problem with most versions of Linux.

First, you'll need to decide where to put everything, a crucial step if you are installing more than one version of MPI. If care isn't taken, you may find that part of an installation has been overwritten. In this example, the source files are saved in /usr/local/src/lam-7.0.6 and the installed code in /usr/local/lam-7.0.6. Begin by downloading the appropriate file from http://www.lam-mpi.org/ to /usr/local/src. Next, uncompress and unpack the file.

You'll see a lot of files stream by as the source is unpacked. If you want to capture this output, you can tee it to a log file. Just append | tee tar.log to the end of the line and the output will be copied to the file tar.log. You can do something similar with subsequent commands.
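
The unpack-and-log step might look like the following sketch. The helper name unpack_logged is made up for illustration, and the archive it unpacks is a tiny stand-in created on the spot so that the sketch runs anywhere. On a real system you would change to /usr/local/src and pass the tarball you downloaded (assumed here to be named lam-7.0.6.tar.gz).

```shell
# Unpack a gzipped tarball verbosely and copy the file listing to a log.
# 2>&1 sends any error messages to the log as well.
# Usage: unpack_logged ARCHIVE LOGFILE   (hypothetical helper)
unpack_logged() {
    tar -xvzf "$1" 2>&1 | tee "$2"
}

# Build a stand-in archive in a scratch directory for the demonstration.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir lam-demo
echo "placeholder" > lam-demo/README
tar -czf lam-demo.tar.gz lam-demo
rm -r lam-demo

# The real call would be: unpack_logged lam-7.0.6.tar.gz tar.log
unpack_logged lam-demo.tar.gz tar.log
```

Afterward, tar.log holds the same list of files that scrolled past on the screen.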

Next, create the directory where the executables will be installed and configure the code specifying that directory with the --prefix option. You may also include any other options you desire. The example uses a configuration option to specify SSH as well. (You could also set this through the environment variable LAMRSH rather than compiling it into the code, which is what you must do if you use a package installation.)
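
As a rough sketch of this step, the function below creates the install directory and runs configure with the prefix used in this section. The function name configure_lam is hypothetical, and the value "ssh -x" for --with-rsh is an assumption about how you want SSH invoked; check the Installation Guide for the options your release accepts.

```shell
LAM_SRC=/usr/local/src/lam-7.0.6
LAM_PREFIX=/usr/local/lam-7.0.6

# Create the install directory and configure the source tree (run as root).
# The subshell keeps the cd from changing your current directory.
configure_lam() {
    (
        mkdir -p "$LAM_PREFIX" &&
        cd "$LAM_SRC" &&
        ./configure --prefix="$LAM_PREFIX" --with-rsh="ssh -x" 2>&1 | tee configure.log
    )
}

# Alternative for package installs: select the remote shell at run time.
LAMRSH="ssh -x"
export LAMRSH
```

Call configure_lam once the source has been unpacked. Setting LAMRSH belongs in each user's shell startup file rather than in the build procedure.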

If you don't have a FORTRAN compiler, you'll need to add --without-fc to the configure command. A description of other configuration options can be found in the documentation. However, the defaults are quite reasonable and will be adequate for most users. Also, if you aren't using the GNU compilers, you need to set and export compiler variables. The documentation advises that you build LAM/MPI with the same compiler that you'll later use to compile your MPI programs.
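
One way to decide whether --without-fc is needed is to look for a FORTRAN compiler before running configure. The sketch below only assembles and prints the configure command. The list of compiler names it checks is an assumption, as are the placeholder compiler names in the comment, so adjust both to match your system.

```shell
# FORTRAN compilers to look for; this list is an assumption.
fortran_candidates="g77 f77 gfortran"

configure_opts="--prefix=/usr/local/lam-7.0.6"
have_fortran=no
for fc in $fortran_candidates; do
    if command -v "$fc" >/dev/null 2>&1; then
        have_fortran=yes
        break
    fi
done

# No FORTRAN compiler found: build LAM without the FORTRAN bindings.
if [ "$have_fortran" = no ]; then
    configure_opts="$configure_opts --without-fc"
fi

# With non-GNU compilers, set and export the compiler variables first, e.g.:
#   CC=cc CXX=CC FC=f77; export CC CXX FC    (compiler names are placeholders)
echo "./configure $configure_opts"
```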

You'll see a lot of output with these commands, but all should go well. You may also want to make the examples and clean up afterwards.
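
The build and install steps, along with the optional examples, can be sketched as two small functions. The function names are hypothetical; the make targets are the ones named in this section.

```shell
LAM_SRC=/usr/local/src/lam-7.0.6

# Compile and install, stopping at the first step that fails.
build_lam() {
    ( cd "$LAM_SRC" && make && make install )
}

# Optional: build the example programs, then clean up the build tree.
build_lam_examples() {
    ( cd "$LAM_SRC" && make examples && make clean )
}
```

Run build_lam as root after configure has finished, and build_lam_examples afterward if you want the examples.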

Again, expect a lot of output. You only need to make the examples on the cluster head. Congratulations, you've just installed LAM/MPI. You can verify the settings and options with the laminfo command.

9.3.2 User Configuration

Before you can use LAM, you'll need to do a few more things. First, you'll need to create a host file or schema, which is basically a file that contains a list of the machines in your cluster that will participate in the computation. In its simplest form, it is just a text file with one machine name per line. If you have multiple CPUs on a host, you can repeat the host name or you can append a CPU count to a line in the form cpu=n, where n is the number of CPUs. However, you should realize that the actual process scheduling on the node is left to the operating system. If you need to change identities when logging into a machine, it is possible to specify that username for a machine in the schema file, e.g., user=smith. You can create as many different schemas as you want and can put them anywhere on the system. If you have multiple users, you'll probably want to put the schema in a public directory, for example, /etc/lamhosts.
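
To make the schema format concrete, the sketch below writes a small sample schema and counts the CPUs it describes. The host names are invented, and the file is written to a temporary location rather than /etc/lamhosts so that nothing on your system is overwritten.

```shell
# Write a sample schema; every host name here is made up for illustration.
schema=$(mktemp)
cat > "$schema" <<'EOF'
# head node first, then the compute nodes
head.example.com
node1.example.com cpu=2
node2.example.com cpu=2 user=smith
node3.example.com
node3.example.com
EOF

# Count the CPUs described: a line without cpu=n counts as one CPU,
# and a repeated host name simply contributes another line.
total_cpus=$(awk '
    NF == 0 || $1 ~ /^#/ { next }      # skip blank lines and comments
    {
        n = 1
        for (i = 2; i <= NF; i++)
            if ($i ~ /^cpu=[0-9]+$/) n = substr($i, 5) + 0
        total += n
    }
    END { print total + 0 }
' "$schema")

echo "schema $schema describes $total_cpus CPUs"
```

Here node3 is listed twice, which has the same effect as writing cpu=2 on a single line.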

You'll also want to set your $PATH variable to include the LAM executables, which can be trickier than it might seem. If you are installing both LAM/MPI and MPICH, there are several programs (e.g., mpirun and mpicc) that have the same name with both systems, and you need to be able to distinguish between them. While you could rename these programs for one of the packages, that is not a good idea. It will confuse your users and be a nuisance when you upgrade software. Since it is unlikely that an individual user will want to use both packages, the typical approach is to set the path to include one but not the other. Of course, as the system administrator, you'll want to test both, so you'll need to be able to switch back and forth. OSCAR's solution to this problem is a package called switcher that allows a user to easily change between two configurations. The switcher package is described in Chapter 6.

A second issue is making sure the path is set properly for both interactive and noninteractive or non-login shells. (The path you want to add is /usr/local/lam-7.0.6/bin if you are using the same directory layout used here.) The processes that run on the compute nodes are run in noninteractive shells. This can be particularly confusing for bash users. With bash, if the path is set in .bash_profile and not in .bashrc, you'll be able to log onto each individual system and run the appropriate programs, but you won't be able to run the programs remotely. Until you realize what is going on, this can be a frustrating problem to debug. So, if you use bash, don't forget to set your path in .bashrc. (And while you are setting paths, don't forget to add the manpages as well, e.g., /usr/local/lam-7.0.6/man.)
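
For bash users, the lines to add to .bashrc might look like this minimal sketch. The variable name LAM_HOME is arbitrary, and the directories assume the layout used in this section.

```shell
# Lines for ~/.bashrc (not only ~/.bash_profile), so that noninteractive
# shells started over SSH also find the LAM programs and manpages.
LAM_HOME=/usr/local/lam-7.0.6        # arbitrary variable name
PATH="$LAM_HOME/bin:$PATH"
MANPATH="$LAM_HOME/man:${MANPATH:-}"
export PATH MANPATH
```

To confirm that a noninteractive shell picks up the change, run something like ssh node1 'echo $PATH' from the head node, where node1 is a placeholder for one of your compute nodes.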

It should be downhill from here. Make sure you have ssh-agent running and that you can log onto other machines without a password. Setting up and using SSH is described in Chapter 4. You'll also need to ensure that there is no output to stderr whenever you log in using SSH. (When LAM sees output to stderr, it thinks something bad is happening and aborts.) Since you'll get a warning message the first time you log into a system with SSH as it adds the remote machine to the known hosts, often the easiest thing to do (provided you don't have too many machines in the cluster) is to manually log into each machine once to get past this problem. You'll only need to do this once. The recon command, described in the subsection on testing, can alert you to some of these problems.
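
If you would rather check for stray stderr output than discover it when LAM aborts, a sketch along the following lines may help. Both function names are hypothetical. The BatchMode option is used on the assumption that you want SSH to fail rather than prompt when a key or password is missing.

```shell
# Print only what a command writes to stderr; stdout is discarded.
stderr_of() {
    "$@" 2>&1 >/dev/null
}

# Report each host whose SSH login is not silent on stderr.
# Usage: check_quiet_ssh HOST...
check_quiet_ssh() {
    for host in "$@"; do
        noise=$(stderr_of ssh -x -o BatchMode=yes "$host" true)
        if [ -n "$noise" ]; then
            echo "$host: $noise"
        fi
    done
}
```

Calling check_quiet_ssh with the names from your schema should print nothing once every machine is in your known hosts file and the logins are clean.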

Also, the directory /tmp must be writable. Don't forget to turn off or reconfigure your firewall as needed.

9.3.3 Using LAM/MPI

Using LAM involves the following basic steps:

Boot the runtime system with lamboot.
Write and compile a program with the appropriate compiler, e.g., mpicc.^[3]
Execute the code with the mpirun command.
Clean up any crashed processes with lamclean if things didn't go well.
Shut down the runtime system with the command lamhalt.

^[3] Actually, you don't need to boot the system to compile code.

In order to use LAM, you will need to launch the runtime environment. This is referred to as booting LAM and is done with the lamboot command. Basically, lamboot starts the lamd daemon, the message server, on each machine.

Since there are considerable security issues in running lamboot as root, it is configured so that it will not run if you try to start it as root.
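
A small wrapper can make the root restriction explicit before lamboot is even tried. This is only a sketch: the function name start_lam is hypothetical, and /etc/lamhosts is the example schema location suggested earlier, not a LAM default.

```shell
# Boot LAM on the machines listed in SCHEMA, refusing to try as root.
# Usage: start_lam [SCHEMA]
start_lam() {
    if [ "$(id -u)" -eq 0 ]; then
        echo "lamboot will not run as root; use an ordinary account" >&2
        return 1
    fi
    lamboot -v "${1:-/etc/lamhosts}"
}
```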

As noted above, you must be able to log onto the remote systems without a password and without any error messages. (If this command doesn't work the first time, you might try it a couple more times to clear out any one-time error messages.) If you don't want to see the list of nodes, leave out the -v. You can always use the lamnodes command to list the nodes later if you wish.

You'll only need to boot the system once at the beginning of the session. It will remain loaded until you halt it or log out. (Also, you can omit the schema and just use the local machine. Your code will run only on the local node, but this can be useful for initial testing.)

Once you have entered your program using your favorite editor, the next step is to compile and link the program. You could do this directly by typing in all the compile options you'll need. But it is much simpler to use one of the wrapper programs supplied with LAM. The programs mpicc, mpiCC, and mpif77 will respectively invoke the C, C++, and FORTRAN 77 compilers on your system, supplying the appropriate command-line arguments for LAM. For example, you might enter something like the following:
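
The sketch below writes a minimal MPI program to a scratch directory and compiles it with mpicc if the wrapper is on your path. The program is a stand-in written for this illustration, not the hello.c that ships with LAM.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A minimal MPI program: each process reports its rank and the total count.
cat > hello.c <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile and link with LAM's C wrapper.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -o hello hello.c
else
    echo "mpicc not found; add the LAM bin directory to your PATH"
fi
```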

(hello.c is one of the examples that comes with LAM and can be found in /usr/local/src/lam-7.0.6/examples/hello if you use the same directory structure used here to set up LAM.) If you want to see which arguments are being passed to the compiler, you can use the -showme argument. For example,
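
A sketch of the -showme form, wrapped in a hypothetical helper so that the arguments you would normally give mpicc are passed straight through:

```shell
# Print the underlying compiler command without compiling anything.
# Usage: show_compile_line MPICC-ARGUMENTS...   (hypothetical helper)
show_compile_line() {
    mpicc -showme "$@"
}
```

For instance, show_compile_line -o hello hello.c is the same as typing mpicc -showme -o hello hello.c. The exact compiler line printed depends on how LAM was configured.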

With -showme, the program isn't compiled; you just see the arguments that would have been used had it been compiled. Any other arguments that you include in the call to mpicc are passed on to the underlying compiler unchanged. In general, you should avoid using the -g (debug) option when it isn't needed because of the overhead it adds.

To compile the program, rerun the last command without -showme if you haven't done so. You now have an executable program. Run the program with the mpirun command. Basically, mpirun communicates with the remote LAM daemon to fork a new process, set environment variables, redirect I/O, and execute the user's command. Here is an example:
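
A minimal sketch of the run step follows. The helper name run_mpi is hypothetical, and the default of four processes is chosen only to match the discussion in this section.

```shell
# Run PROGRAM as N processes under LAM (N defaults to 4).
# Usage: run_mpi PROGRAM [N]
run_mpi() {
    mpirun -np "${2:-4}" "$1"
}
```

With LAM booted, run_mpi hello is equivalent to typing mpirun -np 4 hello.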

As shown in this example, the argument -np 4 specifies that four processes be used when running the program. If more machines are available, only four will be used. If fewer machines are available, some machines will be used more than once.

Of course, you'll need the executable on each machine. If you're using NFS to mount your home directories, this has already been taken care of if you are working in that directory. You should also remember that mpirun can be run on a single machine, which can be helpful when you want to test code away from a cluster.

If a program crashes, there may be extraneous processes running on remote machines. You can clean these up with the lamclean command. This is a command you'll use only when you are having problems. Try lamclean first and if it hangs, you can escalate to wipe. Rerun lamboot after using wipe. This isn't necessary with lamclean. Both lamclean and wipe take a -v for verbose output.
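
The escalation described above could be scripted roughly as follows. This sketch makes several assumptions: it relies on the timeout utility from GNU coreutils to detect a hang, the 60-second limit is arbitrary, the function name recover_lam is hypothetical, and a lamclean failure is treated the same as a hang.

```shell
# Try lamclean first; fall back to wipe and a fresh lamboot if needed.
# Usage: recover_lam [SCHEMA]
recover_lam() {
    schema=${1:-/etc/lamhosts}
    # Give lamclean 60 seconds (an arbitrary limit) before giving up on it.
    if timeout 60 lamclean -v; then
        return 0
    fi
    # After wipe, the runtime must be booted again.
    wipe -v "$schema" && lamboot -v "$schema"
}
```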

Once you are done, you can shut down LAM with the lamhalt command, which kills the lamd daemon on each machine. If you wish, you can use -v for verbose output. Two other useful LAM commands are mpitask and mpimsg, which are used to monitor processes across the cluster and to monitor the message buffer, respectively.

9.3.4 Testing the Installation

LAM comes with a set of examples, tests, and tools that you can use to verify that it is properly installed and runs correctly. We'll start with the simplest tests first.

The recon tool verifies that LAM will boot properly. recon is not a complete test, but it confirms that the user can execute commands on the remote machine, and that the LAM executables can be found and executed.
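
A sketch of a recon check against a schema, using a hypothetical helper name and the example schema location from earlier in this section:

```shell
# Verify that LAM can be started on every machine listed in SCHEMA.
# Usage: check_schema [SCHEMA]
check_schema() {
    recon -v "${1:-/etc/lamhosts}"
}
```

Because recon does not need a booted runtime, this check can be run before lamboot.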

Since lamboot is required to run the next tests, you'll need to run these tests as a non-privileged user. Once you have booted LAM, you can use the tping command to check basic connectivity. tping is similar to ping but uses the LAM echo server. This confirms both that the network is reachable and that the LAM daemon is listening. For example, the following command sends two one-byte packets to the first three machines in your cluster.
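
The tping call might be wrapped as in this sketch. The helper name is hypothetical. The node specification n0-2 is LAM's notation for nodes 0 through 2, and the -c and -l options set the packet count and length as understood from the tping manpage, so check your installed version.

```shell
# Send two one-byte messages to a range of LAM nodes (default: n0-2).
# Usage: ping_lam_nodes [NODES]
ping_lam_nodes() {
    tping -c 2 -l 1 "${1:-n0-2}"
}
```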

The LAM test suite is the most comprehensive way to test your system. It can be used to confirm that you have a complete and correct installation. Download the test suite that corresponds to your installation and then uncompress and unpack it.

This creates the directory lamtests-7.0.6 with the tests and a set of directions in the file README. Next, you should start LAM with lamboot if you haven't already done so. Then change to the test directory and run configure.
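
As a sketch, configuring and running the whole suite might look like the function below. The make invocation is an assumption: make -k check is believed to be what the suite's README calls for, with -k telling make to keep going past individual failures, but follow the README that comes with your copy. The function name and the log file name make.out are hypothetical.

```shell
# Configure and run the LAM test suite, keeping a log of the output.
# Usage: run_lam_tests [TESTDIR]
run_lam_tests() {
    (
        cd "${1:-lamtests-7.0.6}" &&
        ./configure &&
        make -k check 2>&1 | tee make.out
    )
}
```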

You'll see lots of output scroll past. Don't be concerned about an occasional error message while it is running. What you want is a clean bill of health when it is finally done. You can run specific tests in the test suite by changing into the appropriate subdirectory and running make.