
9.2 Selecting a Library

Those of you who do your own dentistry will probably want to program your parallel applications from scratch. It is certainly possible to develop your code with little more than a good compiler. You could manually set up communication channels among processes using standard system calls.[2]

[2] In fairness, there may be some very rare occasions where efficiency concerns might dictate this approach.

The rest of you will probably prefer to use libraries designed to simplify parallel programming. This really comes down to two choices: the Parallel Virtual Machine (PVM) library or the Message Passing Interface (MPI) library. Work on PVM began in 1989 and continued into the early '90s as a joint effort among Oak Ridge National Laboratory, the University of Tennessee, Emory University, and Carnegie Mellon University. An implementation of PVM is available from http://www.netlib.org/pvm3/. It provides both libraries and tools based on a message-passing model.

Without getting into a philosophical discussion, MPI is the newer standard and now seems to be the one most users prefer. For this reason, this book will focus on MPI. However, both PVM and MPI are solid, robust approaches that will meet most users' needs; you won't go too far wrong with either. OSCAR, you will recall, installs both PVM and MPI.

MPI is an API for parallel programming based on a message-passing model. MPI processes execute in parallel, and each process has a separate address space. A sending process specifies the data to be sent and the destination process; the receiving process specifies a memory buffer for the message, the identity of the expected source, and so on.
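To make the model concrete, here is a minimal sketch in C (the program, the value it passes, and the message tag are invented for illustration). Process 0 sends a single integer to process 1, which receives it into a local buffer; run it with at least two processes.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, value;
        MPI_Status status;

        MPI_Init(&argc, &argv);                  /* start MPI for this process */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which process am I? */

        if (rank == 0) {
            value = 42;
            /* the sender names the data (one int), the destination (rank 1), and a tag */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* the receiver names a buffer, the expected source (rank 0), and the tag */
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            printf("Process 1 received %d from process 0\n", value);
        }

        MPI_Finalize();                          /* shut down MPI */
        return 0;
    }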

Primarily, MPI can be thought of as a standard that specifies a library. The library implements a predefined set of function calls to send and receive messages among collaborating processes on the different machines in the cluster. You write your code in C, C++, or FORTRAN using these functions, compile it with a standard compiler, and link the completed code to the MPI library.
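In practice, most MPI implementations (including the two discussed below) supply compiler wrappers, typically called mpicc for C, that add the appropriate include and library options for you. A build might look something like the following; the file names are invented here, and the underlying paths and flags vary from installation to installation.

    $ mpicc -o greetings greetings.c
    $ # roughly what the wrapper does behind the scenes, give or take the paths:
    $ # cc -o greetings greetings.c -I/usr/local/mpi/include -L/usr/local/mpi/lib -lmpi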

The MPI specification was developed by the MPI Forum, a collaborative effort with support from both academia and industry. It is suitable for both small clusters and "big-iron" installations, and it was designed with functionality, portability, and efficiency in mind. Its well-designed set of function calls covers a wide range of functionality that can be implemented efficiently. And because the standard is clearly defined, it can be implemented on a variety of architectures, allowing code to move easily among machines.

MPI has gone through a couple of revisions since it was introduced in the early '90s. Currently, people talk of MPI-1 (typically meaning Version 1.2) and MPI-2. MPI-1 should provide for most of your basic needs, while MPI-2 provides enhancements.

While there are several different implementations of MPI, two are widely used: LAM/MPI and MPICH. Both go beyond simply providing a library; each includes programming and runtime environments with mechanisms to run programs across the cluster. Both are widely used, robust, well supported, and freely available, and excellent documentation comes with each. Both provide all of MPI-1 and considerable portions of MPI-2, including ROMIO, Argonne National Laboratory's freely available high-performance I/O system. (For more information on ROMIO, visit http://www.mcs.anl.gov/romio.) At this time, neither is totally thread-safe. While there are differences, if you are just getting started, you should do well with either product. And since both are easy to install, with very little extra work you can install both.
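As a rough sketch of what these runtime environments look like in practice (the boot schema file name and process count below are invented): with LAM/MPI you first boot the LAM daemons on your nodes, then launch your program with mpirun; with MPICH, a machines file and mpirun alone are typically enough.

    $ lamboot hostfile            # LAM/MPI only: start the LAM daemons on each node
    $ mpirun -np 4 ./greetings    # launch four copies of the program across the cluster
    $ lamhalt                     # LAM/MPI only: shut the daemons down when finished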
