13.5 Broadcast Communications
In this
section, we will further improve the efficiency of our code by
introducing two new MPI functions. In the process,
we'll reduce the amount of code we have to work
with.
13.5.1 Broadcast Functions
If you look back to the last solution, you'll notice
that the parameters are sent individually to each process one at a
time even though each process is receiving the same information. For
example, if you are using 10 processes, while process 0 communicates
with process 1, processes 2 through 9 are idle. While process 0
communicates with process 2, processes 3 through 9 are still idle.
And so on. This may not be a big problem with a half dozen processes,
but if you are running on 1,000 machines, this can result in a lot of
wasted time. Fortunately, MPI provides an alternative,
MPI_Bcast.
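To see what this pattern costs, here is a minimal sketch of the
point-to-point approach just described. This is an illustration only,
not the previous section's exact code; the variable names mirror the
integration example below, and the value 1000 is just a stand-in. It
distributes a single integer from process 0 to every other process,
one send at a time:
#include "mpi.h"
#include <stdio.h>
int main(int argc, char *argv[])
{
    int noProcesses, processId, dest;
    int numberRects = 1000;   /* stand-in for a real parameter */
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &noProcesses);
    MPI_Comm_rank(MPI_COMM_WORLD, &processId);
    if (processId == 0) {
        /* rank 0 talks to one process at a time; all others sit idle */
        for (dest = 1; dest < noProcesses; dest++)
            MPI_Send(&numberRects, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
    } else {
        /* every other rank waits its turn to receive the same value */
        MPI_Recv(&numberRects, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
    }
    MPI_Finalize();
    return 0;
}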
13.5.1.1 MPI_Bcast
MPI_Bcast provides a mechanism to distribute
the same information among a communication group or communicator.
MPI_Bcast takes five arguments. The first three
define the data to be transmitted. The first argument is the buffer
that contains the data; the second argument is the number of items in
the buffer; and the third argument, the data type. (The supported
data types are the same as with MPI_Send, etc.)
The next argument is the rank of the process that is generating the
broadcast, sometimes called the root of the broadcast. In our
example, this is 0, but this isn't a requirement.
All processes use identical calls to MPI_Bcast. By
comparing its own rank to the root rank specified in the call, each
process can determine whether it is sending or receiving data. Consequently,
there is no need for any additional control structures with
MPI_Bcast. The final argument is the communicator,
which effectively defines which processes will participate in the
broadcast. When the call returns, the data in the
root's communications buffer will have been copied
to each of the remaining processes in the communicator.
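For reference, MPI_Bcast has the following prototype; the five
arguments appear in the order just described:
int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype,
              int root, MPI_Comm comm);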
Here is our numerical integration code using
MPI_Bcast (and MPI_Reduce, a
function we will discuss next).
#include "mpi.h"
#include <stdio.h>
/* problem parameters */
#define f(x) ((x) * (x))
int main(int argc, char *argv[])
{
    /* MPI variables */
    int noProcesses, processId;
    /* problem variables */
    int i, numberRects;
    double area, at, height, lower, width, total, range;
    double lowerLimit, upperLimit;
    /* MPI setup */
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &noProcesses);
    MPI_Comm_rank(MPI_COMM_WORLD, &processId);
    if (processId == 0)   /* if rank is 0, collect parameters */
    {
        fprintf(stderr, "Enter number of steps:\n");
        scanf("%d", &numberRects);
        fprintf(stderr, "Enter low end of interval:\n");
        scanf("%lf", &lowerLimit);
        fprintf(stderr, "Enter high end of interval:\n");
        scanf("%lf", &upperLimit);
    }
    MPI_Bcast(&numberRects, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Bcast(&lowerLimit, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(&upperLimit, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    /* adjust problem size for subproblem */
    range = (upperLimit - lowerLimit) / noProcesses;
    width = range / numberRects;
    lower = lowerLimit + range * processId;
    /* calculate area for subproblem */
    area = 0.0;
    for (i = 0; i < numberRects; i++)
    {
        at = lower + i * width + width / 2.0;
        height = f(at);
        area = area + width * height;
    }
    MPI_Reduce(&area, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    /* collect information and print results */
    if (processId == 0)   /* if rank is 0, print results */
    {
        fprintf(stderr, "The area from %f to %f is: %f\n",
                lowerLimit, upperLimit, total);
    }
    /* finish */
    MPI_Finalize();
    return 0;
}
Notice that we have eliminated the control structures as well as the
need for separate MPI_Send and
MPI_Recv calls.
13.5.1.2 MPI_Reduce
You'll also notice that we have used a new function,
MPI_Reduce.
The process of collecting data is so common that MPI includes
functions that automate this process. The idea behind
MPI_Reduce is to specify a data item to be
accumulated, a storage location or variable to accumulate in, and an
operator to use when accumulating. In this example, we want to add up
all the individual areas, so area is the data to
accumulate, total is the location where we
accumulate the data, and the operation is addition, or
MPI_SUM.
More specifically, MPI_Reduce has seven arguments.
The first two are the addresses of the send and receive buffers. The
third is the number of elements in the send buffer, while the fourth
gives the type of the data. Both the send and receive buffers hold
the same number of elements, which will be of the same
type. The fifth argument identifies the operation used to combine
elements; MPI_SUM is used to add elements. MPI
defines a dozen different operators. These include operators to find
the sum of the data values (MPI_SUM), their
product (MPI_PROD), the largest and smallest
values (MPI_MAX and MPI_MIN),
and operators for both logical and bitwise
AND, OR, and XOR (MPI_LAND,
MPI_BAND, MPI_LOR,
MPI_BOR, MPI_LXOR, and
MPI_BXOR). The data type must be compatible with
the selected operation.
The next-to-last argument identifies the root of the
communication, i.e., the rank of the process that will accumulate
the final answer, and the last argument is the communicator. Both
must have identical values in every process. Notice that only the
root process will have the accumulated result. If all of the
processes need the result, there is an analogous function
MPI_Allreduce that is used in the same way.
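For reference, here are the prototypes for both functions;
MPI_Allreduce simply drops the root argument, since every process
receives the result:
int MPI_Reduce(void *sendbuf, void *recvbuf, int count,
               MPI_Datatype datatype, MPI_Op op, int root,
               MPI_Comm comm);
int MPI_Allreduce(void *sendbuf, void *recvbuf, int count,
                  MPI_Datatype datatype, MPI_Op op, MPI_Comm comm);
For example, if every process in our integration program needed the
final answer, we could replace the MPI_Reduce call with:
MPI_Allreduce(&area, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);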
Notice how the use of MPI_Reduce has simplified
our code. We have eliminated a control structure, and, apart from the
single root parameter in our call to MPI_Reduce, we
no longer need to distinguish among processes. Keep in mind that it
is up to the implementer to determine the best way to implement these
functions. Details will vary. For example, the
"broadcast" in
MPI_Bcast simply means that the data is sent to
all the processes. It does not necessarily imply that an
Ethernet-style broadcast will be used, although that is one obvious
implementation strategy. When implementing for other networks, other
strategies may be necessary.
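For instance, one common strategy is a tree-structured broadcast. The
sketch below is our own illustration, not any particular library's
code, and the helper name treeBcast is hypothetical. In each round,
every process that already holds the data forwards it to one process
that doesn't, so the broadcast completes in roughly log2(noProcesses)
steps rather than noProcesses - 1:
#include "mpi.h"
/* Illustrative only: one way a library might broadcast an int from rank 0 */
void treeBcast(int *value, int processId, int noProcesses, MPI_Comm comm)
{
    int step;
    MPI_Status status;
    for (step = 1; step < noProcesses; step *= 2) {
        if (processId < step) {
            /* this process already has the data; forward it */
            if (processId + step < noProcesses)
                MPI_Send(value, 1, MPI_INT, processId + step, 0, comm);
        } else if (processId < 2 * step) {
            /* receive from the process 'step' ranks below */
            MPI_Recv(value, 1, MPI_INT, processId - step, 0, comm, &status);
        }
    }
}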
In this chapter we have introduced the six core MPI
functions (MPI_Init,
MPI_Comm_size, MPI_Comm_rank,
MPI_Send, MPI_Recv, and
MPI_Finalize) as well as several others that
simplify MPI coding. These six core functions have been described as
the six indispensable MPI functions, the functions that you really
can't do without. Moreover, most MPI
programs, with a little extra work, could be rewritten with just
these six functions. Congratulations! You are now an MPI
programmer.