Norman Matloff's MPICH/MPICH2 MPI Tutorial

Professor Norm Matloff
Dept. of Computer Science
University of California at Davis
Davis, CA 95616

(Please mail any questions to Norm Matloff.)

Installing MPICH/MPICH2

If you wish to install MPICH/MPICH2 yourself, say on your Linux box, download the source code from the MPICH/MPICH2 home page. Just unpack the software and type

./configure
make
make install

If your system uses ssh instead of rsh, add an option to the configure command:

./configure -rsh=ssh

If you don't like the default installation directory, add a -prefix option too.

Path:

Make sure your shell search path includes the directory containing mpicc, as well as the directory in which your MPICH/MPICH2 application executable will reside.
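
As a quick check of the path setup, here is a minimal sketch for a Bourne-style shell. The directory names in MPICH_BIN and APP_DIR are placeholders chosen for illustration, not MPICH defaults; csh users would instead put a line such as set path = ($HOME/mpich/bin $path) in .cshrc.

```shell
# Hypothetical locations; substitute your own install and application directories.
MPICH_BIN="$HOME/mpich/bin"
APP_DIR="$HOME/tmp"

# Put both directories at the front of the search path.
PATH="$MPICH_BIN:$APP_DIR:$PATH"
export PATH

# Report which mpicc (if any) the shell will now find.
if command -v mpicc >/dev/null 2>&1; then
    echo "mpicc found at: $(command -v mpicc)"
else
    echo "mpicc not found; check MPICH_BIN"
fi
```

Because the new directories are placed first, this also takes care of the case in which several MPI installations are present.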

Compiling MPICH/MPICH2 application programs:

Make sure that mpicc is in your search path. (If other versions of MPI are also in the path, make sure the one you want comes first.) Then compile with

mpicc -g -o binary_file_name source_file.c 

(If you wish to use C++, use mpiCC instead of mpicc; MPICH2 also installs this wrapper under the name mpicxx.)
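
To try the compile step on something small, the following sketch writes a tiny MPI program and compiles it only if mpicc is available. The file name hello_mpi.c and the variable names me and nnodes are choices made for this example.

```shell
# Write a tiny MPI test program (hello_mpi.c is a hypothetical file name).
cat > hello_mpi.c <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int me, nnodes;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &me);      /* this node's number */
    MPI_Comm_size(MPI_COMM_WORLD, &nnodes);  /* total number of nodes */
    printf("hello from node %d of %d\n", me, nnodes);
    MPI_Finalize();
    return 0;
}
EOF

# Compile only if mpicc is actually on the search path.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -g -o hello_mpi hello_mpi.c && echo "built hello_mpi"
else
    echo "mpicc not in search path; nothing compiled"
fi
```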

Running MPICH application programs:

To describe how to run an MPICH program, I will assume that it is of the "SPMD" ("single program, multiple data") type: the same program runs on all nodes, though typically on different data. Our MPI sample program is an example. (We could convert it to non-SPMD ("MPMD") form by making a "master" program out of all the code which the current version assigns to node 0, and a "slave" program out of the rest of the code.)

To run our MPI sample program, prime, we would set up a procgroup ("process group") file, listing the nodes we will use and the program(s) we will run on them, say the following file My.pg:

pc8.cs.ucdavis.edu 0 /home/matloff/tmp/prime
pc10.cs.ucdavis.edu 1 /home/matloff/tmp/prime
pc12.cs.ucdavis.edu 1 /home/matloff/tmp/prime

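(The second field is the number of processes to start on that machine in addition to the one that mpirun itself launches, so use 0 for the first machine, and 1 for all others.)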
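
If the list of machines changes often, the procgroup file can be generated rather than typed. The following is a minimal sketch for a Bourne-style shell; the variable names PROG, HOSTS and PGFILE are inventions of this example, and the host names and path are those of the file above.

```shell
# Inputs: the executable's full path and the machines to use, node 0 first.
PROG=/home/matloff/tmp/prime
HOSTS="pc8.cs.ucdavis.edu pc10.cs.ucdavis.edu pc12.cs.ucdavis.edu"
PGFILE=My.pg

: > "$PGFILE"            # start with an empty file
count=0                  # 0 on the first line, 1 on every later line
for host in $HOSTS; do
    echo "$host $count $PROG" >> "$PGFILE"
    count=1
done

cat "$PGFILE"
```

The resulting My.pg has the same three lines as the hand-written version.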

If your machines are using rsh, make absolutely sure that your $HOME/.rhosts files on these machines include the names of these machines. If the machines are using ssh, make sure that you've set things up for passwordless remote execution. On UCD CSIF machines, see these instructions.

Also make sure there are no undefined variables in your .cshrc startup file ($TERM may be one). With the procgroup file above, node 0 will be pc8, node 1 will be pc10, and node 2 will be pc12.

Then type

mpirun -p4pg My.pg prime 100 0

at node 0, in this case pc8. (The command must be issued at node 0, not at any of the other machines.)

Running MPICH2 application programs:

The key to running an MPICH2 application is the mpd daemon, one of which must run on each machine to be used by your program.

The primitive way to arrange for this is to simply type

mpd &

in a terminal window at each machine.

But it is more convenient to use mpdboot. To do this, first set up a file mpd.hosts in some directory, with the names of the machines on which you want daemons to be running; list the network names of the machines, one line per machine, e.g.

pc29.cs.ucdavis.edu
pc30.cs.ucdavis.edu

Then type

mpdboot
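
Putting the mpdboot route together, here is a hedged sketch of a full session. It assumes that MPICH2's mpdboot, mpdtrace, mpiexec and mpdallexit are in your search path, that mpdboot's -n option gives the total number of daemons to start (counting the one on the local machine) and -f names the hosts file, and that the program being run is the prime example from the MPICH section; check the MPICH2 documentation for your version before relying on the details.

```shell
# Host names taken from the example above; replace with your own machines.
cat > mpd.hosts <<'EOF'
pc29.cs.ucdavis.edu
pc30.cs.ucdavis.edu
EOF

NHOSTS=$(wc -l < mpd.hosts | tr -d ' ')

# Run the MPICH2 commands only where they are installed.
if command -v mpdboot >/dev/null 2>&1; then
    mpdboot -n "$NHOSTS" -f mpd.hosts   # start the ring of mpd daemons
    mpdtrace                            # list the machines with a running mpd
    mpiexec -n "$NHOSTS" ./prime 100 0  # run the program with NHOSTS processes
    mpdallexit                          # shut the daemons down when finished
else
    echo "mpdboot not found; wrote mpd.hosts with $NHOSTS hosts"
fi
```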

Output:

Note carefully that all the output from printf() calls will be collected and printed at the machine from which you launched the program. Output from different nodes may be interleaved, making it difficult or impossible to read. (Note: Such interleaving of printf() output is quite common in parallel systems.)
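
One possible workaround, sketched here under the assumption that every printf() call starts its line with the node number followed by a colon (a convention of this example, not something MPICH does for you), is to regroup the collected output by node afterwards with a stable sort. The sample lines below are made up.

```shell
# Made-up sample of interleaved output, each line tagged "node: message".
interleaved='1: checking 7
0: checking 3
1: checking 11
0: checking 5'

# Sort numerically on the node tag; -s keeps each node's lines in their original order.
grouped=$(printf '%s\n' "$interleaved" | sort -s -t: -k1,1n)
printf '%s\n' "$grouped"
```

In practice the same sort command could be applied to the saved output of a run; the -s (stable) option is supported by the GNU and BSD versions of sort.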

Debugging:

Don't use printf() calls for most of your debugging. Your debugging will be much easier and faster if you use a debugging tool, such as gdb. To use gdb with MPICH/MPICH2, follow the directions given in my parallel debugging guide, at http://heather.cs.ucdavis.edu/~matloff/pardebug.html.