Instructions on How to Run MPI, OpenMP and CUDA Programs

Sachin Kumawat and Norm Matloff

This is a quick overview of running parallel applications with MPI, OpenMP and CUDA. The recommended platform is Unix (which includes Linux and Mac OS X), but useful (though untested!) links are provided for Windows tools as well. Follow these instructions to gain familiarity with the tools before starting the actual assignments and taking quizzes.

CRUCIAL NOTE: When you take the quizzes, the various executables (gcc, mpicc, mpiexec, R and python) must be in your search path, as the OMSI script will invoke them. A similar statement holds for library paths. Thus it is absolutely essential that you do a dry run of OMSI before the first quiz.

CSIF

The gcc compiler is OpenMP-capable.

Both MPI (mpicc, mpiexec) and CUDA (nvcc) toolchains are installed on CSIF machines.

Your Laptop

For our quizzes, you will need gcc, MPI and R for running code, and Python for running our OMSI quiz tool. CUDA for quizzes will just be "pencil and paper" style, no actual compiling/running.

As noted, your version of gcc must be OpenMP-capable. To test that, download omp_hello.c, compile and run:

gcc -g omp_hello.c -fopenmp 
./a.out

This will probably fail on a Mac; see below for the remedy.

For the programming assignments, you will also need gcc, MPI and R.

If you have a CUDA-compatible video card, you may install CUDA, but be prepared to resolve some obstacles. Installation can be performed by following the instructions on the CUDA Toolkit's homepage. The setup is rather involved, but the majority of the issues are discussed here.

Installation of OpenMP capable C/C++ compiler and MPI tools

2.1 Compiling OpenMP programs with GCC
2.2 Running OpenMP applications

3.1 Running CUDA applications

1.1 Installing MPICH2 (for personal systems only, skip for CSIF systems)

Unix-family systems

Download the latest stable MPICH2 source files from the MPICH2 Downloads page. Extract the tarball, enter the extracted directory in a terminal and type the following:

./configure
make
make install

The default installation prefix is /usr/local, which places the executables in /usr/local/bin, but you can choose your preferred installation location by modifying the configure command as:

./configure --prefix=/your/installation/directory

Now add the bin and lib subdirectories of the installation directory to the corresponding environment variables. In a Bash shell (the default on most Linux systems):

export PATH=/your/installation/directory/bin:$PATH
export LD_LIBRARY_PATH=/your/installation/directory/lib:$LD_LIBRARY_PATH

You can check that everything is in order by running the following, which should report mpicc and mpiexec in the corresponding bin directory:

which mpicc
which mpiexec

Note that the same procedure should be followed to set up the MPI tools on every node (physical computer) on which you plan to run a multi-node application. This is not necessary on CSIF, since the machines there share a file system.

Windows

On Windows, a version of MPICH called MS-MPI can be used along with Visual Studio. Download it directly from Microsoft's website. Compilation and launch instructions are provided here.

1.2 Running MPI applications with MPICH2

Set Up Remote Authentication:

MPI implementations work by invoking programs on other nodes via ssh or an equivalent daemon. Therefore, before you can run MPI programs, you must set up passwordless login, once, from one MPI machine to another. To set up passwordless login on CSIF (or any) systems, check FAQ 7.9 and 7.10 of csif-general-faq.

Compiling MPICH2 Program:

To compile an MPI program written in C, type:
```
mpicc -g -o binary_file_name source_file.c 
```
For example, for a program PrimePipe.c, make an executable prp this way:
```
mpicc -g -o prp PrimePipe.c
```
(You may need to specify the full path to prp.)
(If you wish to use C++, use mpicxx instead of mpicc.)

Running MPICH2 application:

Set up a hosts file, listing which machines you wish your MPI app to run on, e.g. hosts3:
```
pc28.cs.ucdavis.edu
pc29.cs.ucdavis.edu
pc30.cs.ucdavis.edu
```
To run the above executable prp on the machines listed in the above hosts file, type
```
mpiexec -f hosts3 -n 3 prp 100 0
```
where 100 and 0 are the command-line arguments to prp.

2.1 Compiling OpenMP programs with GCC

The GNU gcc/g++ compilers can compile programs with OpenMP directives right out of the box. Therefore no installation/configuration is required on Linux systems (OS X is an exception, see below). To enable OpenMP support for a program hello_openmp.c, simply compile with the flag -fopenmp as:

gcc -fopenmp hello_openmp.c -o hello_openmp

OpenMP for Windows:

Visual Studio support for OpenMP is outdated, hence it is recommended to use GCC on Windows by installing either Cygwin or MinGW. For Visual Studio, instructions to enable OpenMP support are provided here.

OpenMP for Mac OS X:

The default clang compiler on OS X does not support OpenMP. Since gcc on OS X is just a symbolic link to clang, using the default gcc/g++ will not work either. You need to install the latest Homebrew version of gcc (e.g. v6.x) and add its location to the PATH environment variable:
```
brew install gcc
export PATH=/usr/local/bin:$PATH
```
The OpenMP program can then be compiled with:
```
gcc-6 -fopenmp hello_openmp.c -o hello_openmp
```
Note that you will need to alias gcc to gcc-6.

The install took 85 minutes when I tried it. Note that it will install in /usr/local/Cellar. I also found that I needed to make sure that /usr/bin/as is ahead of /opt/local/bin in PATH.

(Use g++/g++-6 for C++ applications.)

2.2 Running OpenMP applications

To run an OpenMP application, first specify the number of threads using the OMP_NUM_THREADS environment variable. For example, to launch 8 threads, type:

setenv OMP_NUM_THREADS 8

under tcsh; under bash, use export OMP_NUM_THREADS=8 instead. If OMP_NUM_THREADS is not set, by default as many threads are launched as there are available cores. Now simply run the executable:

./hello_openmp

3.1 Running CUDA applications

CUDA is installed on CSIF systems at /usr/local/cuda-8.0, and you can obtain details about the GPU card installed on a particular system by typing nvidia-smi in a terminal. CUDA programs are compiled by invoking the nvcc compiler. It links with the CUDA runtime library and also calls gcc to link with the C/C++ runtime libraries. A CUDA program hello_cuda.cu, which contains both host and device code, can simply be compiled and run as:

/usr/local/cuda-8.0/bin/nvcc hello_cuda.cu -o hello_cuda
./hello_cuda

CUDA for Windows:

Visual Studio provides support to directly compile and run CUDA applications. Instructions for installation and sample program execution can be found here.
