Instructions on How to Run MPI, OpenMP and CUDA Programs

Sachin Kumawat and Norm Matloff

This is a quick overview of running parallel applications with MPI, OpenMP and CUDA. The recommended platform is Unix (including Linux and Mac OS X); useful (but untested!) links are also provided for Windows tools. Follow these instructions to gain familiarity with the tools before starting the actual assignments and taking the quizzes.

CRUCIAL NOTE: When you take the quizzes, the various executables (gcc, mpicc, mpiexec, R and Python) must be in your search path, as the OMSI script will invoke them. A similar statement holds for library paths. Thus it is absolutely essential that you do a dry run of OMSI before the first quiz.


  • The gcc compiler is OpenMP-capable.
  • Both MPI (mpicc, mpiexec) and CUDA (nvcc) toolchains are installed on CSIF machines.
  • Your Laptop

    For our quizzes, you will need gcc, MPI and R for running code, and Python for running our OMSI quiz tool. CUDA for quizzes will just be "pencil and paper" style, no actual compiling/running.

    As noted, your version of gcc must be OpenMP-capable. To test that, download omp_hello.c, compile and run:

    gcc -g -fopenmp omp_hello.c -o omp_hello
    ./omp_hello

    For the programming assignments, you will also need gcc, MPI and R.

    If you have a CUDA-compatible video card, you may install CUDA, but be prepared to resolve some obstacles. Installation can be performed by following the instructions on the CUDA Toolkit homepage. The setup is rather involved, but the majority of common issues are discussed here.

    Installation of an OpenMP-capable C/C++ compiler and MPI tools


    1.1 Installing MPICH2 (for personal systems only, skip for CSIF systems)

    Unix-family systems

    Download the latest stable MPICH2 source files from the MPICH2 Downloads page. Extract the tarball, enter the extracted directory in a terminal and type the following:
    ./configure
    make
    make install
    By default the tools are installed under /usr/local/bin, but you can choose your own installation location by passing a prefix to the configure command:
    ./configure --prefix=/your/installation/directory
    Now add the bin and lib subdirectories of the installation directory to the corresponding environment variables. In a Bash shell (the default on Linux systems):
    export PATH=/your/installation/directory/bin:$PATH
    export LD_LIBRARY_PATH=/your/installation/directory/lib:$LD_LIBRARY_PATH
    You can check that everything is in order by running the following and verifying that mpicc and mpiexec are found in the corresponding bin directory:
    which mpicc
    which mpiexec
    Note that the same procedure must be followed to set up the MPI tools on every node (physical computer) on which you plan to run a multi-node application. This is not necessary on CSIF, since the machines there share a file system.


    On Windows, MS-MPI, a Microsoft derivative of MPICH, can be used along with Visual Studio. Download it directly from Microsoft's website. Compilation and launch instructions are provided here.

    1.2 Running MPI applications with MPICH2

    Compile MPI programs with mpicc and launch them with mpiexec, specifying the number of processes with the -n flag. For a source file named, say, mpi_hello.c:

    mpicc -g -o mpi_hello mpi_hello.c
    mpiexec -n 4 ./mpi_hello
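    A minimal MPI program for testing the toolchain might look like the sketch below (the filename mpi_hello.c is just an example); each process reports its rank:

```c
/* mpi_hello.c -- compile with:  mpicc -g -o mpi_hello mpi_hello.c
   and run with, e.g.:           mpiexec -n 4 ./mpi_hello            */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);                  /* start the MPI runtime    */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's ID        */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes */
    printf("hello from rank %d of %d\n", rank, size);
    MPI_Finalize();                          /* shut down the MPI runtime */
    return 0;
}
```

    With -n 4 you should see four hello lines, in nondeterministic order.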

    2.1 Compiling OpenMP programs with GCC

    The GNU gcc/g++ compilers can compile programs with OpenMP directives right out of the box, so no installation/configuration is required on Linux systems (OS X is an exception; see below). To enable OpenMP support for a program hello_openmp.c, simply compile with the -fopenmp flag:

    gcc -fopenmp hello_openmp.c -o hello_openmp
    (Use g++/g++-6 for C++ applications.)

    2.2 Running OpenMP applications

    To run an OpenMP application, first specify the number of threads using the OMP_NUM_THREADS environment variable. For example, to launch 8 threads under tcsh, type:

    setenv OMP_NUM_THREADS 8

    Under bash, the equivalent is:

    export OMP_NUM_THREADS=8

    If OMP_NUM_THREADS is not set, by default one thread per available core is launched. Now simply run the executable:

    ./hello_openmp

    3.1 Running CUDA applications

    CUDA is installed on CSIF systems at /usr/local/cuda-8.0, and you can obtain details about the GPU installed in a particular system by typing nvidia-smi in a terminal. CUDA programs are compiled by invoking the nvcc compiler, which links with the CUDA libraries and also calls gcc to link with the C/C++ runtime libraries. A CUDA program, which contains both host and device code, can simply be compiled and run as:

    /usr/local/cuda-8.0/bin/nvcc hello_cuda.cu -o hello_cuda
    ./hello_cuda
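    For reference, a minimal hello_cuda.cu might look like the sketch below (it assumes a CUDA-capable GPU and driver, and illustrates the host/device split mentioned above):

```cuda
/* hello_cuda.cu -- a minimal program with both host and device code. */
#include <stdio.h>

/* Device code: each CUDA thread prints its block and thread index. */
__global__ void hello_kernel()
{
    printf("hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

int main(void)
{
    /* Host code: launch 2 blocks of 4 threads each. */
    hello_kernel<<<2, 4>>>();
    /* Wait for the kernel to finish so its output is flushed. */
    cudaDeviceSynchronize();
    return 0;
}
```

    You should see 8 hello lines, one per thread, in nondeterministic order.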