%% This LaTeX-file was created by <root> Tue Feb 15 21:54:11 2000
%% LyX 1.0 (C) 1995-1999 by Matthias Ettrich and the LyX Team

%% Do not edit this file unless you know what you are doing.
\documentclass[twocolumn]{article}
\usepackage[T1]{fontenc}
\setlength\parskip{\medskipamount}
\setlength\parindent{0pt}

\makeatletter


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands.
\providecommand{\LyX}{L\kern-.1667em\lower.25em\hbox{Y}\kern-.125emX\@}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands.
\usepackage[T1]{fontenc}
\setlength\parskip{\medskipamount}
\setlength\parindent{0pt}

\makeatletter


\setlength{\oddsidemargin}{-0.5in}
\setlength{\evensidemargin}{-0.5in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textwidth}{7.0in}
\setlength{\textheight}{9.5in}
\setlength{\parindent}{0in}
\setlength{\parskip}{0.1in}
\setlength{\columnseprule}{0.4pt}

\makeatother
\makeatother

\begin{document}

Name: \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

\textbf{Directions: Work only on this sheet (on both sides, if needed); do not
turn in any supplementary sheets of paper. There is actually plenty of room
for your answers, as long as you organize yourself BEFORE starting writing.
In order to get full credit, SHOW YOUR WORK.}

\textbf{1.} (10) Fill in the blank with a three- or four-letter abbreviation
for a general software category, or else the name of a specific software package:
If we have a hypercube machine and wish to use the shared-memory programming
paradigm, we should use \_\_\_\_\_\_\_\_\_\_\_\_.

\textbf{2.} (15) In the PVM program on p.52 of our text, which PVM library function
is analogous to MPI's MPI\_Comm\_rank()?

\textbf{3.} (15) Suppose the sorting algorithm described in Section 5.3.2 of
our text is implemented in MPI. Show how to code the first recv() at the top
of p.148 in MPI. Assume the numbers are integers and the message type is stated
in a \#define as NUMBER\_TYPE. Make sure to write not only the call to MPI\_Recv()
itself but also the code which determines the value of i.

\textbf{4.} (15) Consider our example MPI program which solves systems of linear
equations. Rewrite the line

\begin{verbatim}
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
\end{verbatim}

using MPI\_Send() and/or MPI\_Recv().

\textbf{5.} (15) Look at the sample MulSim program in our printed lecture notes.
Suppose we had forgotten to include calls to LOCK() and UNLOCK(). Let wrong\_cc
denote the final value of CompositeCount under this circumstance, and let true\_cc
denote the correct value. Which of the following statements is correct? (a)
wrong\_cc \( \leq  \) true\_cc, (b) wrong\_cc \( \geq  \) true\_cc, (c) wrong\_cc
could be either larger than, smaller than or equal to true\_cc.

\textbf{6.} (15) Write UCTuplets code for a use-once barrier. Make your code
as efficient as possible.

\textbf{7.} (15) In the text and in class, it was mentioned that the butterfly
barrier in Section 6.1.4 could be used to implement an all-gather operation.
Show the complete MPI code for implementing MPI\_AllGather() for in this manner
for the case of \( P_{2} \) on p.166, using MPI\_Send() and MPI\_Recv(). Show
only the code executed by \( P_{2} \) not the code executed by the other \( P_{i} \).

\textbf{Solutions:}

1. SDSM or DSM.

\textbf{2.} pvm\_mytid().

\textbf{3.}

\begin{verbatim}
MPI_Comm_rank(MPI_COMM_WORLD,&Me);
...
MPI_Recv(&Number,1,MPI_INT,Me-1,MPI_ANY_TAG,MPI_COMM_WORLD,
   &Status); 
\end{verbatim}

\textbf{4.}

\begin{verbatim}
if (my_rank == 0)
   for (m=1; m < p; m++) 
       MPI_Send(&n,1,MPI_INT,m,N_MSG,MPI_COMM_WORLD);
else MPI_Recv(&n,1,MPI_INT,0,N_MSG,MPI_COMM_WORLD,
   &Status); 
\end{verbatim}

\textbf{5.} (a)

\textbf{6.} Note that in the code below, we do not use a \textbf{while} loop,
and the first field in our call to rd() is ``si'', not ``sp''.

\begin{verbatim}
// initialize
if (UCTNodeNum == 0) 
   out("si","barrier",0);
...
// barrier action starts here
in("sp","barrier",&Count); 
out("si"."barrier",++Count);
if (Count < UCTNWorkNodes)
   rd("si","barrier",UCTNWorkNodes); 
\end{verbatim}

\textbf{7.} Suppose each node will contribute K \textbf{int}s, held in LocalArray,
with the result of the gather going into to the 8K-element FullArray. 

In the first stage, \( P_{2} \) must send its K numbers to \( P_{3} \) and
receive K numbers from the latter, putting them in the proper place, which is
starting at FullArray{[}3{*}K{]}:

\begin{verbatim}
MPI_Send(LocalArray,K,MPI_INT,3,AG_MSG,MPI_COMM_WORLD);
MPI_Recv(FullArray+3*K,K,MPI_INT,3,AG_MSG,MPI_COMM_WORLD,
   &Status)
\end{verbatim}

In the second stage, \( P_{2} \) will send 2K numbers to \( P_{0} \), \( P_{2} \)'s
own K numbers, plus \( P_{3} \)'s K numbers, which \( P_{2} \) had received
during the first stage:

\begin{verbatim} 
MPI_Send(LocalArray,K,MPI_INT,0,AG_MSG,MPI_COMM_WORLD);
MPI_Send(FullArray+3*K,K,MPI_INT,0,AG_MSG,MPI_COMM_WORLD);
\end{verbatim}

Then \( P_{2} \) must receive 2K numbers from \( P_{0} \), \( P_{0} \)'s
own K numbers, plus \( P_{1} \)'s K numbers, which \( P_{0} \) had received
during the first stage. We will not show the rest of the code here, but it would
continue along these lines.

Note that we would need to write the code so that sends and receives do not
produce deadlock, say by having lower-numbered partners send first during stages
1 and 3, and higher-numbered partners sending first during stage 2. We would
also have to have the code coordinate correctly; the code for \( P_{0} \) during
stage 2, for instance, would consist of two receives, the first being to FullArray+2{*}K
and the second to FullArray+3{*}K.

\end{document}

