Professor Norm Matloff
Dept. of Computer Science
UC Davis
530-752-1953
matloff@cs.ucdavis.edu
The UCTuplets package is an implementation of the Linda parallel programming paradigm, featuring functionality on two levels:
Both packages run on networks of Unix workstations of a given architecture (e.g. a network of SPARCstations). The user writes in C or C++ and makes library calls to the UCTuplets system.
The base version should run on almost any Unix machines connected through network (TCP/IP) access. Note, however, that no provision is made for translating data between little- and big-endian machines. Thus the machines must be either all little-endian or all big-endian. Also, the package can be run by invoking one's UCTuplets application program several times on the same machine, say for development purposes.
The research version.has been tested successfully on Linux, SPARC, HP and SGI platforms. It must be run on a homogeneous set of workstations, e.g. all Linux PCs or all SPARCs, using the same compiler on all machines, as it depends on functions within the application program having the same (virtual) address in each instance of the program.
Just type "make" to build either package. On SPARCs, first do an
include of
Your program must include a line That line will make the following variables available to your program: Of these, UCTNodeNum, UCTWorkNodes (total number of machines minus the 1
tuple manager node), UCTArgc, and UCTArgv are the ones of main use. Note that
the latter two pertain only to the arguments which are for the application
itself. Thus for example argv[5] for the program as a whole is identical to
UCTArgv[1]. Programs are written in SPMD (single program, multiple data) form, meaning
that all nodes other than node 0 execute the same program. Even node 0, which
is the tuple manager and thus does not do computation, runs the same program.
Say the application source code file is x.c and the executable is x, and we
have 5 nodes. Then x runs on all 5 nodes, the only difference being that on
node 0 x calls the tuple manager function, while at the other nodes x calls
Worker(), which does the computation of the application. The Worker() function, written by the user, is required. This is where the
work of the application is performed. It should be of type int, with no
arguments. The body of the user's main() function should consist of just one line: UCTInit() will call the tuple manager in the case of node 0, and call
Worker() in the other cases. The usual Linda in(), out() and rd() functions are available. Tuple
components can be of type string, int, float, pointer or int/float array. The first component of any tuple specification in calling these functions,
which is required to be of type string, will specify the types of the other
components, using this scheme: In a call to in() or rd(), components of a tuple specification are
either matching or nonmatching, i.e. are used in
specifying criteria for finding a matching tuple in the tuple space.
The p, I and F types are used to capture the values of nonmatching
components (instead of using the `?' operator of classical Linda). For
example, the call
would direct the tuple manager to find a 3-component tuple whose first
component is "abc", and whose second component is equal to the current value
of X. The value of the third component would be copied to N. In a call to
out(), the p type would not be used. An I or F component must be preceded by an int component in the case of a
call to out(), the int component being used to store the length of the array,
and be preceded by a pointer component in the case of a call to in() or rd(),
with the length of the array being placed in the location specified by the
pointer upon return from the call. Note that for in() and rd(), an I
or F field is considered to be a pointer and thus nonmatching; matching
is not allowed for arrays. For example:
Note: Apparently the array Y above must be local, not
global. (This seems to be due to some problem with use
of varargs.)
The rules here are the same as for the base version, except that the first
code in the type-specifying component of a tuple must be either `r' ("remote")
or `l' ("local"). The remote operation simply means the base system, while
the local operation indicates intelligent-tuple operation, as follows: The main idea of intelligent tuples is for worker nodes to offload some of
their work to the tuple server. This is not motivated by situations in which
the node running the tuple server is especially fast; instead, the point is to
avoid communications costs. To illustrate this idea, consider a simple example in which we have a
counter which must be incremented every time a worker node reaches a certain
spot in the program. In other implementations of Linda, including the base
version of UCTuplets, one would do the incrementing with code something
like: The problem with this is that it requires two trips through the network,
thus double the communication overhead. It would be much nicer if a worker
node could simply ask the tuple manager to do the incrementing of the "count"
tuple, returning the old value to the worker node. The idea of intelligent
tuples is to allow such an operation, using a call to the UCTuplets library
function tmexec() ("tuple manager execute"). In the first argument in that
call, the user would specify a user-written function which the tuple manager
is to execute. The user-written function will consist mainly of calls to
in(), out() and rd(), all with the `l' prefix, i.e. local. The function tmexec() itself has the following call structure (the
following line is part of UCTInclude.h): The first argument is the address of the user-specified function; the
second argument is an array of input arguments for that user function; the
third is the length of that array in bytes; the fourth is an array of output
arguments for the user function; the fifth is a pointer which will, after the
call to tmexec(), point to the length of the output array. Here is how our example above could be written. First, our user-specified
function would be something like: To command the tuple manager to execute this function, we call
tmexec(): The key point is that the in() and out() calls in FetchAndAdd() will be
done locally by the tuple manager, thus saving network communication. Use the Unix set command to set UCTDIR to the UCTuplets directory (which
has bin, lib and include as subdirectories). (It is assumed here that either
the base version or the research version of UCTuplets is installed here, not
both.) To compile a base-version application, say x.c, type:
On SPARCs, add -lsocket to the compile command line.
For a research-version application, use g++ instead of gcc. The format for running an UCTuplets application "by hand" (as opposed to
via the UCTRun script, described below) is where:
Say x has arguments 10 and 4. Then to run x, say at machines u, v and w,
on port 1991, we would type: on machine u, and then type the same thing at v and w except that the 0
would be replaced by 1 and 2, respectively.
Some more detail on the port number: You do not have to know TCP/IP or
know what a port number is in order to run UCTuplets. Just choose a
number above 1024, and run the netstat command, say as "netstat --inet"
if that option exists, and scan through it to make sure the port number
you wish to use does not appear anywhere in the output. Don't worry
about what the numbers mean.
Don't reuse the same port number within a 3-4 minute period.
UCTRun, in the bin subdirectory, can be used to avoid having to manually
start the UCTuplets application at each machine. Its command-line format
is where machine_file is a text file listing the machines to be used
(u, v, and w in the example above), each on a separate line. You
would run the script from u. Make sure that you have set up your .rhosts file
for this. In the case of the research version of UCTuplets, you may
need to edit the first line of UCTRun to indicate the full path for perl
on your system. One can attach gdb to running UCTuplets processes, in the manner described
in my page on
parallel debugging for MPI and similar packages. However, a nice aspect
of UCTuplets is that it is easy to simply start each UCTuplets manually via
gdb to begin with. Norman Matloff conceived the
functionality and structure of the base and research versions of UCTuplets.
The software was implemented by Michael Foster, David Standring and Robert
Sweeney, who also designed and performed the initial empirical
investigations.How to write an UCTuplets application
program:
Include files and UCT variables:
#include <UCTInclude.h>
char UCTTMName[MAX_TM_LENGTH];// tuple manager node name
int UCTPortNum, // port number for tuple manager
UCTNodeNum, // number of this node
UCTNWorkNodes; // total number of work nodes
int UCTArgc; // replacement for argc
char **UCTArgv; // replacement for argv[]
Program
format:
UCTInit(argc,argv);
Tuple format and operations:
s: string
i: int
f: float
p: pointer to int or float
I: int array
F: float array
in("sip","abc",X,&N);
int Y[1000],N;
in("spI","def",&N,Y); // get array and its length
// do some work on Y
out("siI","def",N,Y); // put array back in the tuple space
How to write an intelligent-tuples program:
in("sp","count",&I);
out("si","count",I);
I++;
void tmexec (void*, char*, int, char*, int*);
void FetchAndAdd(int * Ins, int InsL, int * Outs, int * OutsL)
{ int I;
in("lsp","count",&I);
memcpy(Outs,I,sizeof(int));
*OutLengthPtr = sizeof(int);
out("lsi","count",++I);
}
tmexec(&FetchAndAdd,Ins,InsL,Outs,&OutsL);
memcpy(I,Outs,*OutsL);
How to compile and run an application program:
gcc -g -o x x.c -I$UCTDIR/include -L$UCTDIR/lib -luct
app_name t_mgr_node port_num node_num tot_nodes app_args
x u 1991 0 3 10 4
UCTRun machine_file port_number app_name app_args
Debugging:
Authors: