The Linda Parallel Processing Paradigm (and Its Descendant, Jini/Javaspaces)

The Linda parallel programming environment, which started out as a Yale University research project and a commercial product, is a something of a "compromise" between the shared-memory and message-passing paradigms. Communication between processors is handled through a tuple space where processors post and read messages. Its principles are now being used in exciting new languages such as the Jini/Javaspaces language by Sun Microsystems, which was modeled after the Linda concept. IBM has a similar project, called TSpaces.

Since the tuple space is shared by everyone, the "look and feel" a programmer gets is somewhat similar to that of the shared-memory world view. On the other hand, the posting and reading of tuples is similar to the sending and receiving of messages in a message-passing system.

An advantage of this approach is that processing elements can enter and leave the computation pool at will, without announcing their arrival or departure. Processing elements do not send to or receive from specific nodes. Suppose for example we have a large computational problem, say finding a new very large prime number, in which many volunteers across the Internet will participate. The tuple-space approach allows them to do so easily and for whatever amount of time they wish to contribute.

Tuples are like vectors, except that their components can be of various mixed types, i.e. integers, floating-point numbers, character strings, and so on. For example, a tuple could consist of

("abc", 1.5, 12)

Tuples are created, and added to the tuple space, using the out() function. For instance, if the variable Y, say, is declared in the C/Linda program as

int Y;

and currently Y is equal to 4, then the call

out("wxyz",Y,200)

would create and add to the tuple space the tuple

("wxyz",4,200)

Linda applications typically have character-string values for the first components of tuples, with these being roughly analogous to the role of names of shared variables and message types in the shared-memory and message-passing paradigms, respectively.

Processors obtain tuples from the tuple space via the rd() and in() calls. These are identical except that in() removes the tuple from the tuple space while rd() leaves it there. Tuples are selected on a matching basis, with the `?' operator being used by a requesting processor to receive one or more of the tuple components. For example, the call

rd("uvw", 18, ? A, ? B);

will search the tuple space for a 4-component tuple whose first two components are the string "uvw" and the integer 18. The values of the last two components will be assigned to A and B.

Here is a brief example of how all this can be used. Suppose each processor has a variable Y, and we want to compute and print out the sum of all the Y values. Suppose further that each processor has an i.d. number, stored in NodeNumber. Then we could use the following code:

if (NodeNumber == 0) out("sum of all Ys",0);
in("sum of all Ys", ? Tmp);
Tmp += Y;
out("sum of all Ys",Tmp);
/* barrier code would go here */
if (NodeNumber == 0)  {
   in("sum of all Ys", ? Tmp);
   printf("%d\n",Tmp);
}

Since all processors would execute this code, it would compute the global sum as desired. Note that no lock or other synchronization operations are needed; since a processor REMOVES the tuple from the tuple space by calling in(), there is no danger that the summing operation would be nonatomic.

There are many implementations of Linda. The Glenda version, for instance, runs on top of PVM, while POSYBL is standalone. Our implementation, UCTuplets, is also standalone.

One consideration is how to implement the `?' operator without a special compiler. Glenda handles this with a preprocessor, while POSYBL uses function calls instead of `?'.

Example: The simplest example is that in the UCTuplets package.

Programming tips:

The tuple space idea often obviates the need for lock variables. Since an in() operation removes a tuple from the tuple space, no other process can touch it until we execute an out().
It is crucial to keep in mind that the tuple space is not ordered. In particular, it is not a first-in, first-out structure, i.e. a queue. Thus an in() or rd() operation will not necessarily access the oldest among several matching tuples. Falsely assuming order in the tuple space is a very common source of bugs for beginning Linda programmers.