Professor Norm Matloff
Dept. of Computer Science
University of California at Davis
Davis, CA 95616
(Please mail any questions to Norm Matloff.)
Go to the JIAJIA home page.
Unpack JIAJIA and go to the lib subdirectory. Find the platform subdirectory that matches your system, cd to it, and run make.
If the machines on which you will be running JIAJIA do not allow the use of rsh and use ssh instead, then before building JIAJIA you need to change the file src/init.c. On line 249, replace "rsh" by "ssh".
If you are using JIAJIA on a network of workstations with a common file system, you can configure JIAJIA to take this into account, so that JIAJIA will not needlessly copy your executable to the various nodes. Set this by adding -DNFS to the CFLAGS line in lib/Makefile.common. Note, though, that there seems to be a bug in this NFS support, at least in Version 2.1. I have written a patch, which replaces src/init.c. It has a restriction that the .jiahosts file must be in your home directory.
Also, JIAJIA prints out rather voluminous performance statistics. You may find it helpful to pipe your output through the UNIX more command, or, if you do not want these statistics at all, you can comment out the lines in jia_exit() that print them when you build JIAJIA. (It seems that removing -DDOSTAT from lib/Makefile.common does not accomplish the desired effect.)
The key point is that (simulated) shared memory is set up via calls to jia_alloc(), similar to the ordinary C malloc(). There are also calls for lock/unlock, barriers, etc.
See my JIAJIA sample program for the general structure, and also a second version of that program.
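As a rough illustration of that general structure (this is not the sample program itself), here is a minimal sketch. It assumes the JIAJIA calls jia_init(), jia_alloc(), jia_barrier() and jia_exit() and the globals jiapid and jiahosts, as declared in jia.h. The HAVE_JIAJIA macro, the single-process stand-ins, and the function name fill_and_sum() are inventions for this sketch, there only so that it also compiles on a machine without JIAJIA.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef HAVE_JIAJIA   /* hypothetical macro: define it when building against JIAJIA */
#include <jia.h>
#else
/* Single-process stand-ins, NOT JIAJIA code; they only let the sketch compile alone. */
static int jiapid = 0, jiahosts = 1;   /* this node's number, number of nodes */
static void jia_init(int argc, char **argv) { (void) argc; (void) argv; }
static unsigned long jia_alloc(int size) { return (unsigned long) malloc((size_t) size); }
static void jia_barrier(void) { }
static void jia_exit(void) { }
#endif

/* Every node runs this same function.  Each node fills its own contiguous
   slice of a shared array of n ints; node 0 then sums and prints the array.
   Returns the sum at node 0, and 0 at the other nodes. */
int fill_and_sum(int argc, char **argv, int n)
{
    int *a, i, lo, hi, sum = 0;

    jia_init(argc, argv);                          /* must precede the other jia_ calls */
    a = (int *) jia_alloc(n * (int) sizeof(int));  /* shared counterpart of malloc() */
    jia_barrier();                                 /* all nodes have allocated */

    lo = (int) ((long long) n * jiapid / jiahosts);        /* this node's slice [lo, hi) */
    hi = (int) ((long long) n * (jiapid + 1) / jiahosts);
    for (i = lo; i < hi; i++)
        a[i] = i;

    jia_barrier();                                 /* the writes are now visible to all nodes */

    if (jiapid == 0) {
        for (i = 0; i < n; i++)
            sum += a[i];
        printf("sum = %d\n", sum);
    }
    jia_exit();
    return sum;
}
```

In a real program this body would normally sit directly in main(), with argc and argv passed straight through to jia_init().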
Make sure your application code contains a line like
Run gcc as usual, but make sure that you use the -I and -L command-line options for the JIAJIA /src and /lib/X subdirectories, respectively, where X refers to your platform; the latter subdirectory should be the one which contains the file libjia.a. Also specify -ljia (and -lm if you need it).
I am assuming here that you are running JIAJIA on a set of machines which share a common file system.
Also, make sure that the UNIX command rsh works. If your system uses only ssh, you'll need to set up passwordless access.
Set up a .jiahosts file in the directory in which your application's executable resides. Write one line in the file for each node to be used in the JIAJIA program, with the line consisting of the node's network name, your username and 0, say:
pc12 matloff 0
pc14 matloff 0
pc16 matloff 0
Then simply type the name of the executable (and command-line arguments, if any), at node 0 (i.e. pc12 above). This latter point is crucial.
The JIAJIA library redefines stdout on all nodes numbered greater than 0. Thus something like printf() will work normally on node 0, but its output will be sent to a file on the other nodes. The name of the file will be of the form x-i.log if JIAJIA is configured for NFS (i.e. shared file system among the nodes), x.log otherwise, where x is the name of the application and i is the node number.
In JIAJIA, as in any other program, you should avoid using printf() for debugging, and instead use a debugging tool such as ddd or gdb. See my debugging Web page if you have not been doing so in the past. Things are a bit more involved in the case of parallel-processing programs, so also see my parallel debugging guide.
It is important to note that JIAJIA and other page-based DSMs rely on their own page-fault handlers, which in turn operate using UNIX signals. Since your debugging tool, say gdb, will halt at all signals, that means that your debugging tool will halt frequently for no reason related to your debugging goals, and thus you must disable such halts. The command for this in gdb (note that ddd has a command window which you can use to instruct gdb) is
(gdb) handle 11 nostop noprint
Note also that since the nodes are all in constant communication with each other, your debugging tool may appear to "hang" on one node at an unexpected place. This is natural, since that node is waiting for the others. The solution is to step further through the other nodes so that the communications can take place; eventually the debugger will stop hanging at the node in question.
In order to minimize network traffic, JIAJIA uses Scope Consistency. This means that a change made to a variable at one node will not be propagated to other nodes unless the programmer takes proper care. There are two ways to do this, locks and barriers:
(Note that this is time-consuming, and if one needs the barrier operation without performing such updates, JIAJIA offers the jia_wait() function.)
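A minimal sketch of both mechanisms in use follows, assuming jia_lock(), jia_unlock() and jia_barrier() take the arguments shown. The lock number, the HAVE_JIAJIA macro, the single-process stand-ins and the function add_to_total() are illustrative inventions, there so that the sketch compiles without JIAJIA. In real use, total would point into memory obtained from jia_alloc().

```c
#ifdef HAVE_JIAJIA   /* hypothetical macro: define it when building against JIAJIA */
#include <jia.h>
#else
/* Single-process stand-ins, NOT JIAJIA code; they only let the sketch compile alone. */
static void jia_lock(int id) { (void) id; }
static void jia_unlock(int id) { (void) id; }
static void jia_barrier(void) { }
#endif

#define TOTAL_LOCK 0   /* arbitrary lock number chosen for this sketch */

/* Adds this node's contribution to a shared total, then waits for all nodes.
   Returns the total as seen by this node after the barrier. */
int add_to_total(int *total, int mine)
{
    jia_lock(TOTAL_LOCK);      /* one node at a time updates the total */
    *total += mine;            /* update made while holding the lock */
    jia_unlock(TOTAL_LOCK);    /* the next holder of this lock sees the update */

    jia_barrier();             /* after this, every node has added its share */
    return *total;
}
```

The lock protects the read-modify-write of the shared total, while the barrier ensures that no node reads the final value before all contributions are in.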
If you have an "ordinary" seg fault in your program, i.e. a pointer or array error which does NOT involve shared memory, JIAJIA may still report it as a shared-memory error. It is important to keep this in mind.
JIAJIA has a number of features which can be exploited to make your code run much faster. Beginners should probably not worry about these, but for serious JIAJIA programming, use of these features is virtually mandatory.
Each word in shared memory will have a home node. The location of the home will be set by default if one uses the simplest version of jia_alloc(), while the more sophisticated ones allow one to specify a home. If one chooses the latter option, one can write one's application code in such a way that most shared-memory accesses made at a node are "local," i.e. are to portions of the shared memory for which this node is home. This can be a highly significant factor in the speed of the application.
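One common way to arrange for mostly local accesses is to give each node a contiguous block of the data and have that node be the home of its block. The helper below computes only the index range; it is a plain-C sketch with hypothetical names (range_t, my_block()), and it assumes the array was allocated with a jia_alloc() variant that makes node k the home of block k.

```c
/* Half-open index range [lo, hi). */
typedef struct {
    int lo;
    int hi;
} range_t;

/* Returns the block of an n-element array assigned to node `me` out of
   `nnodes` nodes.  The blocks are contiguous, cover 0..n-1 exactly once,
   and differ in size by at most one element. */
range_t my_block(int n, int me, int nnodes)
{
    range_t r;

    r.lo = (int) ((long long) n * me / nnodes);
    r.hi = (int) ((long long) n * (me + 1) / nnodes);
    return r;
}
```

Each node would call my_block(n, jiapid, jiahosts) and confine its main loop to indices r.lo through r.hi - 1, so that, under the stated assumption about homes, nearly all of its accesses are to memory for which it is the home.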
In addition, JIAJIA allows allocation of shared memory to homes at the "block" (i.e. subpage) level, via the most sophisticated version of jia_alloc().
The home of a word is fixed by default, but one can allow the home to migrate, using jia_config().
See the manual in the /doc subdirectory of the JIAJIA package for details on these and many other special performance features. See also the clearer documentation on the Web for JUMP, a DSM package derived from JIAJIA.