Goals of this Assignment:
Here you will develop some routines and data structures to store and manipulate sparse matrices, meaning those whose elements include lots of 0s. If one is working with very large sparse matrices--matrices in the real world today can have rows and columns numbering in the thousands or even millions--we may wish to store only the nonzero elements, to save space.
Before going into the details, note the following:
Any compression scheme for sparse matrices consists of specifying information on "which" and "what":
The "which" part represents overhead, as ordinary matrix storage doesn't have it, but if the matrix is "sparse enough," we will in the end use less space.
Here we will restrict to matrices whose elements are only 0s and 1s, motivated by graph applications (the existence of an edge from vertex i to vertex j is coded as a 1 in row i, column j). Thus the "what" part above is not an issue.
The "which" part will be specified by stating where clumps of 1s are located. Roughly speaking, the storage structure will say something like, "There are two 1s starting at row 2, column 5, and six 1s starting and row 4, column 1, and ..."
It is assumed that the uncompressed form of a matrix is stored as a standard C/C++ two-dimensional array, i.e. row and column numbers corresponding to row-major ordering. This is true even if code has declared the array one-dimensional. Note that a clump of 1s may straddle two or more rows.
Again motivated by the graph context, we will assume that the number of rows and columns are equal, and that any matrix has at least one 1.
A compressed matrix will be stored as an object of a C++ class, with these specifications:
class smatrix { public: int nrc; // number of rows and columns in the matrix int nclumps; // number of clumps // cmatrix contains the compressed form of the matrix; // cmatrix[2*j] = r and cmatrix[2*j+1] = s means there is // a clump of s 1s beginning at (1-dimen.) position r of the // uncompressed matrix int *cmatrix; // compressed form of the matrix int nclumps; // number of clumps // constructor inputs rc x rc uncompressed fullmatrix smatrix(int rc, char *fullmatrix); smatrix operator+ (smatrix& summand2); // boolean sum printm(void); // prints uncompressed form of "this" to screen, row by row };
The addition operator is for merging two graphs that have the same set of vertices. There will be a link from i to j in the merged graph if there is one in either or both of the summands. It is assumed that the two operands are of the same size (you do not have to check for this).
You may add other member variables and methods to the class if you wish, say to help your coding, but do not change anything above. For instance, do NOT add any arguments to the methods.
The purpose of the code is to save memory space. Thus, do not uncompress a matrix, even temporarily, during the addition or print operations.
Example:
int main() { char m1[4] = {0,1,1,0}; smatrix sm1(2,m1); char m2[9] = {0,1,1,0,0,1,1,1,0}; smatrix sm2(3,m2); char m3[9] = {0,1,0,1,0,1,1,1,1}; smatrix sm3(3,m3); sm2.printm(); smatrix sm4 = sm2 + sm3; ... }
After executing the above, you should have
sm1.cmatrix: nclumps = 1 cmatrix = 1 2 sm2.cmatrix: nclumps = 2 cmatrix = 1 2 5 3 sm3.cmatrix: nclumps = 3 cmatrix = 1 1 3 1 5 4 sm4.cmatrix: nclumps = 2 cmatrix = 1 3 5 4
The call to printm() for sm2 should display
0 1 1 0 0 1 1 1 0
HIGHLY IMPORTANT NOTE:
The ability to use a debugging tool is crucial for software development. You should be using one for every program you write--for your OWN benefit. You can download Chapter 1 of my book on debugging for free here, to learn not only the mechanics of the tools but also the philosophy of effective debugging. (There is no need for you to buy the book.)
During the interactive grading session, the TA will ask you to demonstrate the operation of a debugging tool, e.g. setting a breakpoint or checking the value of a variable, on your code for this assignment. The tool is of your choice, but it must be one of the official ones on CSIF.