One of the reasons why C has become so popular is that it allows the programmer to do many types of machine-level applications without resorting to using machine language as the vehicle for programming. One example of that is C's ability to work with memory addresses. Another example, or class of examples, consists of C constructs for bit-level operations, which will be described here.
C includes the bitwise-AND operator & and the bitwise-OR operator |. To AND two bits together, one merely multiplies them: 0 AND 0 = 0, 1 AND 0 = 0, 0 AND 1 = 0, 1 AND 1 = 1. To OR two bits together, we just add them, except that 1 OR 1 is equal to 1, not 2: 0 OR 0 = 0, 1 OR 0 = 1, 0 or 1 = 1, 1 or 1 = 1.
Note that
The AND operator is useful for
In the next few examples, suppose the variable X is of type int.
As an example of (a), suppose we wish to put 0s in Bit Positions 3 and 2 (the rightmost bit position being Bit Position 0), and suppose we have a machine/compiler combination which implements int variables as 4-byte quantities (again, this is typical for most workstations today). We would accomplish our goal with the statement
X &= 0xfffffff3;
Here is why: First, note that
X &= 0xfffffff3
is just a short form of
X = X & 0xfffffff3
(like X += 5 vs. X = X + 5). So we are AND-ing 0xfffffff3 with X and putting the result back in X. The constant 0xfffffff3 is, in bit form,
11111111111111111111111111110011
Most of the bits here are 1s, and since, as mentioned earlier, AND-ing by a 1 produces no change, most bits in X will not change. Only the bits at Bit Positions 3 and 2 will potentially change: They will change to 0s if they were 1s (or stay at 0s if they were 0s).
For example, after executing the statements
X = 29; X &= 0xfffffff3;
X will have the value 17 (try this yourself on the machine, and make sure you understand).
By the way, the value to be AND-ed with X, in this case 0xfffffff3, is called a mask. To see why, think of what happens when you wish to paint the walls of a room. You would like to cover up the electrical sockets so that they don't get painted; you can do this with masking tape. Well, in the example above, we wished to ``cover up'' all the bits except those at Bit Positions 3 and 2, so we AND-ed them with 1s, so that they would not change.
As an example of (b), suppose we wish to set Z = 12 if there is a 1 in Bit Position 5 of X. We could do this with the statement
if (X & 0x00000020) Z = 12;
Here is why this statement accomplishes this goal: The mask 0x00000020 has 0s everywhere except in Bit Position 5, where there is a 1. Recall from above that AND-ing with a 0 produces a 0, no matter what value it is AND-ed with, while AND-ing with a 1 results in copying the bit value. So the quantity X & 0x00000020 will have 0s everywhere except at Bit Position 5, where there will be a copy of X's original Bit 5. If the latter is a 1, then X & 0x00000020 will be nonzero (0x00000010, to be specific), and since any nonzero value is condidered `true', Z = 12 will be executed.
OR-ing is used to put 1s at specific bit positions. You should verify, and make sure you understand, that after the statements
X = 29; X |= 0x00000020;
are executed, X will be equal to 61.
The operators << and >> shift the bits in a variable toward the left or toward the right, respectively. Suppose for example that X and Y are of type int, and we execute
X = 5; Y = X << 3;
Then (say on the CAE machines) X will be, in bit form,
00000000000000000000000000000101
All these bits will be shifted left by 3 positions, and the result placed in Y. The latter will now be
00000000000000000000000000101000
which has the value 40.
Note that the bits on the left end disappear. This may create problems. For example, shifting may turn a positive number into a negative one, or vice versa, since the leftmost bit of a number tells its sign. This presents no problem, though, for variables of type unsigned, so shift operations are typically done on variables of that type, not of type int.
These arise in a special type of struct, which allows one to assign variable names to groups of bits within a variable. We will not go into this here.
Below we have a database example as an application of bit operators.
Many databases are extremely large. Just imagine, for example, how much computer space--either in memory, during program execution, or on disk, for permanent storage, or a combination of both--is required to store the records for all of the Bank of America's credit-card holders. One way to solve this problem is to ``pack'' the data; the Pascal language actually includes a provision for this, the packed array construct.
C does not have this feature, but its bit-level operators can easily be used to implement the equivalent of it. The program below does this.
1 Script started on Tue May 19 18:42:41 1992 2 heather% cc -g Packed.c -lm 3 heather% a.out 4 read in a file? 5 n 6 enter MaxN 7 3 8 enter command 9 e 10 enter ItemNum 11 7 12 enter Datum 13 3 14 enter command 15 e 16 enter ItemNum 17 5 18 enter Datum 19 2 20 enter command 21 e 22 enter ItemNum 23 8 24 enter Datum 25 2 26 enter command 27 v 28 enter output file name 29 gy 30 enter command 31 q 32 heather% !! 33 a.out 34 read in a file? 35 y 36 enter file name 37 gy 38 enter command 39 d 40 enter ItemNum 41 5 42 2 43 enter command 44 d 45 enter ItemNum 46 8 47 2 48 enter command 49 q 50 heather% cat Packed.c 51 52 53 /* example of how C's bit-level operations can be used to implement 54 packing in the Pascal sense 55 56 data consist of integers in the range 0..MaxN; each datum is 57 stored in a FieldSize-bit subset of an int variable (actually, 58 a variable of type `unsigned int', though we will just say `int' 59 in the rest of this discussion); there are IntSize/FieldSize such 60 fields per int variable, where IntSize = sizeof(int); FieldSize 61 is equal to log(MaxN+1), where the log is base-2; it is assumed 62 that FieldSize evenly divides IntSize 63 64 for example, on Sun's and most other current workstations, 65 int variables are stored in 32-bit words; suppose we wish to 66 store integers in the range 0..3; these can be stored in 2-bit 67 fields (0 = 00, 1 = 01, 2 = 10, 3 = 11); thus we can get 32/2 68 = 16 such fields into one int variable, and thus compress 69 storage requirements (either in memory or in a disk file) by 70 a factor of 16, a big savings; for example, instead of needing 71 an array of 80,000 elements, 72 73 int X[80000]; 74 75 we could fit all 80,000 data items into an array of 5,000 int 76 variables, i.e. 77 78 int DataArray[5000]; 79 80 DataArray[0] would contain what would have been (if we had not 81 packed) X[0],X[1],...,X[15]; DataArray[1] would contain what 82 would have been X[16],X[17],...,X31]; etc.; note the bit positions: 83 X[0] would be in Bits 31 and 30 of DataArray[0], X[1] would be in 84 Bits 29 and 28, etc. (calling the leftmost bit Bit 31 and the 85 rightmost bit Bit 0) */ 86 87 88 #define MaxArrayLength 100 89 #define MaxIntSize 100 90 91 92 #include <fcntl.h> 93 #include <math.h> 94 95 96 unsigned DataArray[MaxArrayLength], /* array of packed data */ 97 ItemNum, /* index the data item would have in X if 98 packing were not used */ 99 DataArrayIndex, /* index of the item within DataArray */ 100 DataArrayStartBit, /* bit position of the item within the 101 DataArray element */ 102 Datum, 103 ZeroOutA[MaxIntSize], 104 ZeroOutB[MaxIntSize], 105 All1s, 106 MaxN, /* upper bound for the value of a datum */ 107 FieldSize, /* number of bits needed to store a datum */ 108 IntSize, /* sizeof(int) */ 109 NFields; /* number of fields per DataArray element */ 110 111 112 char Command; /* d = display data item 113 e = enter new data item 114 q = exit program 115 v = save to file */ 116 117 118 InitZeroOutValues() 119 120 { unsigned Tmp = MaxN; 121 int BitPos,I; 122 123 BitPos = FieldSize - 1; 124 for (I = 0; I < NFields; I++) { 125 ZeroOutB[BitPos] = Tmp; 126 ZeroOutA[BitPos] = All1s - ZeroOutB[BitPos]; 127 BitPos += FieldSize; 128 Tmp <<= FieldSize; 129 } 130 } 131 132 133 ReadFile() 134 135 { int FD; char InFileName[20]; 136 137 printf("enter file name\n"); 138 scanf("%s",InFileName); 139 FD = open(InFileName,O_RDONLY); 140 read(FD,&MaxN,IntSize); 141 read(FD,DataArray,MaxArrayLength*IntSize); 142 } 143 144 145 Init() 146 147 { IntSize = sizeof(int); 148 All1s = -1; 149 printf("read in a file?\n"); 150 scanf("%c",&Command); 151 if (Command == 'y') ReadFile(); 152 else { 153 printf("enter MaxN\n"); 154 scanf("%u",&MaxN); 155 } 156 FieldSize = (int) log2((double) (MaxN+1)); 157 NFields = IntSize*8 / FieldSize; 158 InitZeroOutValues(); 159 } 160 161 162 Insert() /* inserts Datum at the position in DataArray appropriate 163 for the given value of ItemNum */ 164 165 { DataArrayIndex = ItemNum / NFields; 166 DataArrayStartBit = (NFields-ItemNum % NFields) * FieldSize - 1; 167 168 /* in inserting Datum into its bit field, we must take care that 169 none of the other bit fields in this element of DataArray are 170 affected; the plan is as follows: 171 172 (a) use the bitwise-AND operator & to zero-out the field in 173 which Datum will be put; 174 175 (b) use the shift-left operator << to shift Datum the proper 176 number of bit positions to make it aligned with the zeroed- 177 out field; 178 179 (c) use the bitwise-OR operator | to insert the item without 180 affecting the other fields 181 182 */ 183 184 /* (a) */ 185 DataArray[DataArrayIndex] &= ZeroOutA[DataArrayStartBit]; 186 187 /* (b) */ 188 Datum <<= DataArrayStartBit - (FieldSize - 1); 189 190 /* (c) */ 191 DataArray[DataArrayIndex] |= Datum; 192 } 193 194 195 GetCommand() 196 197 { printf("enter command\n"); 198 scanf("%c",&Command); 199 /* might be a leftover end-of-line character from the 200 previous action; if so, then read the next character 201 in the input stream */ 202 if (Command == '\n') scanf("%c",&Command); 203 } 204 205 206 InputItemNum() 207 208 { printf("enter ItemNum\n"); 209 scanf("%u",&ItemNum); 210 } 211 212 213 InputDatum() 214 215 { printf("enter Datum\n"); 216 scanf("%u",&Datum); 217 } 218 219 220 unsigned GetItem() /* displays to the screen the element which would 221 be at position ItemNum of X if we had not 222 packed */ 223 224 { unsigned Tmp; 225 226 DataArrayIndex = ItemNum / NFields; 227 DataArrayStartBit = (NFields-ItemNum % NFields) * FieldSize - 1; 228 229 /* the plan: 230 231 (a) copy the DataArray element to Tmp 232 233 (b) zero out the fields in Tmp other the one containing the 234 item of interest 235 236 (c) shift Tmp to the right to produce the item of interest 237 238 */ 239 240 /* (a) */ 241 Tmp = DataArray[DataArrayIndex]; 242 243 /* (b) */ 244 Tmp &= ZeroOutB[DataArrayStartBit]; 245 246 /* (c) */ 247 return Tmp >> (DataArrayStartBit - (FieldSize - 1)); 248 } 249 250 251 SaveToFile() 252 253 { int FD,I; char OutFileName[20]; 254 255 printf("enter output file name\n"); 256 scanf("%s",OutFileName); 257 FD = open(OutFileName,O_WRONLY|O_CREAT,0700); 258 write(FD,&MaxN,IntSize); 259 write(FD,DataArray,MaxArrayLength); 260 } 261 262 263 main() 264 265 { Init(); 266 while (1) { 267 GetCommand(); 268 switch (Command) { 269 case 'd': InputItemNum(); printf("%u\n",GetItem()); break; 270 case 'e': InputItemNum(); InputDatum(); Insert(); break; 271 case 'q': exit(); 272 case 'v': SaveToFile(); 273 } 274 } 275 } 276 277 heather% e 278 heather% 279 script done on Tue May 19 18:44:16 1992
Analysis:
First read the comments, Lines 53-85. They describe the general goal of the program. The program inputs values and stores them in a packed array. The user can ask to have them displayed, or have them written--again in packed form--to a file. The program is demonstrated on Lines 3-49. For example, on Lines 11 and 13, the user is asking to store the value 3 as the seventh item in the database, while on Lines 23 and 25, he/she is asking to store 2 into the eighth item. On Lines 27 and 29, the user is asking that the database be saved into the file `gy'.
By the way, note the `-lm' in the command line for the compile (Line 2). This tells the compiler (actually, the linker, ld, which is called by the compiler) to look in C's math library (/usr/lib/libm.a) to find any functions it can't find elsewhere. In this case, we are calling the base-2-logarithm function, log2 on Line 156, which makes the `-lm' necessary:
Note also Line 93 in this regard. We are telling the compiler to include the C source code file /usr/include/math.h (the indicates that the file is in the directory /usr/include; otherwise, we would have used quotation marks on the file name). That file contains definitions related to the math library; in our case here, it contains line stating that log2 is of type double (double-precision version of float):
extern double log2();
Without including this file, or writing this line ourselves, the compiler would mistakenly think that log2 was of the default type, int, and would likely produce erroneous machine code.
Note that unsigned type is used instead of int, for the reasons given above. Accordingly, %u format is used instead of %d, e.g. in Line 269.
Let's take a look at how the Insert() function works.
First, on Lines 165 and 166, we find out which element of DataArray, and at which bit position within that element, this slot is. The answer is DataArray[0] and Bit Position 21. This makes sense: Since MaxN is 3, i.e. we are storing numbers in the range 0..3, we need only two bits per slot. That means that DataArray[0] contains database items 0, 1, 2, ..., 13, 14 and 15, starting at Bit Positions 31, 29, 27, ..., 5, 3 and 1, respectively; in particular, Item 5 will be in this element of DataArray, at Bit Positions 21 and 20.
We must first zero-out this slot (Line 185). This will erase the old value in the slot (technically, it will change the value to 0), in preparation for using the OR operator to add in our new datum there (Line 191). This zeroing will be done by AND-ing DataArray[0] with the mask
11111111110011111111111111111111
which is stored in ZeroOut[21], having been computed earlier in the function InitZeroOutValues() (Lines 118-130). Note again that it is crucial that the other slots in DataArray[0] be left intact, which will be the case, due to all those 1s in the other bit positions in ZeroOut[1]. As a result, DataArray[0], which had previously contained
00000000000000110000000000000000
(the underlined 11 is the 3 in the slot for ItemNum = 7), will now contain
00000000000100110000000000000000
Next, we shift the new datum, 1, to the left so as to line it up with its slot (Line 188). In other words, Datum had been
00000000000000000000000000000001
and now we will shift it to the left to make it equal to
00000000000100000000000000000000
Now we are ready to OR it with DataArray[0] (Line 187), which has the effect of putting the new value, 1, into the requested slot.
The function InitZeroOutValues() uses techniques like those of Insert() and GetItem(). Note that the variable All1s (Line 122) is well-named: It was set to -1 (Line 144), which does indeed consist of a long string of only 1-bits.