Computer Networks, by Andrew Tanenbaum (third edition)
Supplementary Notes by ProfessorNorman Matloff

p.43:

SNA is an old IBM mainframe network protocol stack.

p.46:

The phrase "802 LAN address" refers, for example, to the Ethernet ID. The formal standard for Ethernet is called IEEE 802.3. Another kind of LAN, token ring, is 802.4, etc.

p.59:

The term latency, in a computer context, refers to transit time for a bit, e.g. from a source node to a destination node in a network. This is very different from the term bandwidth, which means the number of bits per second one can place onto a communications channel. For an analogy, imagine a very long but very high-pressure garden hose. If you turn on the faucet, it will be a long time before the first drop of water comes out of the nozzle (long latency), but once the water starts coming, it will come out at a high rate (high bandwidth).

p.57:

A metropolitan area network (MAN) is on a somewhat larger scale than a LAN.

DQDB is a form of LAN, with two cables instead the Ethernet's one (with other significant differences too).

p.59:

The use of the counter described here is an example of the notion of a credit-based system, which is used in some multiprocessor systems as well.

RS-232 is the name of a physical-layer protocol for connecting a computer and a modem.

TYMNET was essentially like today's ISPs, offering services similar to Telnet and FTP. If a company subscribed to TYMNET, people all over the world could log in to it with a local phone call.

p.60:

The frame-relay protocol is called "bare bones" here because of the minimal nature of its services. It does not offer packet disassembly/assembly, error processing, and so on, which are up to the users.

p.61:

The term circuit-switching refers to the fact that bandwidth is reserved for the connection. When you talk on the phone, a certain amount of bandwidth is reserved for your conversation, even during the times when you and the person you are talking to are both silent. This wastes bandwidth, but guarantees low latency. If the phone system for voice were packed-switched, some of the packets carrying your voice may have to queue at various points, thus causing a delay, which would be frustrating to both of you.

p.79:

In the discussion of baud rate, the author is describing what would happen if we were to code the 3-bit string 000 as 0 volts, 001 as 1 volt, 010 as 2 volts, etc.

The root-mean-square (rms) amplitudes tell us which frequencies (harmonics, i.e. multiples of the fundamental frequency f0) in the signal (your voice, data, etc.) are important. This in turn tells us whether the frequencies cut off by our transmission medium make much difference or not.

Figure 2-1 is a good illustration of this. The pictures show what happens as we use more and more of the frequencies (i.e. if the medium cuts off fewer and fewer frequencies). In the left-hand pictures, we can see that the truncated waveform is getting closer and closer to the original one (which is at the top). The bottommost picture, which shows the signal truncated to its first 8 harmonics, actually seems to be a fairly good approximation to the original signal, which we can see from the picture on the top-right, which shows that by 8 harmonics we have captured most of the signal's power.

Figure 2-2 then shows the consequences.

p.184

The basic idea is that if there are errors during transmission, the receiver will hopefully notice that what it received is not a valid codeword.

For example, think of asynchronous transmission, where we send 8 data bits and 1 parity bit (ignore the start and stop bits), with even parity. In the notation on this page, this would mean m = 8 and r = 1. Then for example 011101000 is a valid codeword, but 011101001 isn't. (I'm taking the parity bit to be the rightmost bit.) So, if the receiver receives 011101001, it knows there was an error (actually, an odd number of errors).

There are many possible coding systems. Adding a parity bit is one form of a coding system, adding CRC bits is another form, and there are many others.

For any given coding system, it would be nice to be able to not only detect the presence of errors, but also correct the errors. Here is how Hamming distance could be used to do this:

The idea would be that whenever the receiver receives an invalid codeword (for whatever coding system is used; we are not just talking about parity now), it will search the codeword space to find the codeword which is closest to the bit string that was received, where "closest" is defined in terms of Hamming distance. Of course, this is not guaranteed to work in error correction, just like parity and CRC are not guaranteed to work in error detection, but it does seem like a reasonable approach.

In evaluating many different types of coding systems, it would be nice to have some kind of "measure" of each system. As Tanenbaum points out, the Hamming distance of a code system---defined to be the minimum Hamming distance among all pairs of codewords in that system---is an appealing measure. (Be careful of the "overloading" of terms here: The Hamming distance between two codewords, and the overall "measure" of a given code system.) The reason for this is that a code system with Hamming distance d+1 will be able to correct all any message containing d or fewer errors, as Tanenbaum shows.

p.185:

Starting with the third full paragraph, and extending to the next page, Tanenbaum is motivating something called the "Hamming code system," or more briefly, the "Hamming code." (As on p.184, be careful of the overloading of the word "Hamming." We now have three different notions with that word in their names: The Hamming distance between two codewords in a given code system; the Hamming distance of a code system; and a specific code system named the "Hamming code.) All of it concerns code systems which are guaranteed to correct single-bit errors.

Paragraph 3 gives some analysis as to how many codewords such a system must have. The key to Tanenbaum's argument consists of partioning the set of all codewords (valid and invalid) into groups, with one group for each possible message. Within a group, one has the message itself, with no errors, plus all codewords which would result when a single bit of the message was erroneous. There are n of the latter, thus n+1 codewords in each group. Tanenbaum then uses this to show how big r must be for a given m.

p.232:

Suppose a "real" (i.e. permanent) Interent node N wishes to send an IP packet to a PC attached to the Interent via a PPP connection. Normally the Data Link layer at N would wrap the IP packet in an Ethernet frame and put the frame on the Ethernet that N is on. But here, the Data Link layer at N would instead wrap the packet in a PPP frame.

Note that the PPP format is an ordinary HDLC format. (One extra field, for Protocol, is added, but that could be considered part of the data, and thus we could still use HDLC chips, for example.)

In HDLC, the code 0xff in the Address field means "send to all stations." That is meaningless here, since the only other station besides N is the PC, but this code is as good as any.

Normally PPP does not do sequencing, leaving that to higher layers such as TCP. Thus in the Control field, we put the code 0x03, which in HDLC means "no sequence numbers."

All PPP frames use the format given in Fig. 3-27, including the initial LCP and NCP configuration frames, which are simply considered protocols. LCP, for example, is identified by the protocol code 0xc021.

When you first make a call to a PPP server at an ISP, both sides exchange LCP frames to agree on the terms of the PPP connection. Then the initiator of the call sends NCP frames to state which protocols it wants to use, e.g. IP, IPX, etc. Then, data from several different protocols may be sent in PPP frames, with the Protocol frame in each field stating which protocol this frame is using. The code for IP is 0x0021.

p.251:

Tanenbaum says that p-persistent CSMA is only for the slotted case. This is not true.

p.253:

The expression t- e should be t0 + t- e and so on.

p.255:

First think of the low-traffic case, where only our node has something to send.

Number the nodes from 0 to N-1. What would happen if we are node 0? At the time our message is formed, we could be anywhere in the contention period (the "scan"); it could be node 0's turn now, node N-1's turn, etc. On average, though, we will be at node N/2's turn. In that case, we at node 0 will have to first wait for the current scan to complete, which will be N/2 time slots, then at the beginning of the next scan mark our intention to send, and then wait for the rest of that scan to complete, taking N time slots. In other words, we will have to wait 3N/2 time slots before we send.

On the other hand, suppose we are node N-1. Again, when our message is produced, it will on average be node N/2's turn. We then wait N/2 time slots for the end of the scan, where we will mark our intention to send. We then send immediately afterwards, since no one else would have marked their intention to send (remember, we are assuming the low-traffic case). So, here our wait would only be N/2.

The "average" situation is that in which we are node N/2. Our wait will be halfway between the two situations analyzed above, i.e. halfway between 3N/2 and N/2, i.e. N.

So, for us to send one message, our average wait will be N, and the message itself takes time d. Therefore our utilization u = d/(N+d).

Now, what about the high-traffic case, in which everyone always has a message ready to send? A scan/send cycle will now consist of N slots for the scan, and N*d slots for the n transmissions. Thus u = N*d/(N+N*d) = d/(d+1).

By the way, if you are wondering how Tanenbaum got

limk ® ¥(1-1/k)k-1 = 1/e

the answer is that this is from L'Hospital's Rule in calculus.

p.272:

The analysis in the third full paragraph is misleading. One could not get away with using fewer than 100 chips per bit and still maintain the same error rate.

Note that `T is actually equal to -T, since a 0 is represented by -1. This is used in the second line after Equation (4-5).