Introduction to the PuDB Python Debugging Tool

Professor Norm Matloff
University of California, Davis
(author of The Art of Debugging, NSP, 2008)

Overview:

Pound for pound, PuDB is one of the nicest debugging tools I've seen. Its simplicity and eye appeal are great for those who haven't used debugging tools before, and it has sufficient features to be usable by experienced programmers as well. Its compact footprint makes it nice for debugging client/server network code. For complex work, say threads code, I would suggest WinPDB (which, in spite of its name, is cross-platform, NOT specific to MS Windows), but I am using PuDB myself quite productively.

The Principle of Confirmation:

The way that a debugging tool helps you find your bugs is that it makes it easy to apply what I call the Principle of Confirmation:

Keep confirming that what you are "sure" is true, really is true. Eventually you'll find something that doesn't confirm, and then you'll probably have pinpointed the location of a bug.

Are you "sure" that your variable w has the value 88 on line 20? Confirm it! Are you "sure" that in a certain if/else statement, the code takes the else branch in a certain situation? Confirm it! A debugging tool makes this easy to do, by allowing you to execute your program line by line, querying the values of variables along the way.

Installation:

Try typing

easy_install pudb

into a command (i.e. shell) window. (On Linux or Mac platforms, you'll need to prefix the command with sudo.) If you don't have easy_install, download the package from the PuDB home page, unpack it, go to the resulting directory and type

python setup.py install

You may also need urwid. Try running PuDB first, but if it says it needs urwid, you can install it on Ubuntu, for instance, via

sudo apt-get install python-urwid

Quick pictorial introduction:

As our example, let's debug code that does a binary search. We have a sorted list, say [5,12,13], and a value we wish to insert into the list. Our function findinspt() will return the index of the insertion point. Say we wish to insert 6 into [5,12,13]. Then the insertion point would be 1, since the 6 would go just before 12, which has index 1 in the list. The insertion point for 1 would be 0, since 1 would go before the 5. The insertion point for 16 would be 3, meaning that it would go past the last element, 13, which has index 2.
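As a quick sanity check on those three insertion points, here is a small sketch that uses bisect_left() from Python's standard library. That function is not used anywhere in this tutorial's code; it is brought in here only as an independent reference. The sketch is written in Python 3 syntax, unlike the Python 2 listing that follows.

```python
from bisect import bisect_left

y = [5, 12, 13]

# insertion points for the three values discussed above
points = {xnew: bisect_left(y, xnew) for xnew in (6, 1, 16)}

for xnew, idx in points.items():
    print(xnew, "->", idx)
# 6 -> 1
# 1 -> 0
# 16 -> 3
```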

Here's our code, in the file binsearch.py:

# example program to demonstrate PuDB debugger; finds insertion point in
# a sorted list

def findinspt(x,xnew):  # returns insertion point of xnew in x
   n = len(x)
   lo = 0
   hi = n-1
   while True:
      mid = (lo + hi) / 2
      if xnew > x[mid]: lo = mid + 1
      else: hi = mid
      if xnew == x[mid]: return mid

y = [5,12,13]
print findinspt(y,3)
print findinspt(y,8)
print findinspt(y,12)
print findinspt(y,30)

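The idea of a debugging tool, again, is to let us execute the program line by line, checking the values of variables along the way.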

So, let's run our code under PuDB. To do this, we type

python -m pudb.run binsearch.py

in a command window. The result will look like this:

We see our code in the main box on the left, with highlighting on the line we will next execute, which is

def findinspt(x,xnew):  # returns insertion point of xnew in x 

You might think it odd that we are about to "execute" that definition, but this is how Python works. The action to be "executed" is defining a function.
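To see that a def really is executed like any other statement, consider this small sketch (Python 3 syntax; the function greet() is a made-up name used only for illustration). Calling the function before its def line has run fails, because the name does not exist yet.

```python
# Try to call greet() before the def statement below has been executed.
try:
    greet()
    called_early = True
except NameError:
    # the name greet is not bound yet, since its def has not run
    called_early = False

def greet():          # executing this line binds the name greet
    return "hello"

print(called_early)   # False
print(greet())        # hello
```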

But let's go down to the first line of "real" execution,

y = [5,12,13]        

We do this by hitting the n key ("next"). The screen now looks like this:

Sure enough, the line

y = [5,12,13]        

is now highlighted, meaning that we are about to execute it. To do so, hit n again, with the following result:

Note that in the Variables section at the upper-right, our variable y is shown, but its value is not. To display it, we use the arrow keys to navigate over there (right-arrow, then down-arrow to y), and hit Enter. We get a dialog window:

We now arrow-key our way down to the Expanded entry, hit Enter, then right-arrow to OK, and hit Enter again. The screen now looks like this:

So, we see the values of y now. We now hit the left-arrow key to return to the code subwindow.

At this point, we have two choices. One would be to hit n again, going to the next print statement. Or we could hit the s ("step into the function") key, resulting in our going to the line

n = len(x)                  

and pausing there. I usually prefer to use n at first, only using s when I've narrowed down the problem to a given function. So, let's try hitting n.

Since something is to be printed, PuDB takes us back to the shell command line:

But nothing ever gets printed--the program is hanging! The dreaded infinite loop! So, hit ctrl-c to kill the program. PuDB then tells us "[PROCESSING EXCEPTION - hit 'e' to examine]." Heeding that suggestion, we hit the e key, and get another pop-up window:

This tells us that when we hit ctrl-c, our program was executing line 8, in a call from line 15. This information is probably not very helpful in this case, but often it is useful. Close the pop-up window by selecting Close.

What we need to do now is restart the program, and this time hit s rather than n when we have that choice as above. To restart, hit q, then select Restart.

Now hit n twice as above, but then hit s to enter findinspt(). Once inside, hit n a few more times. The screen will look something like this:

Note that the local variables of findinspt() are displayed in the upper-right of the screen. As before, we could display x too if desired. The Stack subwindow is telling us the call sequence, similar to the pop-up we saw earlier; we are now at line 10, from a call at line 15.

Applying the Principle of Confirmation, we check that at each step (not shown here), the variables mid, lo and so on have the values we think they should have. This does check out.

However, after going through a couple of iterations, we see that lo and hi eventually both are equal to 0, never changing. And inspection of the code shows that they never will change.
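What we watched in the Variables box can be reproduced outside the debugger with a rough sketch like the following. It is written in Python 3 syntax (// for integer division), and the helper trace_findinspt() and its max_iters cap are additions made here, not part of binsearch.py. The loop body is the original buggy one; the cap just keeps it from running forever.

```python
def trace_findinspt(x, xnew, max_iters=6):
    """Run the original loop at most max_iters times.

    Returns (result, history), where history holds the (lo, hi, mid)
    values at the end of each pass that did not return.
    """
    lo = 0
    hi = len(x) - 1
    history = []
    for _ in range(max_iters):
        mid = (lo + hi) // 2
        if xnew > x[mid]:
            lo = mid + 1
        else:
            hi = mid
        if xnew == x[mid]:
            return mid, history
        history.append((lo, hi, mid))
    return None, history   # gave up: the loop never returned

result, history = trace_findinspt([5, 12, 13], 3)
for lo, hi, mid in history:
    print("lo =", lo, " hi =", hi, " mid =", mid)
# first pass:  lo = 0  hi = 1  mid = 1
# every later pass:  lo = 0  hi = 0  mid = 0
```

Once lo and hi are both 0, mid is computed as 0 again, the else branch sets hi to 0 again, and nothing ever changes.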

Intuitively, it should be clear that once we hit a situation with lo = hi, we should exit the loop, and return the proper value. At this point xnew should either be equal to x[lo], or less than it. So, we change the code accordingly:

# example program to demonstrate PuDB debugger; finds insertion point in
# a sorted list

def findinspt(x,xnew):  # returns insertion point of xnew in x
   n = len(x)
   lo = 0
   hi = n-1
   while True:
      mid = (lo + hi) / 2
      if xnew > x[mid]: lo = mid + 1
      else: hi = mid 
      if xnew == x[mid]: return mid
      if lo == hi: return lo

y = [5,12,13]
print findinspt(y,3)
print findinspt(y,8)
print findinspt(y,12)
print findinspt(y,30)

We could now run the code in a separate window to see if it works, but we can also do that from within PuDB. First restart, using the process we discussed above, resulting in this screen:

Note that PuDB did load the new version of our code, with the extra line we added. To run, we could hit n three times as before, but let's just hit c ("continue"). We get this screen:

We are invited to restart, which we do, but then we hit o, to see what our program had printed out:

Hmmm...almost right, except that the last value should be 3, not 2.
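For readers following along without PuDB at hand, here is a sketch of this intermediate version in Python 3 syntax. The name findinspt_v2 is used here only to mark it as the second version, and the list of expected values is simply the set of insertion points we want for 3, 8, 12 and 30.

```python
def findinspt_v2(x, xnew):  # second version: exits when lo == hi
    lo = 0
    hi = len(x) - 1
    while True:
        mid = (lo + hi) // 2
        if xnew > x[mid]:
            lo = mid + 1
        else:
            hi = mid
        if xnew == x[mid]:
            return mid
        if lo == hi:
            return lo

y = [5, 12, 13]
got = [findinspt_v2(y, v) for v in (3, 8, 12, 30)]
wanted = [0, 1, 1, 3]

print(got)     # [0, 1, 1, 2]
print(wanted)  # [0, 1, 1, 3]
```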

We'll need to restart again, which we do. But this time, let's expedite things by going straight to a line of interest within findinspt(). This one looks good:

if xnew > x[mid]: lo = mid + 1 

So we set a breakpoint there, by using the down-arrow key to navigate to that line, and then hitting b. The new screen is:

However, since findinspt() seems to work correctly for the first three values we are trying here (3, 8, 12), we should impose a condition on our breakpoint--stop only when xnew is 30. To accomplish this, use the arrow keys to navigate to the Breakpoints box, then select this breakpoint (the only one we have right now). We get this screen:

Navigate to Condition, then type

xnew == 30

in the box:

Then select OK, and hit left-arrow to return to the code box.
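Here is a rough sketch of what such a condition amounts to, assuming PuDB treats it as an ordinary Python expression evaluated with the paused function's variables (this is how Python's standard bdb module handles breakpoint conditions). The helper would_stop() is hypothetical and is not part of PuDB. Python 3 syntax is used.

```python
def would_stop(condition, local_vars):
    """Evaluate a breakpoint condition string against a dict of variables."""
    return bool(eval(condition, {}, local_vars))

condition = "xnew == 30"

# the four values passed to findinspt() in our test program
for xnew in (3, 8, 12, 30):
    print(xnew, would_stop(condition, {"xnew": xnew}))
# 3 False
# 8 False
# 12 False
# 30 True
```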

Now hit c. Here is the result:

As you can see, xnew is indeed 30. The preceding part of the program has been executed, but without pausing until now.

Now hit n a couple of times, resulting in:

So, lo and hi are both 2. Our next line,

if lo == hi: return lo 

will thus return the value 2--but it should be 3. We see we need a special case for the situation in which xnew is larger than the last element of x. We change our code in that portion of the program to

if lo == hi:
   if xnew <= x[lo]: return lo
   else: return lo+1

We then try it, and see that it works.
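For reference, here is a sketch of the whole function with that last change folded in, written in Python 3 syntax (// and print() as a function) rather than the Python 2 of the listings above. The comparison against bisect_left() from the standard library is an extra check added here; it covers only our three-element list y, and says nothing about other inputs such as empty or one-element lists.

```python
from bisect import bisect_left

def findinspt(x, xnew):  # returns insertion point of xnew in x
    lo = 0
    hi = len(x) - 1
    while True:
        mid = (lo + hi) // 2
        if xnew > x[mid]:
            lo = mid + 1
        else:
            hi = mid
        if xnew == x[mid]:
            return mid
        if lo == hi:
            if xnew <= x[lo]:
                return lo
            else:
                return lo + 1   # xnew goes past x[lo]

y = [5, 12, 13]
print([findinspt(y, v) for v in (3, 8, 12, 30)])   # [0, 1, 1, 3]

# Principle of Confirmation: compare with bisect_left for many values
mismatches = [v for v in range(0, 20) if findinspt(y, v) != bisect_left(y, v)]
print(mismatches)   # []
```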