


\documentstyle[11pt,psfig]{article}

\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}
\setlength{\topmargin}{-0.3in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textwidth}{6.5in}
\setlength{\textheight}{9.0in}
\setlength{\parindent}{0in}
\setlength{\parskip}{0.05in}

\begin{document}

\title{Chapter 5 \\
Modular Programming, Viewed at the Machine Level}

\author{Norman Matloff \\
University of California at Davis}

\date{December 8, 1996}

\maketitle

In the introductory programming courses you took, you probably were
taught about {\bf modular} and {\bf top-down} programming, which involve
breaking a program into a number of essentially self-contained pieces.
Typically the main program should be fairly short, perhaps a page, and
should consist in large part of calls to subprograms, such as {\bf
functions} in the C language.  The goal is to better organize one's
thinking during the original programming process, and to make future
maintenance and revision of the program easier.

This idea will also be very important for us here, for
several reasons.

\begin{itemize} \item Assembly-language programming can benefit from the
top-down approach, just as programming in high-level languages does.
(In assembly language, the usual term for a subprogram is {\bf
subroutine}, sometimes abbreviated to {\bf routine}.)

\item In some settings we must write our program as a mixture of two or
more languages.  For instance, we might write most of a program in a
high-level language such as C, but write part of it in assembly language
or even another high-level language.  In some cases, for instance, one
of the subtasks that our program must perform has already been written
by someone else in a language other than our favorite.  

A good example of a mixed C and assembly-language program is the Unix
operating system, say the Berkeley version.  For portability and ease
of programming, the Unix source code is mostly written in C, but there
do remain some machine-dependent portions in assembly language.

Of course, mixed-language programs are automatically modular to some
degree.

\item Some frequently-used operations are coded into efficient procedures
and then stored in files or directories called {\bf libraries}, so
that many different programs can make use of them (``Why re-invent
the wheel?'').  This too results in increasing modularization.

\item Even if we work purely with a single high-level language, the
``look under the hood'' theme of this book implies that it is
important to know how compilers translate calls to subprograms
in high-level languages.  For example, it is important to know
that the top-down philosophy, though good from the point of view
of human efficiency, has a harmful effect on machine efficiency,
actually slowing execution speed to some degree.

Knowing how compilers translate calls to subprograms is also
absolutely crucial to the mixed-language programming mentioned
above.

\end{itemize}

\section{Stacks}

A {\bf stack} is an area in main memory which we use for temporary
storage.  We introduce the idea here, because it will play a central
role in function calls.  The stack which we use for this purpose is
sometimes called the {\bf run-time stack}, and most often called ``the''
stack; we will adopt the latter terminology here.

The stack can be {\it any} contiguous block of words in memory, fixed at
one end (the {\bf base}) and movable at the other end (the {\bf top}).
Classical CISC machines such as Mac-1 have a dedicated register, a
stack pointer (SP), to point to the current top-of-stack location.
RISC machines tend to use one of the general-purpose registers as
a stack pointer;\footnote{The SPARC architecture uses a very different
scheme, called {\bf register windows}.  We will not discuss it here.}
DLX, for instance, uses r14.  In order to include both approaches in
our discussion, we will use the term ``stack register'' (SR) to refer
to either the SP or the general-purpose register serving as an SP.

For example, on Mac-1 suppose we initialize SP to 0x5060.  Suppose we
then {\bf push} the values 0x0057, 0x10c4 and 0x0012 onto the stack, in
that order.  Then the stack would look like this (memory addresses
appear on the left, contents to their right, and ``comments'' on the far
right):
 
\begin{verbatim}
0x505d      0x0012      top
0x505e      0x10c4
0x505f      0x0057
0x5060      ____    base (not used for our stack storage)

    Figure 5.1
\end{verbatim}

At this time c(SP) will be 0x505d.  Note that the stack grows toward
0x0000; each item which is pushed onto the stack makes c(SP) decrease by
1.  (If this were DLX, the contents of r14 would go down by 4 with each
push.)  Note also that there {\it is} something in 0x5060, but as far as the
stack is concerned, we don't care what is there.

To remove the top item from the stack, we say that we {\bf pop}
the stack.  If we pop the stack in Figure 5.1, we get the setting
of Figure 5.2:

\begin{verbatim}
0x505e      0x10c4      top
0x505f      0x0057
0x5060      ____    base (not used for our stack storage)

    Figure 5.2
\end{verbatim}

At this time c(SP) will be 0x505e:  It had been 0x505d, but the pop
{\it added} 1 to it, just as a push {\it subtracts} 1.
Note that the value 0x0012 which had been at the top of the stack
before is still there, i.e. c(0x505d) is still 0x0012, but 
this item is no longer considered to be part of the stack.
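
The push and pop operations just described can be mimicked in a few
lines of C.  This is only a sketch, not Mac-1 code:  the array mem, the
variable sp and the 64K-word memory size are our own assumptions, chosen
so that the addresses of Figure 5.1 fit.

\begin{verbatim}
#include <stdint.h>

#define MEM_WORDS 0x10000          /* assumed memory size, in words */

static uint16_t mem[MEM_WORDS];    /* word-addressed main memory */
static uint16_t sp = 0x5060;       /* stack pointer, starting at the base */

/* Push: move the top toward 0x0000, then store the value there. */
void push(uint16_t value)
{
    sp = sp - 1;
    mem[sp] = value;
}

/* Pop: read the top word, then move the top back toward the base.
   The word itself is not erased from memory. */
uint16_t pop(void)
{
    uint16_t value = mem[sp];
    sp = sp + 1;
    return value;
}
\end{verbatim}

Calling push() with 0x0057, 0x10c4 and 0x0012, in that order, leaves sp
at 0x505d as in Figure 5.1.  A subsequent pop() returns 0x0012 and leaves
sp at 0x505e, while mem[0x505d] still holds 0x0012.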

Mac-1 and most other CISC architectures include instructions and
addressing modes which make push and pop operations more efficient.  For
example, suppose we wish to push the value 22 onto the stack in
Mac-1.  There are both primitive ways and advanced ways to do this.  We
will usually use the advanced way, but a discussion of the primitive way
will help deepen our understanding of stack operations.

Here is the primitive way:

\begin{verbatim}
      desp       ;  subtract 1 from SP
      loco 22    ;  put 22 in AC
      stol 0     ;  copy AC to memory location pointed to by SP+0, i.e. by SP

             Program 5.1
\end{verbatim}

But Mac-1, being a CISC machine, has a special instruction which 
reduces the length of the code above:

\begin{verbatim}
      loco 22    ;  put 22 in AC   
      push       ;  push it onto the stack

             Program 5.2
\end{verbatim}

Note that {\bf push} is an instruction, just as {\bf lodd}, {\bf addd}
and so on are.  It has an op code (0xf400), and goes through the Step A/Step
B/Step C sequence just as any instruction does.  Even though {\bf push}
combines the {\it actions} of a {\bf desp} and a {\bf stol} (compare
Programs 5.1 and 5.2), you should not think of {\bf push} as
``consisting of several instructions;'' {\bf push} is an instruction in
its own right.  

Program 5.2 is not only easier to program and clearer to
read than Program 5.1, but equally importantly, it is more efficient:

\begin{itemize}
\item
It is time-efficient, since only two instructions must be fetched
and decoded, rather than three.

\item
It is space-efficient, occupying two words instead of three.
\end{itemize}

It was considerations such as these which led to CPUs becoming
more and more CISC-ish in the 1970s.  However, in the 1980s RISC
proponents began to question whether such local efficiencies
actually produced overall efficiencies, and concluded that having
so many instructions made it difficult to attain a fast clock speed.

Mac-1 also has a {\bf pop} instruction, which is the mirror image
of {\bf push}.  It copies the word at the top of the stack into AC and
then increments the SP.

DLX has no {\bf push} or {\bf pop} instruction, so it handles stacks
in a more primitive way analogous to Program 5.1, say:

\begin{verbatim}
      subi r14,r14,4   ;;  extend the stack by one word
      addi r1,r0,22    ;;  prepare the 22 for storing
      sw 0(r14),r1     ;;  write the 22 to the new top of the stack

                 Program 5.3
\end{verbatim}
   
Note that the stack is simply a section of memory, defined by the SR and
the base.  The location and size of the stack are purely a function of
our software, such as the initial value the software assigns to the SR.
Thus if we are not careful, the software may allow the stack to grow to
the point at which, for instance, it writes over the program's
instructions, with disastrous effects.\footnote{This would not be a
problem on systems with virtual memory, in which the hardware checks
for such violations.}
 
\section{Function Calls at the Machine Level}

Consider a function written in C, with its first few lines as follows:

\begin{verbatim}
   int f(x,y)
      char x,y;
   
   {  int w;


\end{verbatim}

The C compiler will of course have to translate this to machine code.
It must deal with a number of questions:

\begin{itemize}
\item What machine instruction will be used to transfer control to the
function from the point at which it is called?  It seems that we need
some kind of branch instruction; if so, which one?

\item How will the parameters x and y be communicated to the function?

\item Where will the function store the local variable w?

\item How will the function return control to the calling module when
it is done?
\end{itemize}

You will find here that the answers to these questions revolve around
stacks.  Specifically, upon entry to the function, on most machines
and with most compilers, the top four elements of the stack will look
like this:

\begin{verbatim}
SP:    w
SP+1:  address of the location to which we will return when the function is done
SP+2:  x
SP+3:  y
\end{verbatim}

In other words, a function call generally consists first of pushing the
parameters onto the stack, then pushing the return address, then
extending the stack to allocate room for the local variables.  This
portion of the stack is called a {\bf stack frame}.
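
To make the layout concrete, here is a small C sketch which builds such
a frame on a simulated word-addressed stack.  The names enter\_f() and
frame\_word(), the memory size, and the initial stack pointer are all our
own inventions for illustration; a real compiler emits machine
instructions, not C, to do this.

\begin{verbatim}
#include <stdint.h>

static uint16_t mem[0x10000];     /* assumed word-addressed memory */
static uint16_t sp = 0x5060;      /* assumed initial stack pointer */

static void push(uint16_t v) { mem[--sp] = v; }

/* Build the frame for a call f(x,y); ret_addr is a made-up address. */
void enter_f(uint16_t x, uint16_t y, uint16_t ret_addr)
{
    push(y);          /* caller: last parameter first    -> SP+3 */
    push(x);          /* caller: first parameter         -> SP+2 */
    push(ret_addr);   /* call instruction                -> SP+1 */
    sp = sp - 1;      /* callee: reserve one word for w  -> SP   */
}

/* Word at offset k from the top of the stack. */
uint16_t frame_word(int k) { return mem[sp + k]; }
\end{verbatim}

After enter\_f(7, 9, 0x0123), the call frame\_word(1) yields the return
address 0x0123, frame\_word(2) yields x, and frame\_word(3) yields y, which
matches the picture above.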

\subsection{Mac-1 Function Calls}

Like most architectures, Mac-1 has a {\bf call} instruction, with the
form 1110xxxxxxxxxxxx, with 1110 being the op code and xxxxxxxxxxxx
being the address of the called function.  The actions of this
instruction are:

\begin{verbatim}
   a.  The current PC value is pushed onto the stack.

   b.  The value xxxxxxxxxxxx is placed into the PC.
\end{verbatim}

Note that these actions occur during Step C of the fetch/decode/execute
instruction cycle.  Since in Step A the PC had already been incremented,
the value pushed in (a) above is the address of the instruction which
{\it follows} the {\bf call} instruction.  This is exactly what we want,
since we want to record the instruction at which execution should resume
after the called function finishes.  The second action, (b) above, then
causes the first instruction in the function to be fetched in the next
Step A, thus implementing a jump to the function, again just what we
want.

In the function itself, the last instruction executed will be {\bf retn}.
Its action is:

\begin{verbatim}
   Pop the stack and place the popped value into the PC.
\end{verbatim}

This occurs in Step C of the instruction, so that in the next Step A
we will fetch the instruction following the {\bf call}, once again just
what we want.
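
The interplay between Step A and Step C can be sketched in C as follows.
The struct cpu and the three functions are hypothetical names of our
own, and the 64K-word memory is an assumption; only the order of the
actions is taken from the description above.

\begin{verbatim}
#include <stdint.h>

struct cpu {
    uint16_t pc;               /* program counter */
    uint16_t sp;               /* stack pointer */
    uint16_t mem[0x10000];     /* assumed word-addressed memory */
};

/* Step A: fetch the instruction word and increment the PC. */
uint16_t fetch(struct cpu *c)
{
    return c->mem[c->pc++];
}

/* Step C of call: push the (already incremented) PC, then jump. */
void exec_call(struct cpu *c, uint16_t target)
{
    c->sp = c->sp - 1;
    c->mem[c->sp] = c->pc;
    c->pc = target;
}

/* Step C of retn: pop the stack into the PC. */
void exec_retn(struct cpu *c)
{
    c->pc = c->mem[c->sp];
    c->sp = c->sp + 1;
}
\end{verbatim}

Because fetch() has already advanced the PC by the time exec\_call()
runs, the word saved on the stack is the address of the instruction
after the call, and exec\_retn() resumes execution there.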

Program 5.4 shows an example of how the process works:

\begin{verbatim}
   1    ;  Fibonacci number generator
   2    ;
   3    top      lodd i
   4             push        ; push i, the parameter
   5             call newfib
   6             pop         ; finish cleaning up the stack
   7             addd fib    ; add 1 to i (for convenience use the 1 at fib)
   8             stod i     
   9             subd n      ; done with loop?
  10             jnzr top
  11    ;
  12    newfib   loco fib    ; c(AC) = address of fib
  13             addl 1      ; add i, resulting in fib+i in AC
  14             push        ; save fib+i
  15             pushi       ; push f_i 
  16             loco 1
  17             addl 1      ; add fib+i to 1, so fib+i+1 is in AC
  18             push        ; save fib+i+1
  19             pushi       ; push f_{i+1}
  20             lodl 0      ; load f_{i+1}
  21             addl 2      ; add f_i to it, yielding f_{i+2}
  22             push        ; save it
  23             loco 1     
  24             addl 2      ; c(AC) = fib+i+2
  25             popi        ; write f_{i+2} to memory
  26             insp 4      ; clean up stack
  27             retn
  28    i        const 0
  29    n        const 8
  30    fib      const 1
  31             const 1
  32             end
  33    ;

              Program 5.4
\end{verbatim}

This program again computes Fibonacci numbers, as in Chapter 4, and we
again see a loop (lines 3-10).  However, the bulk of the work of the
loop is now performed by a function, which I have named
``newfib''.\footnote{The name reminds us that the function is computing
the value of a new Fibonacci number, from two old ones.  Of course, the
choice of name is arbitrary; we can use any name allowed for Mac-1 
assembly language labels.}

This function happens to have one parameter, i.  The way we send it to
newfib is to push it onto the stack (lines 3-4), from which newfib will
pick it up (line 13).  Here is more detail on the latter instruction.

When we execute line 12, the stack will look like this:

\begin{verbatim}
SP:     return address (address of line 6)
SP+1:   i

            Figure 5.3
\end{verbatim}

The {\bf addl} (``add local'') instruction uses {\bf local} (also called 
{\bf based}) addressing mode.  It is available on many machines, and in
some sense is a mirror image of {\bf indexed} addressing mode: In both
cases, the instruction specifies a register r and a constant c.  In
indexed mode, the operand is a distance given by the contents of r past
a fixed place in memory, address c; in local mode, the operand is a
distance c past the memory
location pointed to by r.\footnote{Note, though, that these descriptions
only show our intended usage of these modes.  As far as the machine is
concerned, the two modes are really the same.}

For Mac-1's {\bf addl} instruction here, r is (implicitly) SP and c is
1, so the instruction adds the contents of the memory location pointed to
by SP+1 to AC.  From Figure 5.3, we see that this means we are adding i to AC, just
as the comment in line 13 says.

The {\bf pushi} instruction in line 15 uses {\bf indirect} addressing
mode.  This mode is also available on many machines.  In it, the register
r points to the memory location which will be our operand.  For the
Mac-1 instruction here, the register is required to be AC.
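
The two addressing modes can be modeled in C as below.  This is a rough
sketch under our own assumptions (a 64K-word array mem, and globals ac
and sp standing for the AC and SP registers); the function names simply
echo the Mac-1 mnemonics.

\begin{verbatim}
#include <stdint.h>

static uint16_t mem[0x10000];   /* assumed word-addressed memory */
static uint16_t ac;             /* accumulator */
static uint16_t sp = 0x0200;    /* arbitrary initial stack pointer */

/* Local (based) mode: the operand is c words past where SP points. */
void lodl(uint16_t c) { ac = mem[(uint16_t)(sp + c)]; }
void addl(uint16_t c) { ac = ac + mem[(uint16_t)(sp + c)]; }

/* Indirect mode: AC holds the address of the operand. */
void pushi(void) { sp = sp - 1; mem[sp] = mem[ac]; }
void popi(void)  { mem[ac] = mem[sp]; sp = sp + 1; }
\end{verbatim}

In the local-mode functions the constant c selects an element relative
to the top of the stack, while in the indirect-mode functions the
address comes entirely from AC and the stack merely supplies or receives
the data word.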

As we have found before, Mac-1 is short on registers, so the function
newfib stores most of its data on the stack (lines 14, 15, 18, 19 and 22).
For example, when we finish the instruction at line 22, the stack will
look like this:

\begin{verbatim}
SP:       f_{i+2}
SP+1:     f_{i+1}
SP+2:     fib+i+1
SP+3:     f_i
SP+4:     fib+i
SP+5:     return address (address of line 6)
SP+6:     i

            Figure 5.4
\end{verbatim}

So, for instance, the value of i is now the seventh element in the
stack, as opposed to its second-element position in Figure 5.3.  Keep
in mind, though, that i has not moved; the SP has.

On line 27 we see the {\bf retn} instruction.  As indicated earlier,
this will result in our resuming execution of the main program, right
where we left off, at line 6.  However, before line 27 we should 
clean up the stack, which has six elements at that point.  (It did 
have seven elements earlier, but the {\bf popi} instruction in line
25 reduced that number to six.)  Since the function newfib added four
of those elements, we must remove those four before returning to the
main program; this is done in line 26.  \footnote{Otherwise the
{\bf retn} would not work properly.  Also, if we have a lot of
function calls and do not clean up the stack after each one, the
stack will keep growing, and eventually either write over our
program (on a non-virtual memory machine) or cause a protection
error (with virtual memory).}  The {\bf retn} instruction will
then remove another element, and then in line 6 we remove the last one.
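
One way to check the stack bookkeeping in Program 5.4 is to replay it in
C, one function call per Mac-1 instruction.  The sketch below makes
several assumptions of our own:  the addresses FIB and RET are made up,
the memory is 64K words, and the {\bf call} and {\bf retn} instructions
are imitated by pushing and discarding a return address around an
ordinary C function call.

\begin{verbatim}
#include <stdint.h>

enum { FIB = 0x0100, N = 8, RET = 0x0005 };  /* made-up addresses, bound */

static uint16_t mem[0x10000], ac, sp = 0x0200;

static void loco(uint16_t k) { ac = k; }
static void push(void)       { mem[--sp] = ac; }
static void pop(void)        { ac = mem[sp++]; }
static void lodl(uint16_t c) { ac = mem[(uint16_t)(sp + c)]; }
static void addl(uint16_t c) { ac += mem[(uint16_t)(sp + c)]; }
static void pushi(void)      { uint16_t v = mem[ac]; mem[--sp] = v; }
static void popi(void)       { mem[ac] = mem[sp++]; }
static void insp(uint16_t k) { sp += k; }

/* Lines 12-26 of Program 5.4. */
static void newfib(void)
{
    loco(FIB); addl(1); push(); pushi();   /* fib+i,   f_i     */
    loco(1);   addl(1); push(); pushi();   /* fib+i+1, f_{i+1} */
    lodl(0);   addl(2); push();            /* f_{i+2}          */
    loco(1);   addl(2); popi();            /* store f_{i+2}    */
    insp(4);                               /* clean up         */
}

/* Lines 3-10: the calling loop.  Returns the final SP for checking. */
uint16_t run_fib(void)
{
    uint16_t i = 0;
    mem[FIB] = 1; mem[FIB + 1] = 1;
    do {
        ac = i; push();          /* lines 3-4: push the parameter  */
        mem[--sp] = RET;         /* call: push the return address  */
        newfib();
        sp++;                    /* retn: pop the return address   */
        pop();                   /* line 6: remove i from stack    */
        ac += mem[FIB];          /* line 7: add the 1 at fib       */
        i = ac;                  /* line 8                         */
    } while (i != N);            /* lines 9-10                     */
    return sp;
}
\end{verbatim}

If the cleanup is correct, run\_fib() should return the same stack
pointer value it started with, since every word pushed during a call is
removed again before the next iteration.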

\subsection{DLX Function Calls}

Here is a version of the DLX Fibonacci number program from Chapter 4
which now calls a function, newfib.

\begin{verbatim}
   1    
   2              add r1,r0,r0    ;;  r1 will hold "i", initially 0
   3    top:      subi r14,r14,4  ;;  prepare to push "i"
   4              sw 0(r14),r1    ;;  push
   5              jal newfib
   6              nop
   7              addi r14,r14,4  ;;  remove "i" from stack
   8              addi r1,r1,4    ;;  next "i"
   9              slti r7,r1,32   ;;  done with loop?
  10              bnez r7,top
  11              nop
  12    
  13    newfib:   lw r2,0(r14)    ;;  pick up "i" from stack
  14              nop
  15              lw r3,fib(r2)   ;;  r3 will hold fib[i]
  16              addi r2,r2,4    ;;  "i+1"
  17              lw r4,fib(r2)   ;;  r4 will hold fib[i+1]
  18              nop
  19              add r5,r3,r4    ;;  r5 will hold fib[i+2]
  20              addi r2,r2,4    ;;  "i+2"
  21              sw fib(r2),r5
  22              jr r31
  23              nop
  24    
  25    fib:      
  26    .word 1,1
  27    

                      Program 5.5
\end{verbatim}

As is customary for DLX, we are using r14 as the stack pointer.  DLX has
no {\bf push} instruction, but we synthesize one by using {\bf subi} and
{\bf sw}, lines 3-4 (as we did in Program 5.1, though we did not have to
do so in that case).

In line 5, we see the call, which is handled in DLX by a {\bf jal}
(``jump and link'') instruction.  This instruction jumps to the
specified function, again named newfib, just as Mac-1's {\bf call}
instruction did.  A major difference, though, is that {\bf jal} does
not push the return address on the stack.  Instead, it puts the return
address in r31.  

Similarly, in line 22 we see DLX's {\bf jr} (``jump to the place pointed
to by the register'') instruction, which will play a role similar to
Mac-1's {\bf retn}.  Again, though, there is no stack action here;
the instruction

\begin{verbatim}
    jr r31
\end{verbatim}

simply copies r31 to the PC.  Since r31 contains the return address,
the effect is that we do indeed return.

Note that both {\bf jal} and {\bf jr} are branches, so we have placed
{\bf nop}'s in lines 6 and 23.

We have seen before that DLX's general-purpose registers can be used
for indexed addressing, and in fact that is done here, for example in
lines 15, 17 and 21.  But we can also use them for local addressing
mode, which we do in line 13:  since we are using r14 as our stack
register, the expression 0(r14) means to go 0 bytes past the top of
the stack, that is, to take the first element of the stack.  The expression
4(r14) would access the second element of the stack, and so on.

The example here does not illustrate a problem which must commonly be
dealt with in writing functions.  In writing a function, we must be
careful not to disturb register values being used in the calling
program.  We have done this in the example here:  the main program uses
registers r1 and r7 (r14, the stack pointer, is used in a coordinated
way in both the main program and the function), and the function avoids
changing the contents of these registers.  

However, in general, the calling program and a function are in different
source files, and may even be written by different people.  Thus the
standard practice is that upon entry to a function, the values of all
registers used by the function are saved on the stack, and those values
are restored upon exit from the function.

Here is how the above code could be changed in this manner.  The
function newfib uses registers r2, r3, r4 and r5, so these are the
ones we must save.  We could replace line 13 above and add some new
lines, so that the first few lines of newfib would be:

\begin{verbatim}
      newfib:   sw -4(r14),r2      ;;  save old value of r2
                sw -8(r14),r3      ;;  save old value of r3
                sw -0xc(r14),r4    ;;  save old value of r4
                sw -0x10(r14),r5   ;;  save old value of r5
                subi r14,r14,0x10  ;;  adjust stack pointer
                lw r2,0x10(r14)    ;;  pick up "i" from stack
                nop
\end{verbatim}

Then, just before we exit the function, the following lines could
be inserted between lines 21 and 22 above:

\begin{verbatim}
                lw r5,0(r14)       ;;  restore old r5
                lw r4,4(r14)       ;;  restore old r4
                lw r3,8(r14)       ;;  restore old r3
                lw r2,0xc(r14)     ;;  restore old r2
                addi r14,r14,0x10  ;;  restore stack pointer
\end{verbatim}
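
The save-and-restore pattern above can be checked with a small C model.
Everything here is a sketch under our own assumptions:  the array r
stands for the DLX registers, mem is a 4 KB memory with one array slot
per word, and callee() is a made-up function body whose only purpose is
to clobber r2 through r5.

\begin{verbatim}
#include <stdint.h>

static uint32_t r[32];             /* DLX general-purpose registers */
static uint32_t mem[1024];         /* assumed 4 KB, one slot per word */

static void     sw(uint32_t addr, uint32_t v) { mem[addr / 4] = v; }
static uint32_t lw(uint32_t addr)             { return mem[addr / 4]; }

/* Uses r2-r5 internally but leaves the caller's copies intact. */
void callee(void)
{
    /* entry: save r2..r5 below the stack top, then move the top */
    for (int k = 2; k <= 5; k++)
        sw(r[14] - 4 * (k - 1), r[k]);
    r[14] -= 0x10;

    r[2] = lw(r[14] + 0x10);       /* the parameter is now 0x10 bytes up */
    r[3] = r[2] + 1;               /* arbitrary work clobbering r3..r5 */
    r[4] = r[3] + 1;
    r[5] = r[4] + 1;

    /* exit: restore r5..r2, then restore the stack pointer */
    for (int k = 5; k >= 2; k--)
        r[k] = lw(r[14] + 4 * (5 - k));
    r[14] += 0x10;
}
\end{verbatim}

Note that once r14 has been lowered by 0x10, the parameter which was at
0(r14) on entry is found at 0x10(r14), which is why the load offset
changes.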

Such problems must be addressed on almost any machine.  Furthermore, on
machines like DLX in which a function call's return address is saved in
a register (r31 in the case of DLX) rather than on the stack, we must
take extra care in saving the value in that register.

This is not an issue in the example above, since newfib is a {\bf leaf
function}, meaning that newfib itself does not call any 
functions.\footnote{The name derives from a tree metaphor, in which
the main program calls some functions, each of which in turn calls
further functions, and so on.}  Suppose newfib were to call another
function, say f, using an instruction

\begin{verbatim}
   jal f
\end{verbatim}

This would save in r31 the address in newfib to which control is to be
returned when f is done---but that will destroy the current content of
r31, which is the saved address in the main program to which control is
to be returned when newfib is done.

In other words, prior to the call to f, we would have to save the
current content of r31, say on the stack, and restore it after f is
done.
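
The following C sketch illustrates the danger and the remedy.  It is
only a model:  the variable r31 stands for the link register, the small
array stack and its index top are our own stand-ins for the run-time
stack, and the two newfib variants are hypothetical.  Each returns the
address that its final {\bf jr} would jump to.

\begin{verbatim}
#include <stdint.h>

static uint32_t r31;                 /* link register */
static uint32_t stack[16];           /* small word stack (assumed size) */
static int      top = 16;            /* index of the top; grows downward */

/* Leaf function f: does nothing and returns via r31. */
static void f(void) { }

/* newfib calling f without saving r31: the return address is lost. */
uint32_t newfib_unsafe(uint32_t ret_in_newfib)
{
    r31 = ret_in_newfib;             /* effect of "jal f" */
    f();
    return r31;                      /* "jr r31" goes to the wrong place */
}

/* newfib saving r31 on the stack around the call. */
uint32_t newfib_safe(uint32_t ret_in_newfib)
{
    stack[--top] = r31;              /* save the caller's return address */
    r31 = ret_in_newfib;             /* effect of "jal f" */
    f();
    r31 = stack[top++];              /* restore it */
    return r31;                      /* "jr r31" returns to the caller */
}
\end{verbatim}

In the unsafe version the value left in r31 is an address inside newfib
itself, so the final jump never gets back to the main program.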

\section{Writing Multiple Assembly-Language Source Files}

Consider 
Program 5.5 in the last section.  Suppose we had decided to
place source code for the main program and the function in separate
files:

\begin{verbatim}
     1  
     2  .global fib
     3  
     4            add r1,r0,r0    ;;  r1 will hold "i", initially 0
     5  top:      subi r14,r14,4  ;;  prepare to push "i"
     6            sw 0(r14),r1    ;;  push
     7            jal newfib
     8            nop
     9            addi r14,r14,4  ;;  remove "i" from stack
    10            addi r1,r1,4    ;;  next "i"
    11            slti r7,r1,32   ;;  done with loop?
    12            bnez r7,top
    13            nop
    14  
    15  
    16  fib:      
    17  .word 1,1
    18  .space 32
    19  

                       fmain.s
\end{verbatim}

\begin{verbatim}
     1  
     2  .global newfib
     3  
     4  newfib:   lw r2,0(r14)    ;;  pick up "i" from stack
     5            nop
     6            lw r3,fib(r2)   ;;  r3 will hold fib[i]
     7            addi r2,r2,4    ;;  "i+1"
     8            lw r4,fib(r2)   ;;  r4 will hold fib[i+1]
     9            nop
    10            add r5,r3,r4    ;;  r5 will hold fib[i+2]
    11            addi r2,r2,4    ;;  "i+2"
    12            sw fib(r2),r5
    13            jr r31
    14            nop
    15  
    16
                           fsub.s       
\end{verbatim}

This is easily done, except that we need to warn the assembler that
other files will be involved in each case.  For example, think of what
happens when the assembler produces fmain.o from fmain.s.  As you know,
normally the assembler will discard the symbol table after finishing
the assembly process.  But in line 2 of fmain.s,

\begin{verbatim}
   .global fib
\end{verbatim}

we are telling the assembler to retain the symbol table line for the
label ``fib'':  

\begin{verbatim}
fib       0x28
\end{verbatim}

Here we are considering the instruction in line 4 to be at address 0x0,
and so on; counting down from there, we see that fib is at 0x28.  The
assembler will include this line of the symbol table (but not the rest
of the table) in the file fmain.o.

Note by the way that when the assembler gets to line 7 in fmain.s,
it will not be able to completely assemble the instruction.  From
Appendix B we see that the machine code is to be of the form

\begin{verbatim}
000011 lc
\end{verbatim}

where lc is a 26-bit constant.  In line 7, that constant is the
address of newfib---which we do not know, since newfib is in another
file, fsub.s!  So, the assembler will have to leave another note in
fmain.o, saying that there is ``unfinished business'' in the instruction
in line 7.

Similarly, when the assembler assembles fsub.s, the assembler will
include the line

\begin{verbatim}
newfib    0x0
\end{verbatim}

in fsub.o.  Also, the assembler will leave notes in fsub.o saying
that there is ``unfinished business'' in lines 6, 8 and 12, since
we do not yet know what the address of fib is.

When the linker combines fmain.o and fsub.o to produce an executable
file a.out, it will use this information.  Suppose a.out is designed
to be loaded starting at location 0x100, and that the main program
will be loaded first, followed by newfib.  Here is how the linker will
resolve the ``unfinished business'':

For example, the linker will notice the note from the assembler in
fmain.o which says that the assembler left ``unfinished business'' 
when it assembled line 7 of fmain.s, since the address of newfib
was not known at that time.  The linker will now determine that
address, as follows.  It sees that there were 10 instructions in
fmain.s, taking up 40 bytes, and that the array fib occupies

\begin{verbatim}
2*4 + 32 = 40
\end{verbatim}

bytes too.  So, all of the main program occupies 80 = 0x50 bytes,
thus bytes 0x100 through

\begin{verbatim}
0x100 + 0x50 - 1 = 0x14f
\end{verbatim}

when loaded into memory.  The subfunction will then immediately
follow, at 0x150.  Moreover, the linker notices the symbol table line
in fsub.o for the newfib label shows location 0x0.  This means that
newfib is the very first word within fsub.o, so it will be loaded
at 0x150 when we run the program.

In other words, the linker now knows that the constant lc in the
machine code for line 7 of fmain.s,

\begin{verbatim}
000011 lc
\end{verbatim}

is 0x150 = 0b00000000000000000101010000.

So the instruction as a whole is 0b00001100000000000000000101010000 =
0x0c000150.

The linker deals with the previously-unresolved references to fib in 
fsub.s in the same way.
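
The linker's arithmetic in this section can be reproduced with a few
lines of C.  This is a sketch of the calculation only, not of a real
linker; the constant and function names are our own, and the instruction
format is the one quoted above (a 6-bit op code 000011 followed by the
26-bit constant lc).

\begin{verbatim}
#include <stdint.h>

enum {
    LOAD_BASE   = 0x100,       /* where a.out is loaded */
    INSTR_BYTES = 4,           /* each DLX instruction is one word */
    MAIN_INSTRS = 10,          /* lines 4-13 of fmain.s */
    FIB_BYTES   = 2 * 4 + 32,  /* .word 1,1 followed by .space 32 */
    JAL_OPCODE  = 0x03         /* 000011 */
};

/* Offset of fib within fmain.o, as kept in the symbol table. */
uint32_t fib_offset(void) { return MAIN_INSTRS * INSTR_BYTES; }

/* Run-time address of newfib: fsub.o follows all of fmain.o. */
uint32_t newfib_address(void)
{
    uint32_t main_bytes = MAIN_INSTRS * INSTR_BYTES + FIB_BYTES;
    return LOAD_BASE + main_bytes + 0x0;  /* 0x0: offset within fsub.o */
}

/* Fill in the 26-bit constant lc of the jal instruction. */
uint32_t encode_jal(uint32_t lc)
{
    return ((uint32_t)JAL_OPCODE << 26) | (lc & 0x03FFFFFF);
}
\end{verbatim}

Here fib\_offset() gives 0x28, newfib\_address() gives 0x150, and
encode\_jal() applied to that address gives 0x0c000150, in agreement with
the values worked out by hand.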

\section{A Look at How Compilers Handle Function Calls}

\section{Analytical Exercises}

{\bf 1.}  Suppose we look at a Mac-1 assembly language file w.asm that
someone else has written, and have no idea what the program is supposed
to do.  However, we do notice that a certain section of code is
inefficient, in which there are four consecutive lines of {\bf pop}
instructions.  Give more efficient code which we could {\it safely}
substitute for those four lines.  State whether your code is more
efficient in speed, number of bytes of memory used, or both.

{\bf 2.}  Consider a C language file g.c.  It consists of three functions,
main(), a() and b().  There are several different lines in the source
file g.c at which b() is called.  The first few lines of b() are as follows:

\begin{verbatim}
     int b()

     {  int x;

        printf("%d\n",&x);
\end{verbatim}

During the execution of the program from which g.c is compiled, this
printf() line will be executed several times.  State {\it exact, specific}
conditions under which the values printed out will not all be identical.
You should be able to do this problem without actually running the
program on the computer.

{\bf 3.}  Write a few lines of Mac-1 assembly code which will do the
equivalent of {\bf pop}, without using that instruction.

Assume that in the procedure version, it would be called as NEAR.

{\bf 4.}  Consider Program 5.5.  Suppose the program is loaded at
location 0x100 as usual.  At the time the instruction in line 13
is executed, what will be the contents of r31?  Give your answer in
hex.

{\bf 5.}  Suppose before we run Program 5.4 we initialize SP to 0x200,
and that the program is loaded at location 0 as usual.  At the end of
Step C of the {\bf call} instruction in line 5, what will be the
contents of MAR, MDR, IR and the PC?  Give your answers in hex.

{\bf 6.}  Fill in the blank:  DLX's function call instruction {\bf jal},
by saving the return address in a register instead of the stack, saves
memory accesses as long as the function is a \_\_\_\_\_\_ function.

{\bf 7.}  In Program 5.4, suppose we forget to include the {\bf insp 4}
instruction.  The first time the function newfib is executed, will it
return to the proper instruction in the calling module?  If so, state
why there is no problem of this kind; if not, state which incorrect
instruction it will return to.
  
\section{Programming Projects}

{\bf 1.}  Write a DLX assembly-language function which will {\it
concatenate} two character strings, producing a third one, similar to
the operation of the C string library function strcat().  The function
will have three parameters, which are the addresses of the first bytes
of each of the three strings.   All strings are terminated with null
characters, i.e. a byte whose bits are all 0s.  Write a program to test
your function.

\end{document} 


