Norm Matloff's ESim Discrete-Event Simulation Package

ESim is an event-oriented discrete-event simulation package. It follows the classical approach, but is designed so that the source code is easily understood, which helps students follow it and add new features.

Though process-oriented simulation is very popular (see C++Sim and psim), the event-oriented approach has a number of advantages:

Note: No guarantees of any kind are made regarding the accuracy of this software or its documentation.

Where to get it:

Go to Professor Matloff's simulation Web page.

How to install it:

Next, make a directory for ESim, say /usr/local/esim, creating subdirectories lib and include (if you choose some other directory for ESim, change everything below accordingly), and do the following from whatever directory you've unpacked the ESim source in.

g++ -g -c ESimCode.C
ar r libesim.a ESimCode.o
mv libesim.a /usr/local/esim/lib
cp ESimDefs.h /usr/local/esim/include

How to prepare and compile ESim applications:

Be sure your application source file has a line

#include <ESimDefs.h> 

When accessing command-line arguments from within your code (you should name the arguments to main() Argc and Argv), remember that the first two arguments are the simulation time limit and the debug flag, followed by the application-specific arguments.

In main(), first make a call

ESSim::ESInit(Argv);

and then do your application-specific initializations, including setting up the first event(s). Then make the call

ESSim::ESMainLoop(Argv);

followed by your simulation output code.

To set up compiling, do the following:

set ESimDir = /usr/local/esim

and then whenever you compile, say z.C, type

g++ -g -I$ESimDir/include z.C -L$ESimDir/lib -lesim -lm

Note that the order of the command-line options here may be important, so stick to the order shown here.

Of course, various shortcuts to this can be set up in Makefiles, shell aliases and the like.

Command-line format for ESim applications:

app_name max_simtime debug_flag application_args

where "debug_flag" is 1 for debugging, 0 otherwise.
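
As an illustration of this argument layout, here is a minimal sketch of how an application might read its own arguments after the two common ones. It is plain standard C++ and does not use ESim; the names CommonArgs, ParseCommonArgs and AppArg are hypothetical helpers invented for this example, not part of the package.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical holder for the two leading arguments; not an ESim type.
struct CommonArgs {
   double MaxSimTime;  // Argv[1]: simulation time limit
   int DebugFlag;      // Argv[2]: 1 for debugging, 0 otherwise
};

// Parse the two leading arguments shared by all applications.
CommonArgs ParseCommonArgs(int Argc, char **Argv)
{  assert(Argc >= 3);  // program name + time limit + debug flag
   CommonArgs A;
   A.MaxSimTime = std::atof(Argv[1]);
   A.DebugFlag = std::atoi(Argv[2]);
   return A;
}

// Fetch the k-th application-specific argument (k = 0, 1, ...) as a number.
// Application arguments start at Argv[3], after the two common ones.
double AppArg(int Argc, char **Argv, int k)
{  assert(k >= 0 && 3 + k < Argc);
   return std::atof(Argv[3 + k]);
}
```

For a command line such as "mm1 1000.0 0 1.0 0.5", AppArg(Argc,Argv,0) would give 1.0 and AppArg(Argc,Argv,1) would give 0.5 under this layout.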

ESim application tutorial: M/M/1 queue:

The model being simulated:

The program simulates an M/M/1 queue. There is a single server, to which jobs arrive at random times, with the interarrival time having an exponential distribution. The service time is also random, with an exponential distribution. If the server is busy when a job arrives, the job joins a queue. (In this simulation, we are assuming that the server is some kind of machine, so we refer to it as the machine.)

We are interested in determining the long-run average wait time per job. There is an exact mathematical formula for this quantity, but here we will find the value via simulation instead, to illustrate ESim.

Running this example:

We compile the program and name the executable file mm1, and run it:

mm1 1000.0 0 1.0 0.5

Here we specify, in command-line order, a simulation time limit of 1000.0, no debugging (0), a mean job interarrival time of 1.0, and a mean service time of 0.5. The output will be something like

mean wait = 0.968920

In other words, up to simulated time 1000.0, the program simulated the arrival of a certain number of jobs, and their mean wait in the system (time in the queue plus service time) was 0.97.
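
As a sanity check on this figure, the standard steady-state result for an M/M/1 queue gives a mean time in system of 1/(mu - lambda), where lambda is the arrival rate and mu the service rate. The program measures completion time minus arrival time, so this is the comparable quantity. The sketch below simply evaluates that formula; the function name MM1MeanTimeInSystem is made up for this example and is not part of ESim.

```cpp
#include <cassert>

// Steady-state mean time in system (queueing plus service) for an M/M/1
// queue: W = 1 / (mu - lambda), with lambda = 1/MeanArrive, mu = 1/MeanServe.
double MM1MeanTimeInSystem(double MeanArrive, double MeanServe)
{  assert(MeanArrive > 0.0 && MeanServe > 0.0);
   assert(MeanServe < MeanArrive);  // stability: utilization below 1
   double Lambda = 1.0 / MeanArrive;  // arrival rate
   double Mu = 1.0 / MeanServe;       // service rate
   return 1.0 / (Mu - Lambda);
}
```

With a mean interarrival time of 1.0 and a mean service time of 0.5, this gives 1.0, which is close to the simulated 0.97; the gap is what one would expect from a finite run.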

We'll see below how this came about, by examining the source file MM1.C. It is assumed that you know C++, but you need not have any prior background with simulation.

Analysis of the code in this example:

The center of the operation of ESim is the function ESMainLoop(), which manages the event list. (All function and variable names beginning with "ES" are entities provided by the ESim package. All the rest are application-specific.)

The event list is a linear linked list of pending events, ordered by time, with the earliest event at the head of the list. ESMainLoop()'s action is to repeatedly loop around, handling one event per loop. At each iteration of the loop, the function does the following:

To make the ideas more concrete, let's first set forth an example of a typical instance of the operation of the M/M/1 system.

We start at time 0.0, and main() sets up the first arrival:

   Tmp = ESSim::ESExpon(MeanArrive);
   ArrivalElt *TmpEltPtr = new ArrivalElt();
   TmpEltPtr->ESEvntTime = Tmp;
   TmpEltPtr->ArrivalTime = Tmp;
   strcpy(TmpEltPtr->ESName,"will arrive");
   TmpEltPtr->ESInsertInSchedList();

We call the exponential random number generator and store the generated number in Tmp; say this value is 1.2. This will mean that our first arrival will be at time 1.2. We set up an instance of the ArrivalElt class, set the event type to be "will arrive", and set the event time to 1.2. Finally, we insert this event into the event list. The event list will now contain this one pending event.
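
The internals of ESExpon() are not shown on this page. For readers who have not seen how exponential random numbers are generated, here is a sketch of one common technique, the inverse-transform method, written in standard C++. The names ExponFromUniform and Expon are hypothetical, and this is not claimed to be how ESim does it.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Inverse-transform method: map a uniform value U in [0,1) to an
// exponentially distributed value with the given mean.
double ExponFromUniform(double Mean, double U)
{  assert(Mean > 0.0 && U >= 0.0 && U < 1.0);
   return -Mean * std::log(1.0 - U);
}

// Draw one exponential variate using a standard-library generator
// as the source of uniform random numbers.
double Expon(double Mean, std::mt19937 &Gen)
{  std::uniform_real_distribution<double> Unif(0.0, 1.0);  // values in [0,1)
   return ExponFromUniform(Mean, Unif(Gen));
}
```

In new code one could also use std::exponential_distribution directly; the explicit version above is only meant to show where the numbers come from.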

In its next loop iteration, ESMainLoop() will take this event off the event list. It will notice that the time of the event is 1.2, so it advances the simulated time, ESSimTime, to 1.2. It then calls the user-written ESEvntHandler() to process this event. Since the event type is "will arrive", the latter will call the user-written DoArrival(), which does the following:

   strcpy(ESElt::ESCurrEvnt->ESName,"will be done with service");
   // try to serve, else add to machine queue
   if (Machine.ESNBusy == 0)  {
      Tmp = ESSim::ESExpon(MeanServe);
      Machine.ESStartServe(ESElt::ESCurrEvnt,Tmp);
   }
   else Machine.ESAppendToFclQ();
   // schedule the next arrival
   Tmp = ESSim::ESExpon(MeanArrive);
   Tmp += ESSim::ESSimTime;
   ArrivalElt *TmpEltPtr = new ArrivalElt();
   TmpEltPtr->ESEvntTime = Tmp;
   TmpEltPtr->ArrivalTime = Tmp;
   TmpEltPtr->ESInsertInSchedList();

The next event for this newly arrived job will be service by the machine, so we change its event type to "will be done with service". We check to see if the machine is busy; it isn't, so we start service for this job. If you look at the code for ESStartServe(), you will see that this causes this job to be added to the event list. We then schedule the next arrival. Make sure you understand that there will now be two items in the event list, a pending service and a pending arrival.

Later, when a service is done, ESEvntHandler() will call DoDone(), which has the following code:

void DoDone()

{  // process the job which just finished
   TotJobs++;
   TotWait +=
      ESSim::ESSimTime - ( (ArrivalElt *) ESElt::ESCurrEvnt)->ArrivalTime;
   delete ESElt::ESCurrEvnt;
   // bookkeeping, plus a check of machine's queue
   float Tmp = ESSim::ESExpon(MeanServe);
   Machine.ESDoneServe(Tmp);
   return;
}

The first two lines do some accounting, updating the totals needed for us to compute mean waiting time at the end of the simulation. Then we delete the element, as we are done with it.

Now that the machine has finished serving one job, it must check to see if any other jobs are waiting in the queue, and if so, start the next one. This is done by ESDoneServe():

int ESFacil::ESDoneServe(float SrvTm)

{  ESElt *OldFclQHd;

   ESNBusy--;
   if (ESNQ == 0) return 0;

   OldFclQHd = ESFclQHd;
   if (ESFclQHd == ESFclQTl) ESFclQHd = ESFclQTl = 0;
   else ESFclQHd = ESFclQHd->ESNext;
   ESNQ--;

   ESStartServe(OldFclQHd,SrvTm);
   return 1;
}

ESNBusy, the number of busy servers (we only have one in this program), is decremented. If the number in the queue is 0, then we are done; if not, we delete the head of the queue, and call ESStartServe() for that job.

Execution of the program will continue in this manner. Each time one event executes, its corresponding event handler then sets up the next event(s).
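
To see the whole cycle in one place, here is a self-contained sketch of an event-oriented M/M/1 simulation in standard C++. It does not use ESim: the event list is a std::priority_queue rather than ESim's linked list, the random numbers come from the standard library, and all names (Event, LaterFirst, SimulateMM1) are invented for this illustration. The structure mirrors the description above: take the earliest event, advance the clock, and let the handler schedule the next event(s).

```cpp
#include <cassert>
#include <queue>
#include <random>
#include <vector>

enum EventType { Arrival, ServiceDone };

struct Event {
   double Time;         // when the event occurs
   EventType Type;
   double ArrivalTime;  // arrival time of the job this event refers to
};

// Comparator that puts the earliest event on top of the priority queue.
struct LaterFirst {
   bool operator()(const Event &A, const Event &B) const
   {  return A.Time > B.Time;  }
};

// Returns the mean time in system of the jobs completed by MaxSimTime.
double SimulateMM1(double MaxSimTime, double MeanArrive, double MeanServe,
                   unsigned Seed)
{  assert(MaxSimTime > 0.0 && MeanArrive > 0.0 && MeanServe > 0.0);
   std::mt19937 Gen(Seed);
   std::exponential_distribution<double> NextArrive(1.0 / MeanArrive);
   std::exponential_distribution<double> NextServe(1.0 / MeanServe);
   std::priority_queue<Event, std::vector<Event>, LaterFirst> EventList;
   std::queue<double> Waiting;  // arrival times of jobs in the machine queue
   bool Busy = false;
   long TotJobs = 0;
   double TotWait = 0.0;

   // set up the first arrival
   double T = NextArrive(Gen);
   EventList.push(Event{T, Arrival, T});

   while (!EventList.empty())  {
      Event E = EventList.top();
      EventList.pop();
      if (E.Time > MaxSimTime) break;
      double Now = E.Time;  // advance simulated time to this event
      if (E.Type == Arrival)  {
         // try to serve, else join the machine queue
         if (!Busy)  {
            Busy = true;
            EventList.push(Event{Now + NextServe(Gen), ServiceDone, E.ArrivalTime});
         }
         else Waiting.push(E.ArrivalTime);
         // schedule the next arrival
         double Next = Now + NextArrive(Gen);
         EventList.push(Event{Next, Arrival, Next});
      }
      else  {
         // bookkeeping for the job which just finished
         TotJobs++;
         TotWait += Now - E.ArrivalTime;
         // start the next queued job, if any
         if (Waiting.empty()) Busy = false;
         else  {
            double A = Waiting.front();
            Waiting.pop();
            EventList.push(Event{Now + NextServe(Gen), ServiceDone, A});
         }
      }
   }
   return TotJobs > 0 ? TotWait / TotJobs : 0.0;
}
```

A call such as SimulateMM1(1000.0, 1.0, 0.5, 1) corresponds to the run shown earlier; the exact result depends on the seed and on the random number generator, so it will not match ESim's output digit for digit.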

Note that we subclassed the basic event element, ESElt:

class ArrivalElt: public ESElt  {

   public:

      float ArrivalTime;  // time this job arrived

      ArrivalElt();
      void ESPrintElt()
         {  ESElt::ESPrintElt();
            printf("  arrival was at %f\n",ArrivalTime);
         }

};

We did this in order to add the field ArrivalTime, which enables us to later compute the total time this job waited in the system, which in turn enables us to find the mean wait per job at the end of the simulation.

Note that we also overrode the ESim function ESPrintElt(), which is called if the debug flag is set. Our new version of ESPrintElt() calls the old one to print the common information, and then also prints out ArrivalTime.

To solidify your understanding of the role of the event list, you should rerun the simulation, in this case setting the debug flag to 1 (pipe the output through the more command).

ESim application tutorial: the Stop-and-Wait protocol:

The model being simulated:

Stop-and-Wait is a standard protocol used in computer networks. In the system being modeled here, network node A is sending a message (a frame) to node B. We have scaled time so that it takes 1.0 unit of time for the message to get onto the communications link, and Alpha amount of time to propagate across the link. Node B sends back a short reply, taking negligible time to get onto the link and Alpha time to reach node A. The message may have been corrupted when traveling from A to B, with probability P; if so, B will say so in the reply. Also, B may be busy and have some random delay before replying to A; if the delay is too long, A will time out, assume the message has been lost, and send it again.

We are interested in the mean time it takes for a frame to be successfully sent, and also the mean number of tries it takes for success.

Running this example:

saw time_limit debug_flag Alpha timeout_time P mean-delay

Analysis of the code in this example:

We've modularized a large part of the code into one function, which we have named SendFrame(). The function has a parameter which indicates whether this is a new frame, i.e. our first attempt at sending this frame, or a retry. Here's the code:

void SendFrame(int New)

{  strcpy(Frame.ESName,"arrive at B");
   Frame.ESEvntTime = ESSim::ESSimTime + 1 + Alpha;
   if (New)  {
      Frame.FrameNumber = NextFrameToBeCreated++;
      Frame.StartTime = Frame.ESEvntTime;
      Frame.NTries = 1;
   }
   else Frame.NTries++;
   Frame.Trashed = 0;
   Frame.ESInsertInSchedList();
   // set up the paired timeout
   TmOt.ESEvntTime = ESSim::ESSimTime + Timeout;
   strcpy(TmOt.ESName,"timeout");
   TmOt.ESInsertInSchedList();
   // pair them together to enable one to cancel the other
   Frame.TimeoutEltNumber = TmOt.ESEltNumber;
   TmOt.FrameEltNumber = Frame.ESEltNumber;
}

Since we are sending to B, we've named the event type "arrive at B". That event will occur at the present time (ESSimTime) + 1 + Alpha.

If this is a new frame, give it the next available frame number. Record the present time as this frame's start time, i.e. the time at which we first start trying to send it. Also, set Frame.NTries to 1, as this is our first try.

On the other hand, if this is not a new frame, increment Frame.NTries.

Next, reset Frame.Trashed, which records whether the frame is corrupted. Then add this element to the event list.

Finally, set up a timeout element, and add it to the event list too.

Now, let's look at some of the event handler code.

When a new frame is created, or we do a retry on an old frame, the next event will be "arrive at B", upon which DoArrivalAtB() will be called. As you can see from the code in that function, the random delay is generated (modeling that node B is too busy to process the incoming frame immediately), the event type is changed to "done with delay", and that event is added to the event list.

When that event occurs, DoDelayDone() is called:

void DoDelayDone()

{  Frame.Trashed = (ESSim::ESRnd() < P);
   strcpy(ESElt::ESCurrEvnt->ESName,"arrive back at A");
   Frame.ESEvntTime = ESSim::ESSimTime + Alpha;
   Frame.ESInsertInSchedList();
}

Node B first checks to see whether the frame arrived intact. Remember, there is a probability P that it is corrupted, a situation we simulate here by calling ESRnd(), which generates random numbers uniformly distributed on the interval (0,1). Then we set up to send the reply back to A, and add that new event to the event list.
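
The comparison of a uniform random number with P is a standard way to simulate an event of probability P. The sketch below shows the idea in standard C++, without ESim; IsTrashed and TriesUntilIntact are hypothetical names. The second function counts transmissions until a frame gets through intact, under the simplifying assumption that timeouts never occur, in which case the mean number of tries is 1/(1-P).

```cpp
#include <cassert>
#include <random>

// One trial: returns true with probability P, by comparing a uniform
// random number on [0,1) with P.
bool IsTrashed(double P, std::mt19937 &Gen)
{  assert(P >= 0.0 && P < 1.0);
   std::uniform_real_distribution<double> Unif(0.0, 1.0);
   return Unif(Gen) < P;
}

// Number of transmissions until one arrives intact.
// Simplifying assumption for this sketch: timeouts are ignored.
int TriesUntilIntact(double P, std::mt19937 &Gen)
{  int NTries = 1;
   while (IsTrashed(P, Gen)) NTries++;
   return NTries;
}
```

In the full model the timeout also causes retries, so the mean number of tries reported by the simulation will generally be larger than 1/(1-P).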

Recall that we set a timeout. If the reply from B does not get back to A within the timeout period, A will give up the frame for lost, and retransmit it. This is modeled by DoTimeout():

void DoTimeout()

{  ESElt::ESCancel(TmOt.FrameEltNumber);
   SendFrame(0);
}

Note that DoTimeout() must remove the other event, "arrive back at A", from the event list. The reason for this is that if DoTimeout() was called, it was because the timeout occurred before B's reply got back to A. So the "arrive back at A" event is canceled. We accomplish that by calling ESCancel(). Then DoTimeout() calls SendFrame() again, so as to start a new attempt to send the frame successfully to B.

On the other hand, if the reply from B does reach A in time, DoArrivalBackToA() is called:

void DoArrivalBackToA()

{  ESElt::ESCancel(Frame.TimeoutEltNumber);
   if (!Frame.Trashed)  {
      TotFrames++;
      TotWait += ESSim::ESSimTime - Frame.StartTime;
      TotTries += Frame.NTries;
      SendFrame(1);
   }
   else SendFrame(0);

}

In this case, it is the timeout event which must be canceled. The function then checks for a report that the frame had been corrupted. If it was received intact, we now do our bookkeeping, and send out a new frame. If not, we try sending this frame again.
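
The pairing of the frame event with its timeout, where whichever occurs first cancels the other, can be sketched independently of ESim as follows. This is an illustration in standard C++ only: the Ev and SchedList types and the RunOneAttempt function are invented here, and ESim's own ESCancel() may well be implemented differently.

```cpp
#include <cassert>
#include <list>
#include <string>

struct Ev {
   int Id;            // unique element number
   double Time;       // when the event occurs
   std::string Name;  // event type
   int PartnerId;     // Id of the paired event, or -1 if none
};

class SchedList {
   std::list<Ev> Events;  // kept sorted by Time, earliest first
   int NextId = 0;
public:
   // Insert in time order; returns the new event's Id.
   int Insert(double Time, const std::string &Name)
   {  std::list<Ev>::iterator It = Events.begin();
      while (It != Events.end() && It->Time <= Time) ++It;
      Events.insert(It, Ev{NextId, Time, Name, -1});
      return NextId++;
   }
   // Pair two pending events so that each knows the other's Id.
   void Pair(int A, int B)
   {  for (Ev &E : Events)  {
         if (E.Id == A) E.PartnerId = B;
         if (E.Id == B) E.PartnerId = A;
      }
   }
   // Remove the pending event with the given Id; true if it was found.
   bool Cancel(int Id)
   {  for (std::list<Ev>::iterator It = Events.begin(); It != Events.end(); ++It)
         if (It->Id == Id)  {  Events.erase(It); return true;  }
      return false;
   }
   // Take the earliest event off the list.
   Ev PopNext()
   {  assert(!Events.empty());
      Ev E = Events.front();
      Events.pop_front();
      return E;
   }
   bool Empty() const {  return Events.empty();  }
};

// One send attempt: schedule the reply and the timeout, let the earlier
// one fire, and have its handler cancel the other. Returns the winner's name.
std::string RunOneAttempt(double ReplyTime, double TimeoutTime)
{  SchedList L;
   int F = L.Insert(ReplyTime, "arrive back at A");
   int T = L.Insert(TimeoutTime, "timeout");
   L.Pair(F, T);
   Ev First = L.PopNext();
   L.Cancel(First.PartnerId);  // the handler removes the losing event
   assert(L.Empty());
   return First.Name;
}
```

For example, RunOneAttempt(2.5, 4.0) returns "arrive back at A", while RunOneAttempt(6.0, 4.0) returns "timeout"; in both cases the losing event is gone from the list afterwards.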

Again note that we subclassed ESElt:

class FrameElt: public ESElt  {
   public:
      int FrameNumber;
      float StartTime;  // time at which this frame is first sent
      int NTries;  // number of attempts at sending this frame so far
      int Trashed;  // 1 means erroneous
      int TimeoutEltNumber;  // ESEltNumber for the paired timeout
      FrameElt();
      void ESPrintElt();
};

void FrameElt::ESPrintElt()

{  ESElt::ESPrintElt();
   printf("  frame %d, number of tries = %d, trashed = %d\n",
      this->FrameNumber,this->NTries,this->Trashed);
}

We added several fields specific to this application, e.g. NTries, and again overrode ESPrintElt() in order to get better debugging information.

User-relevant library functions, classes and variables:

Functions:

Classes:

"Global" variables:

Aids for debugging and testing:

As with any program, ESim code should be debugged with the aid of a good debugging tool. See my debugging Web page.

Debugging simulation programs tends to be difficult, as with any program dealing with multiple concurrent activities. The best strategy to find a bug is to step through the simulation with the debugging tool on a small problem, verifying that the different variables do have the correct values at the times you expect them to.

ESim has a couple of debugging aids of its own:

If the "debug" command-line argument is set to 1, the event list is printed out immediately before and immediately after each event is executed. You can also call the functions ESPrintSchedList() and ESPrintElt() directly; note that the latter can be overridden so as to tailor it to your own application. There is a similar variable for printing server information.

Note that our title above is "aids for debugging and testing." In many simulation applications, verifying the correctness of the program is not trivial at all, since we don't know what the "true" output of the simulation should be. Again, the strategy here is to step through the simulation with the debugging tool on a small problem, verifying that the different variables do have the correct values at the times you expect them to. (It may be sufficient to simply run the program with the debug flag on, and carefully read the output of the automatic calls to ESPrintSchedList().)

Also, for thorough testing be sure to vary the arguments to your program quite a bit, and to have at least some runs of the program with a very long simulation time limit. This is important, because some bugs will only show up when very rare combinations of circumstances occur.