\documentclass[11pt]{article}

\setlength{\oddsidemargin}{0.0in}
\setlength{\evensidemargin}{0.0in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textwidth}{6.5in}
\setlength{\textheight}{9.0in}
\setlength{\parindent}{0in}
\setlength{\parskip}{0.1in}

\usepackage{times}
\usepackage{fancyvrb}
\usepackage{relsize}
\usepackage{hyperref}
\usepackage{graphicx}

\begin{document}

\title{Data Structures for Event Lists}

\author{Norm Matloff}

\date{February 12, 2008\\
\copyright{}2006-8, N.S. Matloff} 

\maketitle

\tableofcontents 

\section{Introduction}

Any event- or process-oriented discrete-event simulation package will
store all pending events in some kind of {\bf event list}.\footnote{The
reader should take care not to confuse the word {\it list} here with the
Python type {\bf list}.} Let's illustrate that with the SimPy language.

Let's review how a {\bf yield hold} works in SimPy.  The main loop in
SimPy's {\bf simulate()} function consists of removing the earliest
event from the events list {\bf \_e.events}, updating the simulated time
to the time of that event, and then calling the iterator for that event.

So we see that in a discrete-event simulation package, there are two
main operations on the event list:

\begin{itemize}

\item Insertion of a new entry into the list, with the position in the
list determined by the time of this new event relative to the others.

\item Deletion of the earliest item in the list.

\end{itemize}

There may be other operations needed as well:  

\begin{itemize}

\item In SimPy, there is a {\bf cancel()} operation.  Here we must
remove some entry in the event list, and it may not necessarily (and
probably won't be) the earliest one.

\end{itemize}

In simple applications, the event list typically has only a handful of
entries.  But in some applications there could be thousands of events at
a time.  Since event list operations comprise the major portion of
execution time in a simulation program, it is sometimes worthwhile to
use a complex data structure for the event list.  Over the years, many
such structures have been used.  We will get a glimpse of this area
here.

\section{The Basic Approach:  a Linear Linked List}

The most basic data structure for storing an event list would be a
linked list.  The earliest event is at the head of the list, with the
second-earliest next, and so on.

This makes deletion of the earliest event very quick.  But insertion,
plus deletion of general events, can take a long time.

Much research has been done on how to make this approach faster.  For
example, consider the issue of insertion.  With a plain linked list, we
must search for the insertion point sequentially, but should we start
our search at the front of the list or at the back?  Research has shown
that it depends on the distribution of {\bf hold times}, meaning the
times which are the third operands in the {\bf yield hold} statements.
It turns out that the key factor seems to be the {\bf coefficient of
variation} of the distribution, which is the ratio of the standard
deviation to the mean.  If the coefficient of variation is smal, it's
better to start at the back of the list.  This makes sense, if you think
of the extreme case in which the hold times are constant; in such a
case, the newest event should always be inserted at the end of the list.

SimPy uses the linear linked list approach.  

\section{Heaps}

Many readers have studied the data structure known as a {\bf heap}.
This is a complete binary tree in which each child node has a value
greater than or equal to that of its parent.  This is well-suited for
discrete-event simulation.  Say there are n items in the event list.
Then there are only $O(\log_2 n)$ levels in the tree, so an insertion
can be done in that time, compared to $O(n)$ for a linked list.

Deletion of the earliest event is quick, since it is at the root of the
tree, but then the tree must be reorganized, which takes longer than the
corresponding operation for a linked list.

\section{Calendar Queues}

The last data structure we'll look at in this brief introduction is that
of {\bf calendar queues}.  The name is a metaphor alluding to a personal
appointment calendar, with one page for each day, and each page listing
the appointments one has on that day.

To illustrate the idea, let's look at a simple example.  We'll divide
time into four multiples of 0.5:  0.0-0.5, 0.5-1.0, 1.0-1.5 and 1.5-2.0.
(This will be relative time, e.g. with 0.5 meaning 0.5 time from now.)
This would be considered a ``year'' consisting of four ``days.''  Say
our event list currently consists of events at times 0.12, 0.39, 0.58,
1.22, 1.34, 1.49 and 1.56.  The calendar would look like this:

\begin{tabular}{|l|l|}
\hline
(relative) time  & events \\ \hline 
0.0-0.5 & 0.12 \\ \hline 
0.5-1.0 & 0.58 \\ \hline 
1.0-1.5 & 1.22, 1.34, 1.49 \\ \hline 
1.5-2.0 & 1.56 \\ \hline 
\end{tabular}

We first delete the 0.12 event, which itself will probably result in the
addition of a new event, say at time 1.86.\footnote{Recall why this
occurs.  When the {\bf Run()} function returns from its {\bf yield hold}
which generated the 0.12 event, it will run a few lines and then
encounter another {\bf yield hold} (possibly the same statement), at
which point it will add the 1.86 event to the list.} The new calendar
will look like this:

\begin{tabular}{|l|l|}
\hline
(relative) time  & events \\ \hline
0.0-0.5 &  \\ \hline
0.5-1.0 & 0.58 \\ \hline
1.0-1.5 & 1.22, 1.34, 1.49 \\ \hline
1.5-2.0 & 1.56, 1.86 \\ \hline
\end{tabular}

Note that the simulated clock time has now advanced by 0.12, so the
labels on the ``days'' are a bit misleading, but it's all right as long
as we understand this.  (Obviously, a precise mathematical formulation
could be given.)

We next delete the 0.58 event.  Say it then generates an event at
time 2.17.  Where do we place it in the calendar?  The problem here is
that this event, in the calendar metaphor, is set to occur ``next
year.''  We solve that problem by wrapping around:

\begin{tabular}{|l|l|}
\hline
(relative) time  & events \\ \hline
0.0-0.5 & 0.17 \\ \hline
0.5-1.0 & 0.58 \\ \hline
1.0-1.5 & 1.22, 1.34, 1.49 \\ \hline
1.5-2.0 & 1.56, 1.86 \\ \hline
\end{tabular}

Our 2.17 event is listed as 0.17.  Since we are already past that
time---we're in the midst of handling the 0.58 event---it is recognized
that this is time 0.17 of the next 2.0 cycle, i.e. it's time relative to
the beginning of the current cycle is 0.17 + 2.0 = 2.17.

What if the time had been 2.67 instead of 2.17?  Our calendar would look
like this:

\begin{tabular}{|l|l|}
\hline
(relative) time  & events \\ \hline
0.0-0.5 &  \\ \hline
0.5-1.0 & 0.58, 0.67a \\ \hline
1.0-1.5 & 1.22, 1.34, 1.49 \\ \hline
1.5-2.0 & 1.56, 1.86 \\ \hline
\end{tabular}

with the `a' signifying that this is 0.67 in the next cycle.

It's clear how the structure would be coded.  In Python, for instance,
we could do this as a list of lists.  The ``outer'' list would be for
calendar pages, and the ``inner'' lists would be events per page.

Theoretical and empirical research have shown this approach to work
quite well.

\end{document} 

