
\documentclass[]{article}

% Makes hyphenation work properly
\usepackage[english]{babel}
\usepackage{graphicx,amsmath,url}
%\usepackage{latexsym}

\begin{document}

\sloppy
\setlength{\parskip}{0.20cm}
\setlength{\parsep}{0.20cm}
\setlength{\topsep}{0cm}
\setlength{\parindent}{0cm}

\renewcommand{\textfraction}{0.2}
\renewcommand{\topfraction}{0.9}
\renewcommand{\dbltopfraction}{0.8}
\renewcommand{\floatpagefraction}{0.7}
\renewcommand{\dblfloatpagefraction}{0.7}
    
                                 
\title{Performance Analysis}
\author{
Peter Harrison \\
      Imperial College, London  
}


\date{}




\maketitle
\tableofcontents

\section*{Course Details}
\begin{itemize}
\item Lectures: Friday 3pm - 5pm, followed by tutorial until 6pm, in 144
\item 20 lectures, 10 tutorials, 2-3 pieces of assessed coursework
\item No coding
\item Lecture notes at: \url{http://www.doc.ic.ac.uk/~uh/pgh/}
\item This is the first year with typed notes
\item Books:
  \begin{itemize}
  \item {\it Performance modelling of communication networks and
      computer architectures},
    Peter G. Harrison, Naresh M. Patel.
    Wokingham: Addison-Wesley, 1992. ISBN
    0201544199. 14 copies in library; out of print
  \item {\it Probabilistic Modelling}, I. Mitrani. Paperback, 233 pages,
    2nd ed. (11 December 1997), Cambridge University Press. ISBN
    0521585309; about \pounds 20
  \end{itemize}
\end{itemize}

\begin{itemize}
\item The course aims to model real-world problems using analytical
  models
\item Queueing theory
\item {\em Uses} maths to solve problems
\item Emphasis is on solving problems rather than making up hard sums
\item Even if you do not choose to specialise in modelling, you will
  learn some ``transferable'' skills
\item {\em Simulation and Modelling} by Tony Field, Wed. 11pm (145),
  Thu 9pm (343)
\end{itemize}



\begin{itemize}
\item Performance analysis
\item \dots Modelling
\item \dots Planning
\item \dots Optimisation 
\end{itemize}

\begin{itemize}
\item Real-world problem: Post Office (average length of queues, Utilisation, etc.)
\item Abstraction, approximation, summary
\item Assumptions
\item Model
\item Techniques for solution
\item Answer to problem
\item Abstraction helps to transfer solution to other problems (PO, computer)
\end{itemize}




  \section{Previews of selected examples and case studies...}
\subsection{Example 1: A Simple TP server}
A TP (transaction processing) system accepts and processes a stream of
transactions, mediated through a (large) buffer:
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=1cm]{diags/Pqueue.eps}}
  \end{center}
  %\caption{A Simple TP server}
  \label{Pqueue}
\end{figure}
\begin{itemize}
\item Transactions arrive ``randomly'' at some specified rate
\item The TP server is capable of servicing transactions at a 
given service {\em rate}
\item [Q:]  If both the arrival rate and service rate are
doubled, what happens to
the mean response time?

\end{itemize}


\subsection{Example 2:  A simple TP server}
Consider the same system as above:
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=1cm]{diags/Pqueue.eps}}
  \end{center}
  %\caption{A Simple TP server}
  \label{Pqueue2}
\end{figure}
\begin{itemize}
\item The arrival rate is 15 tps
\item The mean service time per transaction is 58.37ms
\item [Q:] What happens to the mean response time 
if the arrival rate increases by 10\%?
\end{itemize}
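As a preview of the solution technique, suppose (as for the M/M/1 queue treated later in the course) that the mean response time is $W = E[S]/(1-\rho)$, where $\rho = \lambda E[S]$ is the utilisation. The following Python sketch is purely illustrative (the course itself involves no coding) and evaluates both scenarios; the function name is ours, not part of the notes:

```python
# Illustrative sketch: mean response time of an M/M/1 queue,
# W = E[S] / (1 - rho), with utilisation rho = lambda * E[S].
# (Formula from the M/M/1 analysis treated later in the course.)

def mm1_response_time(arrival_rate, mean_service_time):
    rho = arrival_rate * mean_service_time      # utilisation
    if rho >= 1.0:
        raise ValueError("queue is unstable (rho >= 1)")
    return mean_service_time / (1.0 - rho)      # mean response time (s)

w_before = mm1_response_time(15.0, 0.05837)     # 15 tps, 58.37 ms
w_after  = mm1_response_time(16.5, 0.05837)    # arrival rate up 10%
```

A 10\% increase in arrival rate more than triples the mean response time: response time is very sensitive to load at high utilisation.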


\subsection{Example 3: A simple multiprocessor TP system}
Consider our TP system but this time with multiple transaction processors
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/simplemult.eps}}
  \end{center}
  %\caption{A simple multiprocessor TP system}
  \label{simplemult}
\end{figure}
\begin{itemize}
\item The arrival rate is 16.5 tps
\item The mean service time per transaction is 58.37ms
\item [Q:] By how much is the system response time reduced by adding one 
processor?
\end{itemize}



\subsection*{Example 4: File Allocation }
What is the best way to allocate disk blocks to a heterogeneous disk I/O
system?
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/twodisks.eps}}
  \end{center}
%  \caption{ File Allocation}
  \label{twodisks}
\end{figure}
\begin{itemize}
\item Disk I/O requests are made at an average rate of 20 per second
\item Disk blocks can be located on either
disk and the mean disk access times are 30ms and 46ms respectively
\item [Q:] What is the optimal proportion of blocks to allocate to disk 1
to minimise average response time?
\end{itemize}


\subsection{Example 5: A Simple Computer Model}
Consider an open uniprocessor system with a CPU and two disks
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/simpleCPU.eps}}
  \end{center}
  %\caption{ A Simple Computer Model}
  \label{simpleCPU}
\end{figure}
\begin{itemize}
\item Each submitted job makes 121 visits to the CPU, 70 to disk 1
and 50 to disk 2 {\em on average}
\item The mean service times are 5ms for the CPU, 30ms for disk 1
and 37ms for disk 2

\item [Q:]  What is the 
effect of replacing the CPU with one {\em twice} the
speed?
\end{itemize}


\subsection*{Example 6: A Simple Batch Processor System}
How does the above system perform in batch mode?
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.5cm]{diags/simpleCPUbatch.eps}}
  \end{center}
%  \caption{A Simple Batch Processor System }
  \label{simpleCPUbatch}
\end{figure}
\begin{itemize}
\item Each batch job makes 121 visits to the CPU, 70 to disk 1,
50 to disk 2 {\em on average}
\item The mean service times are 5ms for the CPU, 30ms for disk 1
and 37ms for disk 2
\item [Q:]  How does the system throughput vary with the number of batch jobs
and what is the effect of replacing Disk 1 with one (a) twice and (b)
three times the speed?

\end{itemize}

\subsection*{Example 7: A multiprogramming system
with virtual memory}
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/simpleCPUbatch.eps}}
  \end{center}
%  \caption{A  multiprogramming System }
  \label{multiprogvm}
\end{figure}
\begin{itemize}
\item Suppose now we add VM and make disk 1 a dedicated paging device
\item Pages are 1Kbyte in size and the (usable) memory is equivalent to
64K pages
\end{itemize}

\begin{itemize}
\item Each job page faults at a rate determined by the following 
lifetime function:
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/lifetime1chart.eps}}
  \end{center}
%  \caption{Example lifetime function }
  \label{lifetime1chart}
\end{figure}
\end{itemize}
Q: What number of batch jobs keeps the system throughput at its maximum
and at what point does thrashing occur?

\subsection{Example 8: A multiaccess multiprogramming system
with virtual memory}
\vspace{1cm}
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/multiprogterm1.eps}}
  \end{center}
%  \caption{A multiaccess multiprogramming system}
  \label{multiprogterm1}
\end{figure}

\begin{itemize}
\item During the day the system runs in interactive mode with
a number of terminal users
\item The average think time of each user is 30 seconds
\end{itemize}
Q: How do the system response time and throughput vary with the
number of terminals and how many terminals can be supported
before the system starts to thrash?






\section{Stochastic Processes}
\subsection{Introduction}
Computer systems are 
\begin{enumerate}
\item dynamic - they can pass through a succession of states as time
  progresses
\item influenced by events which we consider here as random phenomena
\end{enumerate}
cf. a bank, supermarket or stock exchange

\underline{Definition}: A stochastic process $\pmb{S}$ is a family of
random variables $\{X_t \in \Omega| t\in T\}$, each defined on the same
{\em sample space} $\Omega$ and indexed by a {\em parameter space} $T$.

\begin{itemize}
\item $T,\Omega$ may be either discrete or continuous
\item $T$ is normally regarded as time
  \begin{itemize}
  \item real time: continuous
  \item every month or after job completion: discrete
  \end{itemize}
\item $\Omega$ is the set of values each $X_t$ may take
  \begin{itemize}
  \item bank balance: discrete
  \item number of active tasks: discrete
  \item time delay in communication network: continuous
  \end{itemize}
\end{itemize}


     %
     %



\subsection{Example: The Poisson process}
The Poisson process is a renewal process with renewal period
(interarrival time) distribution and probability density function (pdf) 
\begin{eqnarray*}
  F(x)& = &P(X \leq x) = 1 - e^{-\lambda x} \\
  \text{pdf: } f(x) & = & F^\prime (x) = \lambda e^{-\lambda x}
\end{eqnarray*}
$\lambda$ is the parameter or rate of the Poisson process. \\
\subsection{Memoryless Property of the (negative) exponential
distribution}
If $S$ is an exponential random variable
$$
P(S\leq t+s | S>t) = P(S\leq s) \quad \forall t,s \geq 0
$$
(i.e. it doesn't matter what happened before time $t$) \\
\underline{Proof}\\
\begin{eqnarray*}
  P(S\leq t+s | S>t) & = & \frac{P(t < S\leq t+s)}{P( S>t)} \\
  & = & \frac{P( S\leq t+s) - P(S\leq t)}{1-P(S \leq t)} \\
  & = & \frac{1 - e^{-\lambda(t+s)} - (1 - e^{-\lambda
  t})}{e^{-\lambda t}} \\ &&\text{ ($\lambda$ is the rate of the
  exp. distribution)}\\
  & = & \frac{e^{-\lambda t} - e^{-\lambda (t + s)}}{e^{-\lambda t}} = 1
  - e^{-\lambda s} \\
  & = & P(S \leq s)
\end{eqnarray*}
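The memoryless property can also be checked empirically: among exponential samples that exceed $t$, the excess over $t$ should again be exponential with the same rate. A purely illustrative Python sketch (not part of the course material):

```python
import random

# Illustrative check of memorylessness: conditioned on S > t, the
# excess S - t of an Exp(lam) variable should again be Exp(lam),
# so its sample mean should be close to 1/lam.

random.seed(1)
lam, t = 2.0, 0.5
samples = [random.expovariate(lam) for _ in range(200_000)]
excess = [s - t for s in samples if s > t]   # residual life beyond t
mean_excess = sum(excess) / len(excess)      # expect about 1/lam = 0.5
```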


     %
     %



\subsection{More properties of the Poisson process}

\begin{eqnarray*}
  P(\text{arrival in } (t,t+h)) & = & P(R\leq h) = P(S\leq h) \quad
  \forall t\\
  & = & 1 - e^{-\lambda h} \\
  & = & \lambda h + o(h)
\end{eqnarray*}
Therefore
\begin{enumerate}
\item Probability of an arrival in $(t,t+h)$ is $\lambda h + o(h)$
  regardless of process history before $t$
\item Probability of more than one arrival in $(t,t+h)$ is $o(h)$
  regardless of process history before $t$ 
\end{enumerate}
In fact we can take this result as an alternative definition of the
Poisson process. From it we can derive the distribution function of
the interarrival times (negative exponential) and the Poisson
distribution for $N_t$ (the {\em arrival process})
$$
P(N_t = n) = \frac{(\lambda t)^n}{n!} e^{-\lambda t} .
$$
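The connection between exponential interarrival times and the Poisson distribution of $N_t$ can be illustrated by simulation: generate arrivals by summing $\text{Exp}(\lambda)$ gaps and count those falling in $(0,t]$. An illustrative sketch (function name ours, not from the notes):

```python
import random

# Illustrative sketch: simulate a Poisson process of rate lam by
# summing Exp(lam) interarrival times; the count N_t should have
# E[N_t] = Var[N_t] = lam * t (Poisson distribution).

def poisson_count(lam, t, rng):
    n, clock = 0, rng.expovariate(lam)
    while clock <= t:
        n += 1
        clock += rng.expovariate(lam)
    return n

rng = random.Random(42)
lam, t = 3.0, 2.0
counts = [poisson_count(lam, t, rng) for _ in range(50_000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
# both mean and var should be close to lam * t = 6
```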


     %
     %



\subsection{Derivation of the Poisson process}
The proof of the claim is by induction on $n$. Let $P_n(t) =
P(N_t=n)$. \\
\underline{Base case ($n=0$)}
\begin{eqnarray*}
  P_0(t+h) & = & P(N_t = 0  \text{ \& no arrivals in } (t,t+h)) \\
  & =& P_0(t) (1 - \lambda h + o(h)) \quad \text{ (by independence)} \\
\implies 
 \frac{P_0(t+h) - P_0(t)}{h} & = & -\lambda P_0(t) + o(1) \\
  P_0'(t) & = & -\lambda P_0(t) \quad \text{ as } h \to 0 \\
\implies P_0(t) & = & C e^{-\lambda t} \\
P_0(0) & = & P(N_0 = 0) = 1 \text{ therefore } C=1
\end{eqnarray*}
\underline{Inductive Step ($n>0$)} \\
\begin{eqnarray*}
  P_n(t+h) & = & P_n(t) P_0(h) + P_{n-1}(t) P_1(h) + o(h) \\
  & = & (1 - \lambda h) P_n(t) + \lambda h P_{n-1}(t) + o(h) \\
\implies P_n'(t) & = & -\lambda P_n(t) + \lambda P_{n-1}(t) \\
\implies \frac{d}{dt}(e^{\lambda t} P_n(t)) & = & \lambda e^{\lambda t}
P_{n-1}(t)
\end{eqnarray*}
Inductive hypothesis: 
$$P_{n-1}(t) = e^{-\lambda t} \frac{(\lambda t)^{n-1}}{(n-1)!} $$
which holds for $n=1$ by the base case. Then 

\begin{eqnarray*}
 \frac{d}{dt}(e^{\lambda t} P_n(t)) & = &  \frac{(\lambda )^{n}}{(n-1)!}
t^{n-1} \\ 
\implies e^{\lambda t} P_n(t) & = &  \frac{(\lambda
  t)^{n}}{n!} + D
\end{eqnarray*}

For $t=0$, $P_n(0)=0$ for $n\geq 1$ and so $D=0$. Thus
$$P_n(t) = e^{-\lambda t} \frac{(\lambda t)^n}{n!}$$


     %
     %





\subsection{Derivation of the interarrival time distribution}
\begin{enumerate}
\item Using the derived Poisson distribution
  \begin{eqnarray*}
    P(T\leq t) & = & 1 - P(\text{0 arrivals in }(0,t)) \\
    & = & 1 - e^{-\lambda t}
  \end{eqnarray*}
(distribution for negative exponential random variable)
\item Directly from assumptions (exercise). Use the memoryless
  property to simplify the equation
$$P(T\leq t+h) = P((T\leq t) \vee [(T>t) \wedge (1\text{ arrival in
  }(t,t+h))])$$ hence derive and solve a differential equation for
$P(T\leq t) = F(t)$.
\end{enumerate}


     %
     %


\subsection{Superposition Property}
If $A_1, \dots , A_n$ are independent Poisson Processes with rates
$\lambda_1 , \dots, \lambda_n$ respectively, let there be $K_i$
arrivals in an interval of length $t$ from process $A_i$ ($1\leq i
\leq n)$. Then $K=K_1 + \dots +K_n$ has Poisson distribution with
parameter $\lambda t$ where $\lambda = \lambda_1 + \dots +
\lambda_n$. I.e. the superposition of PPs with rates $\lambda_i$ is a
PP with rate $\sum_i \lambda_i$. \\
\underline{Proof} \\
The distribution of K is the convolution of the distributions of the
$K_i$ which are Poisson. E.g. if $n=2$
\begin{eqnarray*}
  P(K=k) & = & \sum_{i=0}^k \frac{(\lambda_1 t)^i}{i!} e^{-\lambda_1t}
  \frac{(\lambda_2 t)^{k-i}}{(k-i)!} e^{-\lambda_2t} \\
  & = & \frac{e^{-(\lambda_1 + \lambda_2)t}}{k!} \sum_{i=0}^k
  \binom{k}{i} (\lambda_1 t)^i (\lambda_2 t)^{k-i} \\
  & = & e^{-(\lambda_1 + \lambda_2)t} \frac{[(\lambda_1 + \lambda_2)t]^k}{k!}
\end{eqnarray*}
as required. The proof for arbitrary $n \geq 2$ is an easy induction
on $n$.
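The superposition property can be checked by simulation: merging two independent streams of rates $1.0$ and $2.5$ should give counts over $(0,t]$ with Poisson mean and variance $3.5t$. An illustrative sketch only (the counting helper is ours):

```python
import random

# Illustrative sketch: superposition of independent Poisson streams.
# Counting arrivals from each stream in (0, t] and adding them should
# give a Poisson count with parameter (1.0 + 2.5) * t.

def count_arrivals(lam, t, rng):
    # number of arrivals in (0, t] of a Poisson process of rate lam
    n, clock = 0, rng.expovariate(lam)
    while clock <= t:
        n += 1
        clock += rng.expovariate(lam)
    return n

rng = random.Random(7)
t = 1.0
merged = [count_arrivals(1.0, t, rng) + count_arrivals(2.5, t, rng)
          for _ in range(40_000)]
mean = sum(merged) / len(merged)
var = sum((k - mean) ** 2 for k in merged) / len(merged)
# Poisson(3.5): mean and variance both close to 3.5
```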


     %
     %



\subsection{Decomposition Property}
If a Poisson Process $A$ with rate $\lambda$ is decomposed into
processes $B_1, \dots , B_n$
by assigning each arrival of $A$ to $B_i$ with independent probability
$q_i$ ($\sum q_i =1$), then $B_1, \dots , B_n$ are independent Poisson
Processes with rates $q_1 \lambda , \dots , q_n \lambda$. \\ \\
\underline{Example}: Two parallel processors, $B_1$ and
$B_2$. Incoming jobs (Poisson arrivals $A$) are directed to $B_1$ with
probability $q_1$ and to $B_2$ with probability $q_2$. Then each
processor ``sees'' a Poisson arrival process for its incoming job stream. \\
\underline{Proof} \\
For $n=2$, let $K_i = \text{number of arrivals to } B_i \text{ in time
  } t$ ($i=1,2$), $K=K_1+K_2$
\begin{eqnarray*}
  P(K_1 = k_1, K_2 = k_2) & = & \underbrace{P(K_1 = k_1, K_2 = k_2|K
  = k_1 + k_2)}_{\text{binomial}} \; \underbrace{P(K = k_1 +
  k_2)}_{\text{Poisson}} \\
  & = & \frac{(k_1+k_2)!}{k_1! k_2!} q_1^{k_1} q_2^{k_2} \frac{(\lambda
  t)^{k_1 + k_2}}{(k_1 + k_2)!} e^{-\lambda t} \\
  & = & \frac{(q_1 \lambda t)^{k_1}}{k_1!} e^{-q_1\lambda t} \frac{(q_2 \lambda t)^{k_2}}{k_2!} e^{-q_2\lambda t}
\end{eqnarray*}
I.e. a product of two independent Poisson distributions 
\\($n \geq 2$:
easy induction) 
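Decomposition (``thinning'') can likewise be illustrated by simulation: route each arrival of a rate-$\lambda$ stream to $B_1$ with probability $q$, independently, and check that $B_1$'s count over $(0,t]$ has Poisson mean $q\lambda t$. An illustrative sketch only:

```python
import random

# Illustrative sketch: decomposition of a Poisson stream of rate lam.
# Each arrival goes to B1 with probability q, else to B2; the counts
# over (0, t] should have means q*lam*t and (1-q)*lam*t respectively.

rng = random.Random(11)
lam, q, t = 4.0, 0.25, 1.0
k1_counts, k2_counts = [], []
for _ in range(40_000):
    k1 = k2 = 0
    clock = rng.expovariate(lam)
    while clock <= t:
        if rng.random() < q:
            k1 += 1
        else:
            k2 += 1
        clock += rng.expovariate(lam)
    k1_counts.append(k1)
    k2_counts.append(k2)

mean1 = sum(k1_counts) / len(k1_counts)   # expect q*lam*t     = 1.0
mean2 = sum(k2_counts) / len(k2_counts)   # expect (1-q)*lam*t = 3.0
```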

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/superpose.eps}}
    \fbox{\includegraphics[height=3.5cm]{diags/decompose.eps}}
      \end{center}
  \caption{Superposition and decomposition of a Poisson process}
  \label{equivalent-open-network}
\end{figure}



% \begin{figure}[tbp]
%   \hfill
%   \begin{center}
%     \fbox{\includegraphics[height=3.5cm]{pictures/super-decomposition-Poisson.eps}}
%   \end{center}
%   \caption{Decomposition of a Poisson process}
%   \label{equivalent-open-network}
% \end{figure}


     %
     %



\subsection{Markov Chains and Markov Processes}
\begin{itemize}
\item Special class of stochastic processes that satisfy the Markov
  property (MP): \\
  Given the state of the process at time $t$, its state at time $t+s$
  has probability distribution which is a function of $s$ only.
\item i.e. the future behaviour after $t$ is independent of the
  behaviour before $t$
\item Often intuitively reasonable, yet sufficiently ``special'' to
  facilitate effective mathematical analysis
\item We consider processes with discrete state (sample) space and:
  \begin{enumerate}
  \item discrete parameter space (times $\{t_0, t_1, \dots \}$), a
    Markov chain or {\em Discrete Time Markov Chain}
  \item continuous parameter space (times $t \geq 0, t \in R $), a
    Markov Process or {\em Continuous Time Markov Chain}. E.g. number of arrivals in $(0,t)$ from a Poisson arrival
  process defines a Markov Process because of  the memoryless property.

  \end{enumerate}
\end{itemize}


     %
     %





\subsection{Markov Chains}
\begin{itemize}
\item Let $X=\{X_n|n=0,1,\dots \}$ be an integer valued Markov chain (MC),
  $X_i \geq 0, X_i \in Z, i \geq 0$. MP states that: 
  $$P(X_{n+1} =j | X_0 = x_0, \dots , X_n = x_n) = P(X_{n+1}=j|X_n=x_n)
  \quad \text{for } j,n = 0, 1, \dots$$
\item Evolution of a MC is completely described by its 1-step
  transition probabilities
  $$q_{ij}(n) = P(X_{n+1}=j | X_n =i) \text{ for } i,j,n \geq 0$$
  Assumption: $q_{ij}(n) = q_{ij}$ is independent of time $n$  ({\em time
  homogeneous}) $$\sum_{j \in \Omega} q_{ij} = 1 \quad \forall i \in \Omega$$
\item MC defines a transition probability matrix
$$
Q = \begin{bmatrix}
  q_{00} & q_{01} &  \dots \\
  q_{10} & q_{11} & \dots \\
  \vdots \\
  q_{i0}& \dots & q_{ii} \dots\\
  \vdots
  \end{bmatrix}
  \text{in which all rows sum to 1}
$$
\begin{itemize}
\item dimension = \# of states in $\Omega$ if finite, otherwise
  countably infinite
\item conversely, any real matrix $Q$ s.t. $q_{ij} \geq 0$, $\sum_j q_{ij}=1$
  (called a {\em stochastic matrix}) defines a MC
\end{itemize}
\end{itemize} 


     %
     %



\subsubsection{Examples}
\begin{enumerate}
\item \underline{Telephone line}: \\
  Line is either idle (state $0$) or busy (state $1$). 
  \begin{eqnarray*}
    \text{Idle at time } n & \implies  & \text{ idle at time } (n+1) \text{
      w.p. } 0.9 \\    
    \text{Idle at time } n & \implies & \text{ busy at time } (n+1) \text{
      w.p. } 0.1 \quad \quad Q  =  
  \begin{bmatrix}
  0.9 & 0.1 \\
  0.3 & 0.7 
  \end{bmatrix}\\    
    \text{Busy at time } n & \implies & \text{ idle at time } (n+1) \text{
      w.p. } 0.3 \\    
    \text{Busy at time } n & \implies & \text{ busy at time } (n+1) \text{
      w.p. } 0.7 \\
 \end{eqnarray*}
May be represented by the state diagram in Figure \ref{telephone-line}:
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=1.5cm]{pictures/telephone-line.eps}}
  \end{center}
  \caption{Telephone line example }
  \label{telephone-line}
\end{figure}



\item \underline{Gambler}\\
Bets \pounds 1 at a time on the toss of a fair die. Loses if the number is
less than 5; wins \pounds 2 if it is 5 or 6. Stops when broke or when
holding \pounds 100. If $X_0$ is the initial capital ($0\leq X_0 \leq
100$) and $X_n$ is the capital held after $n$ tosses, then $\{ X_n|n =
0,1,2,\dots \}$ is a MC with $q_{ii}=1$ if $i=0$ or $i\geq 100$, $q_{i,i-1}=
\frac{2}{3}$, $q_{i,i+2}=\frac{1}{3}$ for $i=1,2, \dots ,99$ and
$q_{ij}=0$ otherwise.
\item \underline{I/O buffer, capacity M records} \\
New record added in any unit of time w.p. $\alpha$ (if not
full). Buffer emptied in any unit of time w.p. $\beta$. If both occur
in same interval, insertion done first. Let $X_n$ be the number of
records in buffer at (discrete) time $n$. Then, assuming that
insertions and emptying are independent of each other and of their own
past histories, $\{ X_n | n = 0,1,\dots\}$ is a MC with state space
$\{0,1,\dots , M\}$ and state diagram:
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3cm]{pictures/io-buffer.eps}}
  \end{center}
  \caption{ I/O buffer  example}
  \label{io-buffer}
\end{figure}



The transition matrix follows immediately, e.g.
$$q_{12} = \alpha(1-\beta) = q_{n,n+1}$$ ($0\leq n \leq M-1$)
$$q_{MM}=1-\beta$$ etc. 
\end{enumerate}


     %
     %



\subsubsection{Multi-step transition probabilities} Let
\begin{eqnarray*}
  q^{(2)}_{ij} & = & P(X_{n+2}=j|X_n=i) \\
  & = & \sum_{k \in \Omega} P(X_{n+1} = k, X_{n+2} = j| X_n = i)
  \text{ from law of tot. prob.}\\
  & = & \sum_{k \in \Omega} P(X_{n+2} = j| X_n=i, X_{n+1} =k)
  P(X_{n+1} = k | X_n =i) \\
  & = & \sum_{k \in \Omega} P(X_{n+2} = j| X_{n+1} =k) P(X_{n+1}
  =k|X_n=i) \text{ by MP} \\
  & = & \sum_{k \in \Omega} q_{ik}q_{kj} \text{ by TH}\\
  & = & (Q^2)_{ij}
\end{eqnarray*}
Similarly, $m$-step transition probabilities :
\begin{eqnarray*}
  q^{(m)}_{ij} & = & P(X_{n+m} = j | X_{n} = i) \quad (m \geq 1)\\
    & = & (Q^m)_{ij}
\end{eqnarray*}
by induction on $m$. Therefore we can compute probabilistic behaviour of
a MC over any finite period of time, in principle. E.g. average no. of
records in buffer at time 50
$$E[X_{50} | X_0 = 0] = \sum_{j=1}^{\min(M,50)} j q_{0j}^{(50)}$$

In practice this is computationally intractable, so instead we determine
the long-term behaviour: we hope that asymptotically the MC approaches a
steady state (probabilistically), i.e. that $\exists  \{ p_j | j=0,1,2, \dots \}$
s.t. 
$$p_j = \lim_{n\to \infty} P(X_n=j | X_0 = i) = \lim_{n\to \infty}
q_{ij}^{(n)}$$ 
independent of $i$
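This convergence can be seen numerically for the telephone-line example above: powers of its transition matrix $Q$ approach a matrix whose rows are all equal to the limiting distribution, independent of the starting state. An illustrative Python sketch (not part of the course material):

```python
# Illustrative sketch: multi-step transition probabilities q_ij^(m) for
# the telephone-line chain.  Q^m converges to a matrix whose rows are
# both (0.75, 0.25), the limiting distribution, independent of i.

def mat_mul(a, b):
    # multiply two square matrices stored as lists of rows
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Q = [[0.9, 0.1],
     [0.3, 0.7]]
Qn = [[1.0, 0.0], [0.0, 1.0]]      # Q^0 = identity
for _ in range(50):                # compute Q^50
    Qn = mat_mul(Qn, Q)
# both rows of Qn are now essentially (0.75, 0.25)
```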




     %
     %




\subsubsection{Definitions and properties of MCs}
Let $v_{ij} = $ prob. that state $j$ will eventually be entered, given
that the MC has been in state $i$, i.e.
$$
v_{ij} = P(X_n = j \text{ for some } n| X_0 =i)
$$

\begin{itemize}
\item If $v_{ij} \neq 0$, then $j$ is {\em reachable} from $i$
\item If $v_{ij} = 0 \quad \forall j \neq i, v_{ii}=1$, then $i$ is an {\em absorbing} state
\end{itemize}
\underline{Examples (1) \& (3)}: all states reachable from each other
(e.g. look at the transition diagrams) \\
\underline{Example (2)}: all states in $\{1,2,\dots ,99\}$ reachable
from each other, 0 \& 100 absorbing \\
A subset of states $C$ is called {\em closed} if no $j \notin C$ can
be reached from any $i \in C$. E.g. the set of all states,
$\Omega$, is closed. If no proper subset $C \subset
\Omega$ is closed, the MC is called {\em irreducible}. \\
A Markov Chain is irreducible if and only if every state is reachable
from every other state. E.g. examples (1) \& (3) are irreducible,
while $\{0\}$ and $\{100\}$ are closed in example (2).



     %
     %


\subsubsection{Recurrent states}
If $v_{ii}=1$, state $i$ is {\em recurrent} (once entered, guaranteed
eventually to return). Thus we have: \\
$i$ recurrent implies $i$ is 
  either not visited by the MC or is visited an infinite number of
times -- no  visit can be the last (with non-zero probability). \\
If $i$
is not recurrent it is called {\em transient}. This implies the number
of visits to a transient state is finite w.p. 1 (has geometric
distribution). E.g. if $C$ is a closed set of states, $i \notin C$, $j
\in C$ and $v_{ij} \neq 0$, then $i$ is transient since MC will
eventually enter $C$ from $i$, never to return. E.g. in example (2),
states 1 to 99 are transient, states 0 and 100 are recurrent (as is any
absorbing state) \\
\underline{Proposition}: If $i$ is recurrent and $j$ is reachable from
$i$, then $j$ is also recurrent. \\
\underline{Proof}: $v_{jj} \neq 1$ implies $v_{ji} \neq 1$, since if
the MC visits $i$ after $j$ it will visit $i$ repeatedly and with
probability $1$ will eventually visit $j$ (because $v_{ij} \neq 0$). This
in turn implies $v_{ii} \neq 1$, since with non-zero probability the MC can
visit $j$ after $i$ but not subsequently return to $i$ (w.p. $1-v_{ji} > 0$),
contradicting the recurrence of $i$. {\bf qed.}
\\ \\
Thus, in an irreducible MC, either all states are transient (a
transient MC) or recurrent (a recurrent MC).


     %
     %





\subsubsection{Further properties of MCs}
Let $X$ be an irreducible, recurrent MC and let $n_1^j, n_2^j, \dots$
be the times of successive visits to state $j$, $m_j = E[n_{k+1}^j - n_k^j]$
($k=1,2,\dots$), the mean interval between visits (independent of $k$
by the MP). \\ \\
\underline{Intuition}: $$p_j = \frac{1}{m_j}$$
Because the proportion of time spent, on average, in state $j$ is
$1/m_j$. \\
This is not necessarily true, because the chain may exhibit some {\em
  periodic} behaviour: \\
The state $j$ is {\em periodic} with period $m>1$ if $q_{jj}^{(k)} =
0$ for $k \neq rm$ for any $r\geq 1$ and $P(X_{n+rm} = j \text{ for some } r
\geq 1 | X_n=j) = 1$. Otherwise it is aperiodic, or has period
$1$. (Note that a periodic state is recurrent) \\
\underline{Proposition}: In an irreducible MC, either all states are
aperiodic or periodic with the same period. The MC is then called
aperiodic or periodic respectively. \\ \\
\underline{Proposition}: If $\{X_n|n=0,1,\dots\}$ is an irreducible,
aperiodic MC, then the limiting probabilities $\{p_j|j=0,1,\dots \}$
exist and $p_j = 1/m_j$.


     %
     %


\subsubsection{Null \& positive recurrence}
If $m_j = \infty$ for state $j$ then $p_j =0$ and state $j$ is
{\em recurrent null}. If instead $m_j < \infty$, then $p_j >0$ and state
$j$ is {\em recurrent non-null} or {\em positive recurrent}.\\ \\
\underline{Proposition}: In an irreducible MC, either all states are
positive recurrent or none are. In the former case, the MC is called
positive recurrent (or recurrent non-null). \\
A state which is aperiodic and positive recurrent is often called {\em
  ergodic} and an irreducible MC whose states are all ergodic is also
called ergodic. When the limiting probabilities $\{p_j | j
=0,1,\dots\}$ do exist they form the 
$$
\left.
\begin{aligned}
& \text{steady state} \\
& \text{equilibrium} \\
& \text{long term} \\
\end{aligned}
\right\}
\text{probability distribution (SSPD)}
$$
of the MC. They are determined by the following theorem.
\subsubsection{Steady state theorem for Markov chains}
An irreducible, aperiodic Markov Chain, $X$, with state space $S$ and
one-step transition probability matrix $Q=(q_{ij}|i,j \in S)$, is
positive recurrent if and only if the system of equations
$$
p_j = \sum_{i \in S} p_i q_{ij}
$$
with $\sum_{i \in S} p_i =1$ (normalisation) has a solution. If it exists, the
solution is unique and is the SSPD of $X$. \\
\underline{Note}: If $S$ is finite, $X$ is always positive recurrent,
which implies the equations have a unique solution. The solution is
then found by discarding one balance equation \& replacing it with the
normalising equation (the balance equations are dependent: since the
rows of $Q$ sum to one, the equations of the homogeneous system
$\underline{p} - \underline{p}\underline{Q} = \underline{0}$ sum to
zero).
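For the telephone-line example above, this procedure (discard one balance equation, append normalisation) can be carried out mechanically. The following Python sketch is purely illustrative, using a hand-rolled Gaussian elimination; the function name is ours:

```python
# Illustrative sketch: solve p = pQ with sum(p) = 1 for a small
# transition matrix Q, replacing one balance equation by the
# normalising equation, as in the steady-state theorem's note.

def dtmc_steady_state(Q):
    n = len(Q)
    # Balance equations (I - Q^T) p = 0; replace last row by ones = 1.
    A = [[(1.0 if i == j else 0.0) - Q[j][i] for j in range(n)]
         for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    x = [0.0] * n   # back substitution
    for r in range(n - 1, -1, -1):
        x[r] = (b[r] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return x

p = dtmc_steady_state([[0.9, 0.1], [0.3, 0.7]])
# p is approximately [0.75, 0.25]: the line is idle 75% of the time
```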


     %
     %




\subsection{Markov Processes}
\begin{itemize}
\item Continuous time parameter space, discrete state space
$$
X = \{X_t | t\geq 0\} \quad X_t \in \Omega
$$
where $\Omega$ is a countable set.
\item \underline{Markov property}
$$
P(X_{t+s} =j | X_u , u \leq t) = P(X_{t+s} = j| X_t)
$$
\item Markov process is {\em time homogeneous} if the r.h.s. of this
  equation does not depend on $t$
  \begin{itemize}
  \item $q_{ij}(s) = P(X_{t+s} = j| X_t=i) =  P(X_{s} = j| X_0 =i)$
    ($i,j = 0,1,\dots$)
  \item {\em Transition probability functions} of the MP
  \end{itemize}
\item Markov Property and time homogeneity imply: \\
If at time $t$ the process is in state $j$, the time remaining in state $j$ is
independent of the time already spent in state $j$: {\em memoryless
  property} \\
\underline{Proof}: 
\begin{eqnarray*}
  P(S>t+s | S>t) & = & P(X_{t+u}=j, 0\leq u \leq s|X_u=j, 0\leq u \leq t) \\
  &&\text{where S $=$ time spent in state $j$ } \\ && \text{state $j$ entered at
  time $0$} \\
  & = & P(X_{t+u} =j, 0\leq u \leq s|X_t=j) \text{ by MP}\\
  & = & P(X_u=j , 0\leq u \leq s| X_0=j) \text { by T.H.}\\
  & = & P(S>s)\\
\implies P(S\leq t+s|S>t) & = & P(S\leq s)
\end{eqnarray*}
\item Time spent in
state $i$ is exponentially distributed, parameter $\mu_i$.
\end{itemize}


     %
     %



\subsubsection{Time homogeneous Markov Processes}
The MP implies that the next state, $j$, after the current state $i$
depends only on $i,j$ -- via the state transition probability $q_{ij}$.
This implies the Markov Process is uniquely determined by the
products:
$$a_{ij} = \mu_i q_{ij}$$
where $\mu_i$ is the rate out of state $i$ and $q_{ij}$ is the
probability of selecting state $j$ next. The $a_{ij}$ are the
{\em generators} of the Markov Process.
\begin{itemize}
\item intuitively reasonable
\end{itemize}

\subsubsection{Instantaneous transition rates}
Consider now the (small) interval $(t,t+h)$
\begin{eqnarray*}
  P(X_{t+h}=j|X_t=i) & = & \mu_ihq_{ij} + o(h) \\
  & = & a_{ij} h+o(h)
\end{eqnarray*}
\begin{itemize}
\item $a_{ij}$ is the {\em instantaneous transition rate} $i \to j$
  i.e. the average number of transitions $i \to j$ per unit time
  spent in state $i$
\end{itemize}



\subsubsection{Examples}
\begin{enumerate}
\item \underline{Poisson process rate $\lambda$}
  \begin{eqnarray*}
    a_{ij}  = \begin{cases}
      \lambda \text{ if } j=i+1 \\
      0 \text{ if } j \neq i,i+1 \\
      \text{not defined if } j = i
      \end{cases}
  \end{eqnarray*}
because 
\begin{eqnarray*}
  P(\text{arrival in } (t,t+h)) & = & \lambda h + o(h) \\
  P(i \to i+1 \text{ in } (t,t+h)) & = & a_{i,i+1}h + o(h)
\end{eqnarray*}

\item \underline{The buffer problem}
  \begin{itemize}
  \item Records arrive as a P.P. rate $\lambda$
  \item Buffer capacity is $M$
  \item Buffer cleared at times spaced by intervals which are
    exponentially distributed, parameter $\mu$. Clearance times
    i.i.d. \& independent of arrivals (i.e. clearances are an
    independent PP, rate $\mu$)
    $$a_{ij} = \begin{cases}
      \lambda \text{ if } j = i+1 \quad (0\leq i \leq M-1)\\
      \mu \text{ if } j = 0 \quad (i\neq 0) \\
      0 \text{ otherwise } (j\neq i)
    \end{cases}$$
    [Probability of $>1$ arrivals or clearances, or $\geq 1$ arrivals
    \underline{and} clearances in $(t,t+h)$ is $o(h)$]
  



 
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3cm]{pictures/buffer.eps}}
  \end{center}
  \caption{ Buffer problem }
  \label{buffer}
\end{figure}



  \end{itemize}
\item \underline{Telephone exchange} \\Suppose there are 4 subscribers and
    any two calls between different pairs of subscribers can be supported
    simultaneously. Calls are made by non-connected subscribers
    according to independent P.Ps, rate $\lambda$ (callee chosen at
    random). The length of a call is (independently) exponentially
    distributed, parameter $\mu$. Calls are lost if the called subscriber
    is engaged. The state is the number of calls in progress $\in
    \{0,1,2\}$ 
    \begin{eqnarray*}
      a_{01} & = &  4 \lambda \quad \text{(all free) }\\
      a_{12} & = & \frac{2}{3}\lambda\quad \text{(each of the 2 free subscribers calls at rate $\lambda$ and connects w.p. $\frac{1}{3}$)}\\
       a_{10} & = & \mu\\
      a_{21} & = & 2\mu \quad \text{(either
    call may end) } \\
\end{eqnarray*}
\underline{Exercise}: The last equation  follows because the distribution of
      the random variable $Z=\min(X,Y)$ where $X,Y$ are independent
      exponential random variables with parameters $\lambda,\mu$ respectively is
      exponential with parameter $\lambda + \mu$. Prove this.
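A simulation gives a sanity check (not a substitute for the proof asked for in the exercise): for independent $X \sim \text{Exp}(\lambda)$, $Y \sim \text{Exp}(\mu)$, the minimum should have mean $1/(\lambda+\mu)$ and tail $P(Z>z)=e^{-(\lambda+\mu)z}$. An illustrative sketch only:

```python
import random, math

# Illustrative check: Z = min(X, Y) with X ~ Exp(lam), Y ~ Exp(mu)
# independent should be Exp(lam + mu), so E[Z] = 1/(lam + mu) and
# P(Z > z) = exp(-(lam + mu) * z).

rng = random.Random(3)
lam, mu = 1.5, 2.5
z = [min(rng.expovariate(lam), rng.expovariate(mu))
     for _ in range(100_000)]
mean_z = sum(z) / len(z)                       # expect 1/(1.5+2.5) = 0.25
tail = sum(1 for v in z if v > 0.5) / len(z)   # expect exp(-4 * 0.5)
```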





\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.5cm]{pictures/telephone.eps}}
  \end{center}
  \caption{ Telephone exchange }
  \label{telephone}
\end{figure}


\end{enumerate}


     %
     %


\subsubsection{The Generator Matrix}
$$\sum_{j \in \Omega, j\neq i} q_{ij} = 1 \implies \mu_i = \sum_{j \in \Omega, j\neq i} a_{ij}$$
(hence {\em instantaneous transition rate out of $i$}). Let $a_{ii} =
- \mu_i$ (undefined so far)
$$
(a_{ij}) = A = \begin{pmatrix}
  a_{00} & a_{01} & \dots &&\\
  a_{10} & a_{11} & \dots &&\\
  \vdots & & &&\\
  a_{i0} & a_{i1} & \dots & a_{ii} & \dots \\
  \vdots
  \end{pmatrix}
$$
$A$ is the generator matrix of the Markov Process. Rows of $A$ sum to
zero ($\sum_{j \in S} a_{ij} = 0$)

\subsubsection{Transition probabilities \& rates}
\begin{itemize}
\item determined by $A$
  \begin{eqnarray*}
    q_{ij} & = & \frac{a_{ij}}{\mu_i} = - \frac{a_{ij}}{a_{ii}} \\
  \mu_i & = & - a_{ii}
  \end{eqnarray*}
\item hence $A$ is all we need to determine the Markov process
\item Markov processes are instances of state transition systems with
  the following properties
  \begin{itemize}
  \item The state holding times are {\em exponentially} distributed
  \item The probability of transiting from state $i$ to state $j$
    depends only on $i$ and $j$
  \end{itemize}
\item In order for the analysis to work we require that the process is
  {\bf irreducible} -- every state must be reachable from every other,
  e.g.


 

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=4.5cm]{pictures/irreducible-mp.eps}}
  \end{center}
  \caption{ (Ir)reducible Markov chains}
  \label{irreducible}
\end{figure}


  \item Associated with each arc in a state transition diagram is a
    {\em rate}, e.g.:


 
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/transition.eps}}
  \end{center}
  \caption{ Transition rates in  Markov chains}
  \label{transition}
\end{figure}



  \item The state holding time depends on the departure rates
    ($1/(r_{i,a} + r_{i,b} + r_{i,c})$ is the mean holding time for state $i$ in the example).
  \item The {\em probability flux} from a state $i$ to a state $j$ is the
    average number of transitions from $i \to j$ per unit time at
    equilibrium:
$$\text{Probability flux}(i\to j) = r_{i,j} p_i$$
\item At equilibrium (if it exists) the total flux into each state
  must balance the total flux out of it
\item For example, for state $i$ alone in the diagram: 
$$(r_{i,a} + r_{i,b} + r_{i,c})p_i = r_{j,i} p_j + r_{k,i}p_k$$
\item Treating all states the same way leads to a set of {\em linear}
  equations:$$\sum_{j \neq i} a_{ij}p_i = -a_{ii}p_i = \sum_{j\neq i}a_{ji} p_j
\text{ for all } i\in S$$ 
\item Called the {\em balance equations} and subject to the {\em
    normalising equation} $$\sum_{i\in S} p_i =1$$
\item The {\bf steady-state theorem} tells us that if these equations
  have a solution then there is a steady-state and that the solution
  is unique
\item {\bf Important generalisation}: At equilibrium the fluxes also
  balance into and out of {\em every closed contour} drawn around {\em
    any} collection of states
\item This often provides a short-cut to the solution -- see the M/M/1
  queue later.
\end{itemize}
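The balance and normalising equations can be solved mechanically for small chains. A sketch in plain Python (the rates $\lambda = \mu = 1$ are arbitrary) builds the generator matrix of the three-state telephone-exchange example and solves $\sum_i p_i a_{ij} = 0$, $\sum_i p_i = 1$:

```python
# Generator matrix A for the telephone exchange (states 0, 1, 2 calls).
lam, mu = 1.0, 1.0
A = [
    [-4 * lam,  4 * lam,            0.0        ],
    [ mu,      -(mu + 2 * lam / 3), 2 * lam / 3],
    [ 0.0,      2 * mu,            -2 * mu     ],
]
n = len(A)
assert all(abs(sum(row)) < 1e-12 for row in A)  # rows of A sum to zero

# Balance equations run over columns of A, so transpose; replace the
# last (redundant) equation by the normalising equation.
M = [[A[i][j] for i in range(n)] for j in range(n)]
M[-1] = [1.0] * n
b = [0.0] * (n - 1) + [1.0]

# Plain Gaussian elimination with partial pivoting.
for col in range(n):
    piv = max(range(col, n), key=lambda r: abs(M[r][col]))
    M[col], M[piv] = M[piv], M[col]
    b[col], b[piv] = b[piv], b[col]
    for r in range(col + 1, n):
        f = M[r][col] / M[col][col]
        for c in range(col, n):
            M[r][c] -= f * M[col][c]
        b[r] -= f * b[col]
p = [0.0] * n
for r in reversed(range(n)):
    p[r] = (b[r] - sum(M[r][c] * p[c] for c in range(r + 1, n))) / M[r][r]

print(p)  # expect [3/19, 12/19, 4/19] when lambda = mu
```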




     %
     %



\subsection{Steady state results}
\underline{Theorem} \\
 (a) if a Markov process is transient or recurrent
null, $p_j=0 \quad \forall j \in S$ and we say a SSPD does not exist \\
(b) If a Markov Process is positive recurrent, the limits $p_j$ exist,
$p_j>0, \sum_{j\in S}p_j =1$ and we say $\{ p_j| j \in S\}$ constitute
the SSPD of the Markov Process. 

\subsubsection{Steady State Theorem for Markov Processes}
An irreducible Markov Process $X$ with state space $S$ and generator
matrix $A=(a_{ij})$ ($i,j \in S$) is positive recurrent if and only if
\begin{eqnarray*}
  \sum_{i\in S} p_i a_{ij} & = &  0 \quad \text{ for } j \in S  \text{ [ Balance
  equations]} \\
\sum_{i\in S} p_i & =&   1  \quad \text{ [Normalising equation]}
\end{eqnarray*}
have a solution. This solution is unique and is the SSPD.

\subsubsection{Justification of the Balance Equation}
The rate going from state $i$ to $j(\neq i)$ is $a_{ij}$, the fraction of time
spent in $i$ is $p_i$ 
\begin{eqnarray*}
  \sum_{i \neq j} a_{ij} p_i & = & \sum_{i \neq j} a_{ji} p_j \\
\text{Avg. no. of transitions}&& \\
\text{ $i \to j$ in unit time} &  & \text{Avg. no. of transitions} \\
&&\text{ $j \to i$ in unit time}\\
\sum_{i\neq j} \text{flux} (i \to j) & = & \sum_{i\neq j} \text{flux} (j \to i) \quad
\forall j
\end{eqnarray*}



     %
     %



\subsubsection{Buffer example (DTMC)}
\underline{$1 \leq i \leq M-1$} \\
\begin{eqnarray*}
  \text{(Net ``rate'' in)} & = & \text{(Net ``rate'' out)} \\
   ( \alpha (1-\beta)+ \beta ) p_i & = & \alpha (1-\beta ) p_{i-1} 
\end{eqnarray*}
$$\{\, p_i = \alpha (1-\beta) p_{i-1} + p_i (1-\alpha)(1-\beta) \text{ from S.S. Theorem}\}$$
Therefore $$p_i = k p_{i-1} = k^ip_0 $$ where $k = \frac{\alpha
  (1-\beta )}{\alpha + \beta - \alpha \beta}$ ($1 \leq i \leq M-1$)

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/io-buffer.eps}}
  \end{center}
  \caption{ I/O buffer. }
  \label{no-overtaking}
\end{figure}

\underline{$i = M$} \\
$$\beta p_M = {\alpha(1-\beta)} p_{M-1}$$ therefore $$p_M =
\frac{\alpha(1-\beta)}{\beta} k^{M-1}p_0$$

\underline{$i = 0$} \\
Redundant equation (why?) - use as check
\begin{eqnarray*}
  \alpha(1-\beta)p_0 & = & \sum_{i=1}^M \beta p_i = \beta
  \sum_{i=1}^{M-1} k^i p_0 + k^{M-1} \alpha (1-\beta) p_0 \\
  & = & p_0 \Big(\beta k \frac{1-k^{M-1}}{1-k} + \alpha (1-\beta)k^{M-1}\Big)
\end{eqnarray*}

rest is exercise
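The remaining steps can also be checked numerically; the sketch below ($\alpha$, $\beta$, $M$ arbitrary) normalises the solution and confirms that it satisfies the cut equations:

```python
# Buffer solution: p_i = k^i p_0 (1 <= i <= M-1),
# p_M = alpha(1-beta)/beta * k^(M-1) p_0, then normalise.
alpha, beta, M = 0.3, 0.2, 6
k = alpha * (1 - beta) / (alpha + beta - alpha * beta)

unnorm = [k ** i for i in range(M)]                       # p_0..p_{M-1} / p_0
unnorm.append(alpha * (1 - beta) / beta * k ** (M - 1))   # p_M / p_0
total = sum(unnorm)
p = [u / total for u in unnorm]

for i in range(1, M):   # cut equations for 1 <= i <= M-1
    assert abs((alpha * (1 - beta) + beta) * p[i]
               - alpha * (1 - beta) * p[i - 1]) < 1e-12
assert abs(beta * p[M] - alpha * (1 - beta) * p[M - 1]) < 1e-12  # i = M
print(p)
```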



     %
     %




\subsubsection{Examples (CTMC)}
\begin{enumerate}
\item I/O buffer: \\
  By considering the flux into and out of each state $0,1,\dots,M$
  we obtain the balance equations:
  \begin{eqnarray*}
    \lambda p_0 & = & \mu (p_1 + \dots + p_M) \text{ (State 0)} \\
    (\lambda + \mu) p_i & = & \lambda  p_{i-1}\quad (1\leq i \leq M-1)
    \text{ State $i$} \\
    \mu p_M & = & \lambda p_{M-1} \text{ State $M$}
  \end{eqnarray*}
Normalising equation:
\begin{eqnarray*}
  p_0 & + &  \dots + p_M = 1 \\
\implies p_j & = & \Big(\frac{\lambda}{\lambda + \mu} \Big)^j \frac{\mu}{\lambda +
  \mu} \quad  (0\leq j \leq M-1) \\
p_M & = & \Big(\frac{\lambda}{\lambda + \mu} \Big)^M
\end{eqnarray*}
Thus (for example) mean number of records in the buffer in the steady
state $ = M \alpha^M + \sum_{j=0}^{M-1}j \alpha^j
\frac{\mu}{\lambda+\mu}$ where $\alpha = \frac{\lambda}{\lambda + \mu}$
\item \underline{Telephone network} \\
  \begin{eqnarray*}
    4 \lambda p_0 & = & \mu p_1 \text{ (State 0)} \\
    \Big(\mu + \frac{2}{3}\lambda\Big) p_1 & = & 4 \lambda p_0 + 2 \mu p_2
    \text{ (State 1)} \\ 
    2 \mu p_2 & = & \frac{2}{3}\lambda p_1 \text{ (State 2)}
  \end{eqnarray*}
Thus $p_1 = \frac{4\lambda}{\mu} p_0$, $p_2  =
\frac{\lambda}{3\mu}p_1 = \frac{4\lambda^2}{3\mu^2} p_0$ 
with $p_0 + p_1
+ p_2 = 1 \implies p_0 = \Big(1 + \frac{4\lambda}{\mu} +
\frac{4\lambda^2}{3\mu^2}\Big)^{-1}$ 
Average number of calls in progress in the steady state $ = 1.p_1 +
2.p_2 = \frac{\frac{4\lambda}{\mu} \big(1+ \frac{2\lambda}{3\mu}\big) }{1 + \frac{4\lambda}{\mu} +
\frac{4\lambda^2}{3\mu^2}}$ 
\end{enumerate}
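Both solutions are easy to verify numerically. For instance, for the CTMC I/O buffer ($\lambda$, $\mu$, $M$ arbitrary):

```python
# Check p_j = (lam/(lam+mu))^j mu/(lam+mu) for j < M, p_M = (lam/(lam+mu))^M
# against the flux balance equations of the CTMC I/O buffer.
lam, mu, M = 1.5, 2.0, 5
a = lam / (lam + mu)
p = [a ** j * mu / (lam + mu) for j in range(M)] + [a ** M]

assert abs(sum(p) - 1) < 1e-12                                # normalisation
assert abs(lam * p[0] - mu * sum(p[1:])) < 1e-12              # state 0
for i in range(1, M):
    assert abs((lam + mu) * p[i] - lam * p[i - 1]) < 1e-12    # state i
assert abs(mu * p[M] - lam * p[M - 1]) < 1e-12                # state M

mean = M * a ** M + sum(j * a ** j * mu / (lam + mu) for j in range(M))
print(p, mean)  # steady-state probabilities and mean number of records
```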



     %
     %



\subsubsection{Birth-Death processes and the single server queue
  (SSQ)}
A Markov process with state space $S= \{0,1,\dots\}$ is called a
  birth-death process if the only non-zero transition probabilities
  are $q_{i,i+1}$ and $q_{i+1,i}$ ($i\geq 0$), representing births and
  deaths respectively. (In a population model, $q_{00}$ would be $1$
since $0$ would be an absorbing state.) The SSQ model consists of 
\begin{itemize}
\item a Poisson arrival process, rate $\lambda$
\item a queue which arriving tasks join
\item a server which processes tasks in the queue in FIFO (or other)
  order and has exponentially distributed service times, parameter
  $\mu$ (i.e. given a queue length $>0$, service completions form a
  Poisson process, rate $\mu$)

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.5cm]{pictures/ssq.eps}}
  \end{center}
  \caption{ Single server queue. }
  \label{ssq}
\end{figure}

\item The state is the queue length (including the task being served
  if any), i.e. the state space is $\{0,1,\dots\}$
\item SSQ model is a birth-death process
\item $\lambda, \mu$ are in general functions of the queue length
  (i.e. state dependent) and we write $\lambda(n), \mu(n)$ for state $n$.
\end{itemize}




     %
     %



\subsubsection{The memoryless property in the M/M/1 queue}
\underline{Notation:} The SSQ with Poisson arrivals and exponential
service times is called an M/M/1 queue 
\begin{itemize}
\item the first M describes the
arrival process as a Markov process (Poisson)
\item the second M describes
the service time distribution as a Markov process (exponential)
\item the 1
refers to a single server ($m$ parallel servers would be denoted
M/M/m)
\item Later we will consider M/G/1 queues, where the service time
distribution is non-Markovian (``general'')
\end{itemize}

SSQ therefore follows a Markov process and has the memoryless
property that:
\begin{enumerate}
\item Probability of an arrival in $(t,t+h) = \lambda(i)h+o(h)$ in
  state $i$
\item Probability of a service completion in $(t,t+h) = \mu(i)h+o(h)$
  in state $i>0$ ($0$ if $i=0$)
\item Probability of more than 1 arrival, more than one service
  completion or 1 arrival and 1 service completion in $(t,t+h)=o(h)$.
\item From these properties we could derive a differential equation
  for the \underline{transient} queue length probabilities -- compare
  Poisson process.
\end{enumerate}




     %
     %



\subsubsection{State transition diagram for the SSQ}

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/ssq-state.eps}}
  \end{center}
  \caption{ Single server queue state diagram. }
  \label{ssq-state}
\end{figure}

\begin{itemize}
\item Consider the balance equation for states inside the red (thick)
  contour. \\
  Outward flux (all from state $i$): $p_i\lambda(i)
  \quad (i \geq 0)$; \\
  Inward  flux (all from state $i+1$):
  $p_{i+1} \mu(i+1)\quad (i \geq 0) $. 
  Thus, 
  \begin{eqnarray*}
    p_i\lambda(i) & = & p_{i+1}\mu(i+1) \\
    \text{so } p_{i+1} & = & \frac{\lambda(i)}{\mu(i+1)} p_i =
    \Big [\prod_{j=0}^i \rho(j)\Big ]p_0
  \end{eqnarray*}
where $\rho(j) = \frac{\lambda(j)}{\mu(j+1)}$
\item Normalising equation implies
  \begin{eqnarray*}
    & p_0 & \Big(1+ \sum_{i=0}^\infty \prod_{j=0}^i \rho(j)\Big) = 1 \\
\implies & p_0 & = \Big [ \sum_{i=0}^\infty \prod_{j=0}^{i-1}
\rho(j)\Big ]^{-1} 
  \end{eqnarray*}
where $\prod_{j=0}^{-1} \rho(j) = 1$ (empty product).
Therefore
$$
p_i = \frac{\prod_{j=0}^{i-1} \rho(j)}{\sum_{k=0}^\infty
  \prod_{n=0}^{k-1} \rho(n)} \quad (i \geq 0).
$$
\item So, is there  always a steady state?
\item \underline{SSQ with constant arrival and service rates}
  $$
  \lambda(n) = \lambda, \quad \mu(n) = \mu, \quad \rho(n) = \rho =
  \lambda /\mu \quad \forall n \in S
  $$
implies
\begin{eqnarray*}
  p_0 & = & \Big [ \sum_{i=0}^\infty \rho^i\Big ]^{-1} = 1-\rho\\
  p_i & = & (1-\rho)\rho^i \quad (i\geq 0)
\end{eqnarray*}
\item \underline{Mean queue length, $L$ (including any  task in
    service)}
  \begin{eqnarray*}
    L & = & \sum_{i=0}^\infty i p_i =  \sum_{i=0}^\infty (1-\rho) i
    \rho^i\\
    & = & \rho(1-\rho) \frac{d}{d\rho} \Big \{  \sum_{i=0}^\infty \rho^i\Big \}
    = \rho(1-\rho) \frac{d}{d\rho} \Big \{ (1-\rho)^{-1}\Big \} \\
    & = & \frac{\rho}{1-\rho}
  \end{eqnarray*}
\item \underline{Utilisation of server}
  \begin{eqnarray*}
    U & = & 1 -P(\text{server idle}) = 1 - p_0 = 1-(1-\rho) = \rho \\
    & = &     \lambda / \mu.
  \end{eqnarray*}
However, we could have deduced this without solving for $p_0$: In the
steady state (assuming it exists),
$$
\lambda = \text{arrival rate}  = \text{throughput} = P(\text{server
  busy}). \text{service rate} = U \mu
$$
Argument applies for any system in equilibrium - didn't need MP - see
M/G/1 queue later.
\end{itemize}
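A numerical check of the constant-rate results ($\lambda = 2$, $\mu = 5$ arbitrary; the infinite sums are truncated):

```python
# M/M/1: geometric solution, mean queue length and utilisation.
lam, mu = 2.0, 5.0
rho = lam / mu
N = 200                                   # truncation; rho^N is negligible
p = [(1 - rho) * rho ** i for i in range(N)]

assert abs(sum(p) - 1) < 1e-9             # normalisation
for i in range(N - 1):
    # flux balance across a cut between states i and i+1
    assert abs(lam * p[i] - mu * p[i + 1]) < 1e-12

L = sum(i * pi for i, pi in enumerate(p))
U = 1 - p[0]
print(L, rho / (1 - rho), U)              # L = rho/(1-rho), U = rho
```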



     %
     %


\subsubsection{Response times}
To analyse response times, need to consider the state of the queue at
the time of arrival of a task. We use the \underline{Random Observer
  Property} of the Poisson process: \\ The state of a system at
equilibrium seen by an
arrival of a Poisson process has the same distribution as that seen by
an observer at a random instant, i.e. if the state at time $t$ is
denoted by $S_t$, 
$$
P(S_{t_0^-} = i \; | \;    \text{arrival at $t_0$}) = P(S_{t_0^-} = i)
$$
\underline{Mean response time, $W$} \\
If the queue length seen by an arriving task is $j$, 
\begin{eqnarray*}
  \text{response time}  = && \text{residual service time of task in
  service (if $j > 0$)} \\
  & + & j \text{ i.i.d. service times}
\end{eqnarray*}

But for exponential service times, residual service time has the same
distribution as full service time
\begin{eqnarray*}
  \implies \text{Response time} & = & \text{sum of $(j+1)$ service
  times} \\
  \implies \text{Mean response time, } W & = & \sum p_j (j+1) \mu^{-1}  \\
   & = & \Big (1 + \frac{\rho}{1-\rho}\Big) \mu^{-1} = \frac{1}{\mu -\lambda}
\end{eqnarray*}
\underline{Mean queueing time, $W_Q$ (excluding service)}
$$
W_Q = W - \mu^{-1} = L\mu^{-1} = \frac{\rho}{\mu - \lambda}.
$$
\subsubsection{Distribution of the waiting time, $F_W(x)$}

By the random observer property (and memoryless property) 

$$
F_W(x) = \sum_{j=0}^\infty p_j E_{j+1}(x)
$$

where $E_{j+1}(x)$ is the convolution of $(j+1)$ exponential
distributions, each with parameter $\mu$ - called the Erlang-$(j+1)$
distribution with parameter $\mu$. Similarly, for density functions:
$$
f_W(x) = \sum_{j=0}^\infty p_j e_{j+1} (x)
$$
where $e_{j+1} (x)$ is the pdf corresponding to $E_{j+1}(x)$,
i.e. $ \frac{d}{dx}E_{j+1}(x) =  e_{j+1} (x)$, defined by
\begin{eqnarray*}
  e_1(x) & = & \mu e^{-\mu x} \\
  e_{j+1}(x) & = & \mu \int_0^x e^{-\mu(x-u)} e_j(u) du  \\
  && \text{(convolution of Erlang-$j$ and exponential)} \\
  \implies e_j(x) & = & \mu \frac{(\mu x)^{j-1}}{(j-1)!} e^{-\mu x}
  \quad \text{ (Exercise)} \\
  \implies f_W(x) & = & (\mu -\lambda) e^{-(\mu - \lambda)x} \quad
  \text{ (Exercise)}
\end{eqnarray*}
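The final exercise can be checked numerically: the geometric mixture of Erlang densities should collapse to an exponential density with parameter $\mu - \lambda$ ($\lambda$, $\mu$ arbitrary with $\lambda < \mu$):

```python
import math

lam, mu = 1.0, 3.0
rho = lam / mu

def f_W(x, terms=80):
    # sum_j p_j e_{j+1}(x), where p_j e_{j+1}(x)
    #   = (1-rho) rho^j * mu (mu x)^j e^{-mu x} / j!
    term = (1 - rho) * mu * math.exp(-mu * x)   # j = 0 term
    total = 0.0
    for j in range(terms):
        total += term
        term *= lam * x / (j + 1)               # ratio of successive terms
    return total

for x in [0.1, 0.5, 1.0, 2.0]:
    assert abs(f_W(x) - (mu - lam) * math.exp(-(mu - lam) * x)) < 1e-9
print("Erlang mixture matches (mu - lam) exp(-(mu - lam) x)")
```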


     %
     %



\underline{Example} \\
Given $m$ Poisson streams, each with rate $\lambda$ and independent of
the others, into a SSQ, service rate $\mu$, what is the maximum value
of $m$ for which, in steady state, at least 95\% of waiting times,
$T\leq w$? (Relevant in ``end-to-end'' message delays in communication
networks, for example) 
\begin{eqnarray*}
  P(T \leq w) \geq 0.95 \\
  \text{i.e. } 1 - e^{-(\mu -m\lambda)w} \geq 0.95 \\
  \text{ or } e^{-(\mu -m\lambda)w} \leq 0.05 \\
  \implies e^{(\mu -m\lambda)w} \geq 20 \\
  \mu - m \lambda \geq \frac{\ln{20}}{w} \\
  \implies m \leq \frac{\mu}{\lambda} - \frac{\ln{20}}{w\lambda}
\end{eqnarray*}
$m < \mu / \lambda$ is equivalent to the existence of a steady state.
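With illustrative numbers (assumed for this sketch: $\lambda = 0.5$, $\mu = 10$, $w = 1$) the bound can be evaluated directly:

```python
import math

lam, mu, w = 0.5, 10.0, 1.0
m_max = math.floor(mu / lam - math.log(20) / (w * lam))

assert m_max * lam < mu                                     # steady state
assert 1 - math.exp(-(mu - m_max * lam) * w) >= 0.95        # meets target
assert 1 - math.exp(-(mu - (m_max + 1) * lam) * w) < 0.95   # next m fails
print(m_max)
```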




     %
     %




\subsection{Multiple parallel servers - M/M/m queue}

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/MMm.eps}}
  \end{center}
  \caption{ Multiple parallel servers. }
  \label{mmm}
\end{figure}

SSQ representation:
\begin{eqnarray*}
  \lambda(n) & = & \lambda \quad (n\geq 0) \\
  \mu(n) & = & \begin{cases} n \mu \quad (1 \leq n \leq m) \\
    m \mu \quad (n \geq m) \end{cases}
\end{eqnarray*}
Front of queue served by any available server. \\ \underline{Loading},
$\rho= \lambda/\mu$ (average amount of service required in unit
time). \\ By general result for M/M/1 queue:
$$
p_j = p_0 \prod_{i=0}^{j-1} \frac{\lambda(i)}{\mu(i+1)} = 
\begin{cases}
p_0 \frac{\rho^j}{j!} \quad (0\leq j \leq m) \\
p_0 \frac{\rho^j}{m!m^{j-m}} \quad (j \geq m) \\
\end{cases}
$$

This implies
\begin{eqnarray*}
  p_0 & = & \frac{1}{\sum_{i=0}^{m-1} \frac{\rho^i}{i!} +
  \frac{\rho^m}{m!} \sum_{i=m}^\infty (\frac{\rho}{m})^{i-m}} \\
  & = & \frac{1}{\sum_{i=0}^{m-1}\frac{\rho^i}{i!} +
  \frac{\rho^m}{(m-1)!(m-\rho)}} 
\end{eqnarray*}

\underline{Average number of busy servers, $S$}
$$
S = \sum_{k=1}^{m-1} k p_k + m \sum_{k=m}^\infty p_k = \dots = \rho
$$

\underline{Steady state argument}
\begin{eqnarray*}
  \text{arrival rate} & = & \lambda \\
  \text{average throughput} & = & S\mu \quad \text{($\mu$ per active
  server)}\\
  \implies S\mu & = & \lambda \quad \text{in equilibrium}.
\end{eqnarray*}

\underline{Utilisation} $U=S/m = \rho/m$, the average fraction of busy
servers. 
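Numerically ($\lambda = 3$, $\mu = 1$, $m = 5$ arbitrary, so $\rho = 3 < m$), $p_0$ and the busy-server average can be checked against $S = \rho$:

```python
import math

lam, mu, m = 3.0, 1.0, 5
rho = lam / mu
p0 = 1 / (sum(rho ** i / math.factorial(i) for i in range(m))
          + rho ** m / (math.factorial(m - 1) * (m - rho)))

def p(j):
    if j <= m:
        return p0 * rho ** j / math.factorial(j)
    return p0 * rho ** j / (math.factorial(m) * m ** (j - m))

N = 400  # truncation of the infinite sums (rho/m = 0.6, tail vanishes)
assert abs(sum(p(j) for j in range(N)) - 1) < 1e-9

S = sum(min(k, m) * p(k) for k in range(N))  # average busy servers
print(p0, S)  # S should come out as rho = 3
```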


     %
     %





\subsubsection{Waiting times}
Waiting time is the same as service time if an arrival does not have
to queue. Otherwise, the \underline{departure} process is Poisson,
rate $m\mu$, whilst the arrived task is queueing (all servers
busy). This implies that the queueing time is the same as the queueing
time in the M/M/1 queue with service rate $m\mu$.
\begin{itemize}
\item Let 
  \begin{eqnarray*}
    q & = & P(\text{arrival has to queue}) \\
    & = & P(\text{find} \geq m \text{ jobs on arrival})
  \end{eqnarray*}
\item by R.O.P.
$$
q = \sum_{j=m}^\infty p_j = p_0 \frac{\rho^m}{(m-1)!(m-\rho)} \quad
\text{Erlang delay formula}
$$
\end{itemize}

     %
     %

\begin{itemize}
\item Let $Q$ be the queueing time random variable (excluding service)
  \begin{eqnarray*}
    F_Q(x) & = & P(Q\leq x) = P(Q=0) + P(Q\leq x | Q>0) P(Q>0) \\
    & = & (1-q) + q (1-e^{-(m\mu-\lambda)x}) \\
    & = & 1 - q e^{-(m\mu-\lambda)x}
  \end{eqnarray*}
$P(Q\leq x | Q>0)$ is given by the SSQ, rate $m\mu$, result for
waiting time. (Why is this?)\\ Note that
$F_Q(x)$ has a jump at the origin, $F_Q(0) \neq 0$.
\item \underline{Mean queueing time} 
$$
W_Q = \frac{q}{m\mu-\lambda}
$$
\item \underline{Mean waiting time}
$$
W = W_Q + 1/\mu = \frac{(q+m)\mu - \lambda}{\mu(m\mu-\lambda)}
$$
\item \underline{Exercise}: What is the  waiting time
  \underline{density function}?
\end{itemize}
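The queueing-time quantities can be evaluated directly (same arbitrary parameters, $\lambda = 3$, $\mu = 1$, $m = 5$):

```python
import math

lam, mu, m = 3.0, 1.0, 5
rho = lam / mu
p0 = 1 / (sum(rho ** i / math.factorial(i) for i in range(m))
          + rho ** m / (math.factorial(m - 1) * (m - rho)))
q = p0 * rho ** m / (math.factorial(m - 1) * (m - rho))  # Erlang delay formula

W_Q = q / (m * mu - lam)          # mean queueing time
W = W_Q + 1 / mu                  # mean waiting time
# agrees with the closed form in the notes
assert abs(W - ((q + m) * mu - lam) / (mu * (m * mu - lam))) < 1e-12

def F_Q(x):
    # queueing-time distribution, with an atom of size 1 - q at the origin
    return 1 - q * math.exp(-(m * mu - lam) * x)

print(q, W_Q, W, F_Q(0.0))
```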


     %
     %





\subsubsection{The infinite server}
\begin{itemize}
\item In the M/M/m queue, let $m \to\infty$
  \begin{itemize}
  \item ``unlimited service capacity''
  \item always sufficient free servers to process an arriving task -
    e.g. when the number of tasks in the system is finite
  \item no queueing $\implies$ infinite servers model delays in a
    task's processing
    $$
    p_j = \frac{\rho^j}{j!} p_0  \implies p_0 = e^{-\rho}
    $$
  \item balance equations have a solution for all $\lambda,\mu$
  \item this is not surprising since the server can never be overloaded
  \end{itemize}
\item This result is independent of the exponential assumption - a
  property of the queueing discipline (here there is no queueing - a
  pathological case)
\end{itemize}
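Sanity check: the limiting distribution is Poisson with mean $\rho$, and the cut equations $\lambda p_j = (j+1)\mu p_{j+1}$ hold for any $\lambda$, $\mu$ (the values below are arbitrary):

```python
import math

lam, mu = 4.0, 2.0
rho = lam / mu
p = [math.exp(-rho) * rho ** j / math.factorial(j) for j in range(150)]

assert abs(sum(p) - 1) < 1e-9        # a solution exists for all lam, mu
for j in range(100):
    # cut between j and j+1: lam p_j = (j+1) mu p_{j+1}
    assert abs(lam * p[j] - (j + 1) * mu * p[j + 1]) < 1e-12
print(p[:4])
```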



     %
     %





\subsection{M/M/m queues with finite state space}
M/M/1/k (k is the max. queue length) queue
\begin{eqnarray*}
  \mu(n) =\mu \quad \text{for all queue lengths $n$} \\
  \lambda(n) = 
  \begin{cases}
    \lambda \quad (0\leq n < k) \\
    0 \quad (n\geq k)
  \end{cases}
\end{eqnarray*}
Hence, with $\rho = \lambda / \mu$,
\begin{eqnarray*}
  p_j = \begin{cases}
    p_0 \rho^j \quad (j=0,1,\dots , k) \\
    \frac{\prod_{i=0}^{j-1} \lambda(i)}{\prod_{i=0}^{j-1} \mu(i+1)} =
    0 \quad (j>k) \quad \text{ as } \lambda(k) = 0 
    \end{cases}
\end{eqnarray*}

\underline{Telephone network with maximum capacity of $k$ calls} 
\begin{eqnarray*}
  \lambda(n) & = & \lambda \quad (0 \leq n < k), \quad \lambda(k) = 0
  \\
  \mu(n) & = & n\mu \quad \text{($n$ calls in progress)} \\
  & \implies &  \text{M/M/k/k queue}
\end{eqnarray*}

$$
p_j = p_0 \frac{\rho^j}{j!} \quad j = 0,1, \dots , k \quad (\rho=
\lambda / \mu)
$$

Probability that a call is lost:
$$
p_k = \frac{\rho^k/k!}{\sum_{j=0}^k \rho^j / j!} \quad \text{\underline{Erlang
  loss formula}}
$$

$$
\text{Throughput} = \lambda(1-p_k) \quad \Big [= \mu \sum_{j=1}^k j p_j \Big]
$$
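The Erlang loss formula and the two throughput expressions can be checked numerically (an offered load of $\rho = 5$ Erlangs on $k = 7$ lines is an arbitrary illustration):

```python
import math

lam, mu, k = 5.0, 1.0, 7
rho = lam / mu
norm = sum(rho ** j / math.factorial(j) for j in range(k + 1))
p = [rho ** j / math.factorial(j) / norm for j in range(k + 1)]

p_loss = p[k]                      # Erlang loss formula
throughput = lam * (1 - p_loss)
# equivalent expression: mean departure rate
alt = mu * sum(j * p[j] for j in range(1, k + 1))
print(p_loss, throughput, alt)
```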


     %
     %





\subsubsection{Terminal system with parallel processing}

$N$ users logged on to a computer system with $m$ parallel processors
\begin{itemize}
\item exponentially distributed think times, mean $1/\lambda$, before
  submitting next task
\item each processor has exponential service times, mean $1/\mu$

\begin{figure}[tbp]
  \hfill
  \begin{center}
%    \fbox{\includegraphics[height=3.5cm]{pictures/terminal.eps}}
    \fbox{\includegraphics[height=3.5cm]{diags/terminals.eps}}
  \end{center}
  \caption{ Terminals with parallel processors. }
  \label{terminals}
\end{figure}

\item tasks may use any free processor, or queue (FIFO) if there are
  none. 
\item state space $S= \{n| 0 \leq n \leq N\}$ where $n$ is the queue
  length (for all processors)
\item Poisson arrivals, rate $\lambda(n) = (N-n)\lambda$
\item Service rate $\mu(n) = \begin{cases} n\mu \quad (1\leq n \leq m)
    \\ m\mu \quad (m \leq n \leq N) \end{cases} $
\end{itemize}



     %
     %





%\subsubsection{State transition diagram and solution}
Steady state probabilities $\{p_i| 0\leq i \leq N\}$ 
\begin{eqnarray*}
  p_i & = & \frac{\prod_{j=0}^{i-1} \lambda(j)}{\prod_{j=1}^{i} \mu(j)} p_0 = \frac{N(N-1)
  \dots (N-i+1)}{i!} \rho^i p_0 \quad (0\leq i \leq m) \\
\implies p_i & = & 
\begin{cases}
\binom{N}{i} \rho^i p_0 \quad (0\leq i \leq m)\\
\frac{N!}{(N-i)! m! m^{i-m} } \rho^i p_0 \quad (m \leq i \leq N)
\end{cases} \\
p_0 & = & \Big \{ \sum_{i=0}^{m-1} \binom{N}{i} \rho^i + \sum_{i=m}^N
\frac{N!}{(N-i)! m! m^{i-m} } \rho^i \Big \}^{-1}
\end{eqnarray*}
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2cm]{pictures/std.eps}}
  \end{center}
  \caption{ State transition diagram. }
  \label{std}
\end{figure}
The throughput is either given by 
$$
\mu \Big\{ \sum_{j=1}^{m-1} j p_j + m \sum_{j=m}^N p_j \Big\}
$$
(mean departure rate from processors) or
$$
\lambda \Big\{ N - \sum_{j=1}^N j p_j \Big\}
$$
(mean arrival rate)

Other performance measures as previously
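The two throughput expressions should agree at equilibrium; a numerical check with arbitrary $N = 10$, $m = 3$, $\lambda = 0.5$, $\mu = 2$:

```python
import math

N, m = 10, 3
lam, mu = 0.5, 2.0
rho = lam / mu

def unnorm(i):
    # p_i / p_0 from the product-form solution in the notes
    if i <= m:
        return math.comb(N, i) * rho ** i
    return (math.factorial(N)
            / (math.factorial(N - i) * math.factorial(m) * m ** (i - m))
            ) * rho ** i

Z = sum(unnorm(i) for i in range(N + 1))
p = [unnorm(i) / Z for i in range(N + 1)]

# throughput two ways: processor departure rate vs terminal arrival rate
T_dep = mu * (sum(j * p[j] for j in range(1, m))
              + m * sum(p[j] for j in range(m, N + 1)))
T_arr = lam * (N - sum(j * p[j] for j in range(N + 1)))
print(T_dep, T_arr)  # should agree at equilibrium
```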





     %
     %




\subsubsection{The case of ``always sufficient processors'' - $m\geq
  N$}
\begin{itemize}
\item Here there is no case $m<i\leq N$
  \begin{eqnarray*}
    \implies p_i & = & \binom{N}{i} \rho^i p_0 \quad (0\leq i \leq N)
    \\
    \implies p_0 & = & \Big \{ \sum_{i=0}^N \binom{N}{i} \rho^i\Big \}^{-1} = (1
    + \rho)^{-N} \\
    \text{Thus } p_i & = & \binom{N}{i} \Big(\frac{\rho}{1+\rho}\Big)^i
    \Big(\frac{1}{1+\rho}\Big)^{N-i} \\
    && \text{(Binomial distribution)}
  \end{eqnarray*}
\item Probabilistic explanation
  \begin{itemize}
  \item $p_i$ is the probability of $i$ ``successes'' in $N$ Bernoulli
    (i.i.d.) trials
  \item probability of success = $\frac{\rho}{1+\rho}$ =
    $\frac{1/\mu}{1/\lambda+1/\mu}$ = fraction of time user is waiting
    for a task to complete in the steady state = probability
    (randomly) observe a user in wait-mode.
  \item probability of failure = fraction of time user is in think
    mode in the steady state = probability (randomly) observe a user
    in think-mode
  \item random observations are i.i.d. in steady state $\implies$
    trials are Bernoulli
  \item hence Binomial distribution
  \end{itemize}
\item result is independent of distribution of think times.
\end{itemize}
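A direct check that the closed form is the Binomial$(N, \frac{\rho}{1+\rho})$ distribution ($N$, $\lambda$, $\mu$ arbitrary):

```python
import math

N = 8
lam, mu = 1.0, 2.0
rho = lam / mu
s = rho / (1 + rho)   # probability a randomly observed user is in wait-mode

p = [math.comb(N, i) * rho ** i / (1 + rho) ** N for i in range(N + 1)]
binomial = [math.comb(N, i) * s ** i * (1 - s) ** (N - i)
            for i in range(N + 1)]
for a, b in zip(p, binomial):
    assert abs(a - b) < 1e-12
print(p)
```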

     %
     %



\subsubsection{Analogy with a queueing network}
Regard the model as a 2-server, cyclic queueing network



\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.0cm]{pictures/analogy1.eps}}
    $\implies$
     \fbox{\includegraphics[height=1.5cm]{pictures/analogy2.eps}}
  \end{center}
  \caption{ Original and equivalent network. }
  \label{analogy}
\end{figure}
 

\begin{itemize}
\item As already observed, IS server is insensitive to its service time
distribution as far as queue length distribution is concerned.
\item Multi-server becomes IS if $m\geq N$ $\implies$ 2 IS servers.
\end{itemize}





     %
     %




\subsection{Little's result/formula/law (J.D.C. Little, 1961)}
Suppose that a queueing system $Q$ is in  steady state (i.e. there
are fixed, time independent probabilities of observing $Q$ in each of
its possible states at random times.)
Let:
\begin{eqnarray*}
  L & = & \text{average number of tasks in $Q$ in steady state} \\
  W & = & \text{average time a task spends in $Q$ in steady state} \\
  \lambda & = & \text{arrival rate (i.e. average number of tasks
  entering or leaving $Q$} \\ && \text{in unit time)}
\end{eqnarray*}
Then
$$
L = \lambda W.
$$
\underline{Intuitive Justification} \\
Suppose a charge is made on a task of \pounds 1 per unit time it spends
in $Q$ 
\begin{itemize}
\item Total collected on average in unit time $ = L$
\item Average paid by one task $ = W$
\item If charges collected on arrival (or departure), average
  collected in unit time $ = \lambda . W$
$$
\implies L= \lambda W
$$
\item Example: M/M/1 queue 
$$
W = (L+1)/\mu \text{ by R.O.P. } $$ $$L = \lambda W \implies W=
\frac{1}{\mu-\lambda} 
$$
\end{itemize}



     %
     %




\subsubsection{Application of Little's law}

Server utilisation $U$.

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2cm]{pictures/little.eps}}
  \end{center}
  \caption{ Little's law. }
  \label{little}
\end{figure}

\begin{itemize}
\item Consider server, rate $\mu$ (constant), inside some queueing
  system: i.e. average time for one
  service $= 1/\mu$
\item Let total throughput of server = $\lambda$ = total average
  arrival rate (since in steady state)
\item Apply Little's result to the server only (without the queue) 
  \begin{eqnarray*}
    \text{mean queue length} & = & 0.P(\text{idle}) + 1.P(\text{busy})
    \\
    & = & U \\
    \text{mean service time} & = & \mu^{-1} \\
    \implies U & = & \frac{\lambda}{\mu}
  \end{eqnarray*}
\item Not the easiest derivation! This is a simple work conservation law.
\end{itemize}




     %
     %



\subsection{Single Server Queue with general service times - M/G/1 queue}


Assuming that arrivals are Poisson with constant rate $\lambda$,
service time distribution has constant mean $\mu^{-1}$ (service rate
$\mu$) and that a steady state exists
\begin{eqnarray*}
  \text{(1) Utilisation, } U & = & P(\text{queue length}> 0) \\
  & = & \frac{\mu^{-1}}{\lambda^{-1}} = \lambda / \mu =
  \text{``load''} \\
  &&\text{(average amount of work arriving in unit time,} \\
  &&\text{already know
  this, of course)} \\
 \end{eqnarray*}
\begin{eqnarray*}
  \text{(2) Mean queue length } & = & L \\
  \text{Mean number in queue} & = & L_Q  \\
  \text{Mean waiting time} & = & W \\
  \text{Mean queueing time} & = & W_Q \\
  \text{Little's law} \implies L  &= &\lambda W \\
  L_Q & = & \lambda W_Q \\
  \text{By definition, } W & = & W_Q + 1/\mu 
\end{eqnarray*}
3 equations, 4 unknowns

\subsubsection{The fourth equation}
\begin{itemize}
\item By the random observer property, queue faced on arrival has mean
  length $L_Q$ (excluding task in service, if any)
\item By ``residual life'' result for renewal processes, average service
  time remaining for task in service (if any) is $\frac{\mu M_2}{2}$
  where $M_2$ is the second moment of the service time distribution
  ($M_2 = \int_0^\infty x^2 f(x) dx$ where $f(x)$ is the pdf of service
  time) 
\item Thus, since $\rho = P(\exists \text{ a task being served at
    arrival instant})$ 
  \begin{eqnarray*}
    W_Q & = & L_Q . 1/\mu + \rho . \frac{\mu M_2}{2} 
 \end{eqnarray*}
\item Now $L_Q = \lambda W_Q$, and these four equations determine all
  of $L_Q, W_Q, L, W$; in particular
\begin{eqnarray*}
   L & =& \rho + \frac{\lambda^2 M_2}{2(1-\rho)}
  \end{eqnarray*}
\item Observe that if $\frac{\text{standard deviation}}{\text{mean}}$
  (\& hence the second moment) of the service time distribution is large,
  $L$ is also large (not trivial, as $M_2$ increases with $\mu^{-1}$ - but not
  difficult either!)
\end{itemize}
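The effect of $M_2$ on the P-K mean queue length is easy to explore numerically, e.g. exponential versus deterministic service with the same mean $1/\mu$ ($\lambda = 3$, $\mu = 4$ arbitrary):

```python
# P-K mean queue length L = rho + lam^2 M2 / (2 (1 - rho)).
lam, mu = 3.0, 4.0
rho = lam / mu

def pk_L(M2):
    return rho + lam ** 2 * M2 / (2 * (1 - rho))

L_exp = pk_L(2 / mu ** 2)   # exponential service: M2 = 2/mu^2
L_det = pk_L(1 / mu ** 2)   # deterministic service: M2 = 1/mu^2

assert abs(L_exp - rho / (1 - rho)) < 1e-12  # recovers the M/M/1 result
assert L_det < L_exp                         # lower variability, shorter queue
print(L_exp, L_det)
```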


     %
     %



\subsection{Embedded Markov Chain}
\begin{itemize}
\item State of the M/G/1 queue at time $t$ is $X(t) \in S$ where the
  state space $S  = \{n|n\geq0\}$ as in M/M/1 queue.
\item M/G/1 queue is {\em not} a Markov process
  \begin{itemize}
  \item $P(X(t+s)|X(u), u \leq t) \neq P(X(t+s)|X(t)) \; \forall t,s$
  \item e.g. if a service period does not begin at time $t$
  \item no memoryless property
  \end{itemize}
\item Consider times $t_1, t_2, \dots $ of successive departures and
  let $X_i = X(t_i^+)$ 
\item Given $X_i$, $X(t)$ for $t>t_i$ is independent of $X(t')$ for
  $t'<t_i$ since at time $t_i^+$
  \begin{itemize}
  \item time to next arrival is exponential with parameter $\lambda$ because arrival process is Poisson
  \item instantaneously, no task is in service, so time to next
    departure is a  {\it complete} service time or that plus the time to next arrival (if queue empty)
  \end{itemize}
$\implies \{X_i| i\geq 1\}$ is a Markov Chain with state space $S$
($\{t_i|i\geq 1\}$ are called ``Markov times'')
\item It can be shown that, in steady state of E.M.C., distribution of no. of jobs, $j$,
  ``left behind'' by a departure = distribution of no. of jobs, $j$,
  seen by an arrival = $\lim_{n \to \infty} P(X_n=j)$ by R.O.P. 
\item Here we solve  $p_j
  = \sum_{i=0}^\infty p_i q_{ij} \quad (j\geq 0)$ for appropriate
  one-step transition probabilities $q_{ij}$.

\end{itemize}



     %
     %



\subsection{Balance equations for the M/G/1 queue}

\begin{itemize}
\item Solution for $\{p_j\}$ exists iff $\rho = \frac{\lambda}{\mu}<1
  $, equivalent to $p_0 > 0$ since  $p_0 = 1-U = 1-\rho$
  in the steady state.
\item Valid one-step transitions are $i \to j$ for $j=i-1,i, \dots$
  since may have arbitrary no. of arrivals between successive
  departures. 
\item Let 
$$
r_k = P(k \text{ arrivals in a service period})
$$
then
\begin{eqnarray*}
  q_{ij} & = & P(X_{n+1}= j| X_n=i) \quad (n\geq 0) \\
  & = & r_{j-i+1} \quad (i \geq 1, j \geq i-1) \\
&&(  i \to i-1+(j-i+1)=j)\\
q_{0j} & = & r_j \quad (j\geq 0)
\end{eqnarray*}
[Eventually a job arrives, so $0\to 1$, and then $1\to j$ if there are
$j$ arrivals in its service time since then $1 \to 1 -1 + j = j$]
\begin{eqnarray*}
  \implies p_0 & = & 1 - \rho \\
  p_j & = & p_0 r_j + \sum_{i=1}^{j+1} p_i r_{j-i+1} \quad (j\geq 0)
\end{eqnarray*}
\end{itemize}
where $r_k = \int_0^\infty \frac{(\lambda x)^k}{k!} e^{-\lambda x} f(x)
dx$ if the service time density function is $f(x)$ (known). This is because $P(k
\text{ arrivals in service time}| \text{service time} = x) =
\frac{(\lambda x)^k}{k!} e^{-\lambda x}$ and $P(\text{service time} \in
(x,x+dx)) = f(x) dx$

\subsubsection{Solutions to the balance equations}
\begin{itemize}
\item In principle we could solve the balance equations by ``forward
  substitution'' 
  \begin{itemize}
  \item $p_0$ is known
  \item $j=0 \implies p_1$ \\ 
    \vdots
  \item $j=i: \quad p_0,p_1, \dots , p_i \implies p_{i+1}$
  \end{itemize}
but computationally impractical in general
\item Define generating functions
  \begin{eqnarray*}
    p(z) & = & \sum_{i=0}^\infty p_i z^i \\
    r(z) & = & \sum_{i=0}^\infty r_i z^i 
  \end{eqnarray*}
Recap:
\begin{eqnarray*}
  p'(z) & = & \sum_{i=1}^\infty i p_i z^{i-1} \\
  &&\implies p'(1) =
  \text{mean value of distribution } \{p_j|j\geq0\} \\
  p''(z) & = & \sum_{i=2}^\infty (i^2 -i) p_i z^{i-2} \implies p''(1)
  = M_2 -M_1
\end{eqnarray*}
\item Multiply balance equations by $z^j$ and sum:
  \begin{eqnarray*}
    p(z) & = & p_0 r(z) + \sum_{j=0}^\infty \sum_{i=1}^{j+1} p_i
    r_{j-i+1} z^j \\
    & = & p_0 r(z)+ z^{-1} \sum_{i=0}^\infty \sum_{j=0}^\infty p_{i+1}
    z^{i+1} r_{j-i} z^{j-i} \\
    && \text{where } r_k = 0 \text{ for } k < 0 \\
    & = & p_0r(z) + z^{-1} \sum_{i=0}^\infty p_{i+1}
    z^{i+1} \sum_{j=0}^\infty  r_{j-i} z^{j-i}\\
        & = & p_0r(z) + z^{-1} (p(z) - p_0) r(z) 
  \end{eqnarray*}
\end{itemize}
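Forward substitution is nonetheless a useful check on small examples. For exponential service, $r_k$ is geometric and the recursion should reproduce the M/M/1 distribution $(1-\rho)\rho^j$ (rates arbitrary; note that rounding error grows with $j$, one reason the method is impractical in general):

```python
lam, mu = 1.0, 2.0
rho = lam / mu
a = lam / (lam + mu)
r = [(1 - a) * a ** k for k in range(40)]   # P(k arrivals in a service)

p = [1 - rho]                                # p_0 = 1 - rho is known
for j in range(25):
    # balance: p_j = p_0 r_j + sum_{i=1}^{j+1} p_i r_{j-i+1};
    # solve for p_{j+1}
    s = p[0] * r[j] + sum(p[i] * r[j - i + 1] for i in range(1, j + 1))
    p.append((p[j] - s) / r[0])

for j, pj in enumerate(p):
    assert abs(pj - (1 - rho) * rho ** j) < 1e-8
print(p[:5])
```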



     %
     %



\subsubsection{Solution for $p(z)$ and the Pollaczek-Khinchine result}
\begin{eqnarray*}
  p(z) & = & \frac{p_0(1-z)r(z)}{r(z)-z} \\
  \text{where } r(z) & = & \int_0^\infty \sum_{k=0}^\infty
  \frac{(\lambda xz)^k}{k!} e^{-\lambda x} f(x) dx \\
  & = & \int_0^\infty e^{-\lambda x(1-z)} f(x) dx \\
  & = & f^*(\lambda - \lambda z) \\
  && \text{Laplace transform of $f$ at the point $\lambda-\lambda z$}
\end{eqnarray*}
Recap: Laplace transform $f^*$ of $f$ defined by 
\begin{eqnarray*}
  f^*(s) & = & \int_0^\infty e^{-sx} f(x) dx \\
  \text{so that } \frac{d^n}{ds^n} f^*(s) & = & \int_0^\infty (-x)^n
  e^{-sx} f(x) dx \\
  \implies \frac{d^nf^*(s)}{ds^n}\Big|_{s=0} & = & (-1)^nM_n \quad
  \text{$n^{th}$ moment of $f(x)$}
\end{eqnarray*}
E.g. $f^*(0) = 1$, $-f^{*\prime}(0) =  \text{mean service time} =
1/\mu$. Thus,
$$
p(z) = \frac{(1-\rho)(1-z)f^*(\lambda - \lambda z)}{f^*(\lambda -
  \lambda z) - z}
$$
$p'(1) \implies$ P-K formula \dots

\subsubsection{Derivation of P-K formula}
$$
\frac{p'}{1-\rho} = \frac{(f^*-z)(-\lambda(1-z)f^{*\prime} -f^*) +
  (1-z) f^* (1 + \lambda f^{*\prime})}{(f^* - z)^2}
$$
where $ f^* = f^* (\lambda - \lambda z) $ etc.
\begin{itemize}
\item When $z=1$, both numerator and denominator vanish (since
  $f^*(\lambda - \lambda . 1 ) = f^*(0) = 1$) 
\item L'Hopital rule $\implies$
$$
\frac{p'(1)}{1-\rho} = \lim_{z \to 1} \Big\{ \frac{\lambda (
  (1-2z)f^{*\prime} - \lambda z (1-z) f^{*\prime\prime}) - \lambda
  f^{*\prime} + 2\lambda f^*f^{*\prime}}{-2 (f^*-z)(1+\lambda f^{*\prime})}\Big\}
$$
\item Again, when $z=1$, both numerator and denominator vanish
\item L'Hopital rule now gives
  \begin{eqnarray*}
    \frac{p'(1)}{1-\rho} & = &\frac{\lambda (-2 f^{*\prime}(0) +
  \lambda f^{*\prime \prime}(0) - 2 \lambda
  f^{*\prime}(0)^2)}{2(1+\lambda f^{*\prime}(0))^2} \\
&& (\text{since } f^*(0) = 1) \\
& = & \frac{\lambda^2 M_2 + 2 (\lambda/\mu)(1-
  (\lambda/\mu))}{2(1-(\lambda /\mu))^2} \quad \text{P-K formula!} \\
&& (\text{since } f^{*\prime}(0) = -1/\mu \text{ and } f^{*\prime\prime}(0) = M_2)
  \end{eqnarray*}
\item Hard work compared to previous derivation! But in principle we
  could obtain {\em any} moment (``well known'' result for variance of
  queue length)
\end{itemize}
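As a numerical sanity check (a sketch, not part of the derivation above), the P-K mean queue length $L = \rho + \lambda^2 M_2/(2(1-\rho))$ can be evaluated for particular service time distributions; the arrival and service rates below are illustrative only. For exponential service, $M_2 = 2/\mu^2$ and the formula must reduce to the M/M/1 value $\rho/(1-\rho)$.

```python
def pk_mean_queue_length(lam, mu, M2):
    """P-K mean queue length: L = rho + lam^2 * M2 / (2 * (1 - rho))."""
    rho = lam / mu
    assert rho < 1, "queue must be stable"
    return rho + lam**2 * M2 / (2 * (1 - rho))

lam, mu = 3.0, 5.0                                # illustrative rates
# exponential service (M2 = 2/mu^2) reduces to the M/M/1 value rho/(1-rho)
print(pk_mean_queue_length(lam, mu, 2 / mu**2))   # 1.5 = 0.6/0.4
# deterministic service (M2 = 1/mu^2) halves the queueing term
print(pk_mean_queue_length(lam, mu, 1 / mu**2))   # 1.05
```

Deterministic service illustrates the well-known fact that an M/D/1 queue has half the M/M/1 queueing term.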


     %
     %



\subsubsection{Waiting time distribution}
\begin{itemize}
\item The tasks left in an M/G/1 queue on departure of a given task
  are precisely those which arrived during its waiting time
  \begin{eqnarray*}
    \implies p_j & = & \int_0^\infty \frac{(\lambda x)^j}{j!}
    e^{-\lambda x} h(x) dx \\
    &&\text{because } P(j\text{ arrivals}|\text{waiting time}=x) =  \frac{(\lambda x)^j}{j!}  e^{-\lambda x} \\
    && P(\text{waiting
    time} \in (x,x+dx)) =  h(x) dx \\
  \implies p(z) & = & h^*(\lambda - \lambda z)
  \end{eqnarray*}
by the same reasoning as before.
\item Laplace transform of waiting time distribution is therefore (let
  $z = \frac{\lambda - s}{\lambda}$)
$$
h^*(s) = p\Big(\frac{\lambda - s}{\lambda}\Big) =
\frac{(1-\rho)sf^*(s)}{\lambda f^*(s)-\lambda +s}
$$
\item Exercise: Verify Little's Law for the M/G/1 queue:
$$
p'(1) = -\lambda h^{*\prime}(0).
$$
\end{itemize}



     %
     %



\subsection{Example: Paging fixed head disk}
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.5cm]{pictures/paging-drum.eps}}
  \end{center}
  \caption{ Paging fixed head disk. }
  \label{paging-drum}
\end{figure}


Assumptions
\begin{itemize}
\item Tasks in the track queue require random sector
\item Arrivals to an empty queue always find the head at the beginning
  of a sector (as does the next task in the queue after a service
  completion)
  \begin{eqnarray*}
    \implies \text{service times may be } & \frac{1}{nR} & \text{
    (requires next sector)} \\
  & \frac{2}{nR} & \text{ (next but one)} \\
  & \vdots & \\
  & \frac{1}{R} & \text{ (the one just gone)}
  \end{eqnarray*}
(strictly, for arrivals to an empty queue, service times have the
continuous sample space $[\frac{1}{nR}, \frac{1}{nR} + \frac{1}{R})$)
\item Mean service time, $\mu^{-1} = \sum_{j=1}^n \frac{1}{n}
  \frac{j}{nR} = \frac{n+1}{2nR}$
\item Second moment, $M_2 = \sum_{j=1}^n\frac{1}{n} (\frac{j}{nR})^2 =
  \frac{(n+1)(2n+1)}{6n^2R^2}$ 
\end{itemize}

\subsubsection{Solution and asymptotic behaviour}
\begin{itemize}
\item Load is $\rho = \lambda/\mu = \frac{\lambda(n+1)}{2nR}$
$\implies \lambda < \frac{2nR}{n+1}$ if drum track is not to be saturated
\item Mean queue length, $L = \rho +
  \frac{\lambda^2(n+1)(2n+1)}{12n^2R^2(1-\rho)}$
\item As $n \to \infty$, i.e. many sectors on track
  \begin{itemize}
  \item assumption about arrivals to an empty queue becomes exact
    (large $n \implies$ good approximation)
  \item $\rho \to \frac{\lambda}{2R}$
  \item $L \to \frac{\lambda}{2R} + \frac{\lambda^2}{3R(2R-\lambda)}$
  \end{itemize}
\end{itemize}
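The convergence to the large-$n$ limit can be checked numerically; the sketch below (with illustrative values of $\lambda$ and $R$, chosen to keep the track unsaturated) evaluates the P-K mean queue length for the drum and compares it with the limiting expression.

```python
def drum_mean_queue_length(lam, n, R):
    """P-K mean queue length for the drum: n sectors, rotation rate R."""
    rho = lam * (n + 1) / (2 * n * R)                 # load
    assert rho < 1, "track saturated"
    M2 = (n + 1) * (2 * n + 1) / (6 * n**2 * R**2)    # second moment
    return rho + lam**2 * M2 / (2 * (1 - rho))

lam, R = 10.0, 20.0     # illustrative arrival rate and rotation rate
L_limit = lam / (2 * R) + lam**2 / (3 * R * (2 * R - lam))
for n in (8, 64, 1024):
    print(n, drum_mean_queue_length(lam, n, R))
print("large-n limit:", L_limit)
```

For $n=1024$ the exact value already agrees with the limit to three decimal places.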





     %
     %




\section{Queueing Networks}
\subsection{Introduction}
\begin{itemize}
\item Collection of servers/queues interconnected according to some
  topology


 
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/network-example.eps}}
  \end{center}
  \caption{ Network example }
  \label{network-example}
\end{figure}

\item Servers may be
  \begin{itemize}
  \item processing elements in a computer, e.g. CPU, I/O devices
  \item stations/nodes in a communication network (may be widely
    separated geographically) 
  \end{itemize}
\item Topology represents the possible routes taken by tasks through
  the system
\item May be several different classes of tasks (multi-class network)
  \begin{itemize}
  \item different service requirements at each node
  \item different routing behaviours
  \item more complex notation, but straightforward generalisation of
    the single-class network in principle
  \item we will consider only the single class case
  \end{itemize}

\end{itemize}

     %
     %


\subsection{Types of network}
\begin{itemize}
\item Open: at least one arc coming from the outside and at least one
  going out
  \begin{itemize}
  \item must be at least one of each type or else the network would be
    saturated or null (empty in the steady state)
  \item e.g. the example above
  \end{itemize}
\item Closed: no external (incoming or outgoing) arc
  \begin{itemize}
  \item constant network population of tasks forever circulating
  \item e.g. the example above with the external arcs removed from
    nodes 1 and 4
  \end{itemize}
\item Mixed: multi-class model which is open with respect to some
  classes and closed with respect to others -- e.g. in the example
  above a class whose tasks only ever alternated between nodes 2 and
  4 would be closed, whereas a class using all nodes would be open 
\end{itemize}

     %
     %


\subsection{Types of server}
\begin{itemize}
\item Server defined by its service time distribution (we assume
  exponential but can be general for non-FCFS disciplines) and its queueing
  discipline (for each class) 
  \begin{itemize}
  \item FCFS (FIFO)
  \item LCFS (LIFO)
  \item IS (i.e. a {\em delay node}: no queueing)
  \item PS ({\em Processor sharing}: service shared equally amongst
    all tasks in the queue)
  \end{itemize}
\item Similar results hold for the queue length probabilities (in the
  steady state) under all of these disciplines

\end{itemize}

     %
     %




\section{Open networks (single class)}
\begin{itemize}
\item Notation: \\
$M$ servers, $1,2,\dots ,M$ with FCFS discipline and exponential
service times, mean  $\frac{1}{\mu_i(n_i)}$, when queue length is $n_i$
($1\leq i \leq M$)
\begin{itemize}
\item {\em state dependent service rates} to a limited extent
\item $\mu_i(n_j) \text{ for } i \neq j \implies$ {\em blocking}: rate
  at one server depends on the state of another, e.g. rate $\to 0$
  when finite queue at next is full
\end{itemize}
\item External Poisson arrivals into node $i$, rate $\gamma_i$,
  ($1\leq i \leq M$) ($=0$ if no arrivals)
\item Routing probability matrix $Q=(q_{ij} \mid 1\leq i,j \leq M)$
  \begin{itemize}
  \item $q_{ij}=$ probability that on leaving node $i$ a task goes to
    node $j$ independently of past history
  \item $q_{i0} = 1 - \sum_{j=1}^M q_{ij} = $ probability of external
    departure from node $i$
  \item at least one $q_{i0}>0$, i.e. at least one row of $Q$ sums to
    less than 1
  \end{itemize}
\item State space of network $S=\{(n_1,\dots ,n_M) \mid n_i\geq0\}$
  \begin{itemize}
  \item queue length vector random variable is $(N_1,\dots , N_M)$
  \item $p(\underline{n}) = p(n_1, \dots , n_M) = P(N_1=n_1, \dots ,
    N_M =n_M)$
  \end{itemize}
\end{itemize}


     %
     %



\subsection{Traffic equations}
\begin{itemize}
\item Determine mean arrival rates $\lambda_i$ to each node $i$ in the
  network



\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/traffic.eps}}
  \end{center}
  \caption{ Traffic equations }
  \label{traffic}
\end{figure}



\item In the steady state, $\lambda_i = \gamma_i + \sum_{j=1}^M
  \lambda_j q_{ji} \text{ for } (1 \leq i \leq M) \implies $ unique
  solution for $\{\lambda_i\}$ (because of properties of $Q$)
  \begin{itemize}
  \item independent of the Poisson assumption since we are only considering
    mean numbers of arrivals in unit time
  \item assumes only the existence of a steady state
  \end{itemize}
\item Example


 

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=1.5cm]{pictures/traffic-example.eps}}
  \end{center}
  \caption{ Traffic example. }
  \label{traffic-example}
\end{figure}


  $$\lambda_1 = \gamma + \lambda_1 q \implies \lambda_1 =
  \frac{\gamma}{1-q}$$
\item Arrivals to a node are not in general Poisson, e.g. this simple
  example. If there is no feedback then all arrival processes are
  Poisson because
  \begin{enumerate}
  \item departure process of M/M/1 queue is Poisson
  \item superposition of independent  Poisson processes is Poisson
\end{enumerate}
  \item Similarly, let $R_i$  be the average interval between a task's arrival at 
  server $i$ and its departure from the network 
  \begin{itemize}
  \item $R_i$ is the ``average remaining sojourn time''
  \item $$R_i = W_i + \sum_{j=1}^M q_{ij} R_j$$
  \end{itemize}\end{itemize}
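The traffic equations are linear, so they can be solved directly: in row-vector form they read $\underline{\lambda}(I-Q) = \underline{\gamma}$. A minimal sketch, with a hypothetical 3-node routing matrix and external arrival rates:

```python
import numpy as np

# Hypothetical open 3-node network: Q[i][j] = q_{ij} (routing
# probabilities); gamma[i] = external Poisson arrival rate at node i+1.
Q = np.array([[0.0, 0.5, 0.3],
              [0.0, 0.0, 0.8],
              [0.2, 0.0, 0.0]])
gamma = np.array([2.0, 0.0, 0.0])

# lambda_i = gamma_i + sum_j lambda_j q_{ji}  <=>  lambda (I - Q) = gamma
lam = np.linalg.solve((np.eye(3) - Q).T, gamma)
print(lam)   # node 1 sees its external arrivals plus feedback via node 3
```

The solution is unique because at least one row of $Q$ sums to less than one, making $I-Q$ nonsingular.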

     %
     %



\subsection{Steady state queue length probabilities}
\underline{Jackson's theorem} \\
\begin{enumerate}
\item The number of tasks at any server is independent of the number
  of tasks at every other server in the steady state
\item Node $i$ behaves {\em as if} it were subjected to Poisson arrivals,
  rate $\lambda_i$ ($1\leq i \leq M$)
\end{enumerate}
\begin{itemize}
\item Thus, even though arrivals at each node are not, in general,
  Poisson, we can treat the system as if it were a collection of $M$
  independent M/M/1 queues
  $$\implies p(n_1,\dots ,n_M) \propto \prod_{i=1}^M \Big\{
  \frac{\lambda_i^{n_i}}{\prod_{j=1}^{n_i} \mu_i(j)} \Big\}$$ where
  $\mu_i(j)$ is the rate of server $i$ when the queue length is $j$
\item If service rates are constant, $\mu_i$, 
  $$p(\underline{n}) \propto \prod_{i=1}^M \rho_i^{n_i}, \text{ i.e. }
  p(\underline{n}) =
  \prod_{i=1}^M(1-\rho_i)\rho_i^{n_i}$$ where $\rho_i =
  \frac{\lambda_i}{\mu_i}$
\item $p(\underline{n}) \to$ usual performance measures such as mean
  queue lengths, utilisations, throughput by standard methods -- mean
  waiting times by Little's result
\end{itemize}
N.B. The normalising constant for $\{p(\underline{n})| \underline{n}
\in S\}$ is not shown for the general case: it is the product of those
for the M/M/1 queues.


     %
     %



\subsection{Mean Value analysis}
\begin{itemize}
\item Can exploit Jackson's theorem directly, together with Little's result, if only \underline{average} quantities are required
  \begin{itemize}
  \item values for mean queue lengths $L_i$ are those for isolated
    M/M/1 queues  with arrival rates $\lambda_i$ ($1\leq i \leq M$)
  \item assuming constant service rates $\mu_i$
    \begin{equation*}
      L_i = \frac{\rho_i}{1-\rho_i} \text{ for }1 \leq i \leq M
    \end{equation*}
    (average number of tasks at node $i$)
  \item total average number of tasks in network
    \begin{eqnarray*}
      L= \sum_{i=1}^M L_i = \sum_{i=1}^M \frac{\rho_i}{1-\rho_i}
    \end{eqnarray*}
  \end{itemize}
\item Waiting times
  \begin{itemize}
  \item Average waiting time in the network, $W=L/\gamma$ by Little's
    result where $\gamma = \sum_{i=1}^M \gamma_i$ is the total
    arrival rate 
  \item Average time spent at node $i$ during each visit $$W_i =
    L_i/\lambda_i = \frac{1}{\mu_i(1-\rho_i)}$$
  \item Average time spent queueing on a visit to node $i$ $$W_{Qi} =
    L_{Qi}/\lambda_i = \frac{\rho_i}{\mu_i(1-\rho_i)}$$
  \end{itemize}
\end{itemize}
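These mean-value formulae amount to a few lines of arithmetic; the node parameters below are purely illustrative, with the $\lambda_i$ assumed to have come from the traffic equations.

```python
# Hypothetical parameters: lam_i from the traffic equations, constant
# service rates mu_i, total external arrival rate gamma.
lam = [2.3, 1.2, 1.6]
mu = [4.0, 2.0, 3.0]
gamma = 2.0

rho = [l / m for l, m in zip(lam, mu)]       # utilisations
L = [r / (1 - r) for r in rho]               # mean queue length at node i
W = [Li / li for Li, li in zip(L, lam)]      # mean time per visit (Little)
W_net = sum(L) / gamma                       # mean time in network (Little)
print(L, W, W_net)
```

Note that $W_i = L_i/\lambda_i = 1/(\mu_i(1-\rho_i))$, as in an isolated M/M/1 queue.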

\subsection{An alternative formulation}
\begin{itemize}

\item Let $v_i$ be the average number of visits made by a task to
  server $i$ $$\gamma v_i = \text{average number of arrivals to node
    $i$ in unit time} = \lambda_i$$ where $\gamma$ is the average
  number of arrivals to the whole network in unit time and $v_i$ the
  average number of visits a given arrival makes to node $i$ $$v_i =
  \frac{\lambda_i}{\gamma} \quad (1\leq i \leq M)$$
\item Let $D_i$ be the total average service demand on node $i$ from
  one task 
  $$D_i = \frac{v_i}{\mu_i}= \frac{\rho_i}{\gamma}$$
  $\rho_i = \gamma D_i = $ average work for node $i$ arriving from
    outside the network in unit time
\item Often specify a queueing network directly in terms of
  $\{D_i|1\leq i \leq M\}$ and $\gamma$; then there is no need to
  solve the traffic equations
\item $$L_i = \frac{\rho_i}{1-\rho_i} = \frac{\gamma D_i }{1-\gamma
    D_i}$$ $L$ and $W$ as before
\item Let $B_i = \text{total average time a task spends at node $i$}$ 
  $$B_i = v_iW_i = \frac{v_i}{\mu_i(1-\rho_i)} = \frac{D_i}{1-\gamma
    D_i}$$
  alternatively apply Little's result to node $i$ with the external
  arrival process directly 
  $$B_i = \frac{L_i}{\gamma} = \frac{D_i}{1-\gamma    D_i}$$
\item However, $D_i$ and $\gamma$ cannot be used to determine $\mu_i$
  and hence neither $W_i$ nor $R_i$
\item Specification for delay nodes (IS) $i$
  \begin{itemize}
  \item $L_i = \rho_i = \gamma D_i$
  \item $B_i = D_i $ (no queueing)
  \item $W_i = 1/\mu_i$
  \item $D_i = v_i / \mu_i$ (as for any work conserving discipline
    also)
  \item service time distribution arbitrary
  \end{itemize}
\end{itemize}

     %
     %



\subsection{Distribution of time delays}
\begin{itemize}
\item Even though arrivals at each node are not, in general, Poisson,
  the Random Observer Property still holds: {\em waiting time
    distribution at node $i$ is exponential, parameter $\mu_i
    -\lambda_i$} again as expected from Jackson's theorem.
\item Networks with no overtaking (``tree-like'' networks) are easy to
  solve for time delay distributions:
\begin{figure}[tbp]
\hfill
\begin{center}
\fbox{\includegraphics[height=3.5cm]{pictures/no-overtaking.eps}}
\end{center}
\caption{ A network with no overtaking. }
\label{no-overtaking}
\end{figure}
  \begin{itemize}
  \item sojourn time on any route is the sum of independent,
    exponential random variables 
  \item this argument is independent of Jackson's theorem and one proof uses the idea of {\em reversibility} of the M/M/1 queue
  \end{itemize}
\item time delay distribution is a convolution of exponentials,
  e.g. $f_1 \star f_2 \star f_3$ for route $r$ where $f_i(t) = (\mu_i
  - \lambda_i) e^{-(\mu_i
  - \lambda_i)t }$ for $i=1,2,3$.
\end{itemize}
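For {\em distinct} residual rates $r_i = \mu_i - \lambda_i$, this convolution of exponentials (the hypoexponential density) has a closed partial-fraction form, $f(t) = \sum_i \big(\prod_{j\neq i} \frac{r_j}{r_j - r_i}\big) r_i e^{-r_i t}$; the sketch below implements it, with illustrative rates.

```python
import math

def hypoexp_pdf(rates, t):
    """Density of a sum of independent exponentials with distinct rates
    (partial-fraction form of the convolution f_1 * f_2 * ... * f_n)."""
    total = 0.0
    for i, ri in enumerate(rates):
        coeff = 1.0
        for j, rj in enumerate(rates):
            if j != i:
                coeff *= rj / (rj - ri)   # requires all rates distinct
        total += coeff * ri * math.exp(-ri * t)
    return total

# a route through three nodes with residual rates mu_i - lambda_i
# (hypothetical values)
print(hypoexp_pdf([3.0, 5.0, 2.0], 1.0))
```

For two rates $1$ and $2$ this reduces to the familiar $2e^{-t} - 2e^{-2t}$.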

     %
     %



\subsection{Time delays in general networks}
\begin{itemize}
\item mean sojourn time for any route is always easy because the mean
  of a sum of random variables is equal to the sum of the means of
  those random variables, whether or not they are independent
\item in networks with overtaking, the distribution of route sojourn
  times remains an open problem. For example, in the network
  \begin{figure}[tbp]
    \hfill
    \begin{center}
      \fbox{\includegraphics[height=3.5cm]{pictures/overtaking.eps}}
    \end{center}
    \caption{ A network with overtaking. }
    \label{overtaking}
  \end{figure}
  \begin{itemize}
  \item sojourn time distribution on route $1\to 3$ is the convolution
    of 2 exponentials
  \item sojourn time distribution on route $1 \to 2 \to 3$ is {\em
      not} the convolution of 3 exponentials because the queue length
    faced at node 3 upon arrival depends on the number of departures
    from node 1 during the sojourn time at node 2.
  \item The 
    Random Observer Property is not applicable since the arrival to
    node 3 is not random when conditioned on the previous  arrival at node 1.
  \end{itemize}
\item Jackson's theorem does not apply because it is concerned only with
  steady state probabilities, i.e. the asymptotic behaviour of
  $p_t(\underline{n})$ at the single point in time $t$ as $t \to \infty$.

\item Subject of many research papers.
\end{itemize}




     %
     %




\section{Closed Queueing Networks}
\label{sec:closed-qn}

\begin{itemize}
\item No external arrivals or departures (no $\gamma_i$ terms).
\item Routing probabilities satisfy
  \begin{equation*}
    \label{eq:route}
    \sum_{j=1}^M q_{ij} = 1 \text{ for } 1 \leq i \leq M
  \end{equation*}
\item State space $S=\{ (n_1, \dots , n_M) | n_i \geq  0 ,
  \sum_{j=1}^M n_j =K\}$ for population $K$
  \begin{itemize}
  \item $|S| = \text{ \# of ways of putting } K \text{ balls into } M
    \text{ bags } = \binom{K+M-1}{M-1} $
  \item finiteness of $S \implies$ a steady state always exists
  \end{itemize}
\item Traffic equations are 
  \begin{equation*}
    \label{eq:traffic}
    \lambda_i = \sum_{j=1}^M \lambda_j q_{ji}\text{ for }  1 \leq i \leq M
  \end{equation*}
  \begin{itemize}
  \item homogeneous linear equations with an infinity of solutions
    which differ by a multiplicative factor (because $|I-Q|=0$: the
    rows of $I-Q$ all sum to zero since the rows of $Q$ sum to one)
  \item let $(e_1, \dots , e_M)$ be any non-zero solution (typically
    chosen by fixing one $e_i$ to a {\em convenient } value, like 1)
  \end{itemize}
  therefore $e_i \propto \text{ arrival rate }$ $\lambda_i$, i.e. $e_i =
  c\lambda_i$ for some constant $c$, and $x_i = \frac{e_i}{\mu_i} \propto
  \text{ load } \rho_i$, i.e. $x_i = c \rho_i$
\end{itemize}
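Since the solution is determined only up to a multiplicative factor, one convenient way to compute it is to fix $e_1 = 1$ and solve the remaining linear equations; the routing matrix below is a hypothetical example (a CPU-like node feeding two nodes that always return to it).

```python
import numpy as np

# Hypothetical closed network: node 1 feeds nodes 2 and 3, which always
# return to node 1; rows of Q sum to 1 (no external arcs).
Q = np.array([[0.0, 0.6, 0.4],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])

# e = e Q has a one-parameter family of solutions; fix e_1 = 1 and
# solve the remaining equations.
A = (np.eye(3) - Q).T          # (I - Q)^T e = 0, written column-wise
A[0] = [1.0, 0.0, 0.0]         # replace one (redundant) equation by e_1 = 1
e = np.linalg.solve(A, np.array([1.0, 0.0, 0.0]))
print(e)                       # e = (1, 0.6, 0.4)
```

Replacing one equation is legitimate because, for an irreducible $Q$, the system $\pmb{e} = \pmb{e}Q$ has rank $M-1$.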

     %
     %



\subsection{Steady state probability distribution for $S$}
\begin{itemize}
\item Jackson's theorem extends to closed networks which have a product
  form solution
  \begin{equation}
    \label{eq:prob}
    p(n_1, \dots , n_M) = \frac{1}{G} \prod_{i=1}^M
    \frac{e_i^{n_i}}{\prod_{j=1}^{n_i} \mu_i(j)} \text{ where }
    \sum_{i=1}^M n_i = K.
  \end{equation}
  where $\mu_i(j)$ is the service rate of the exponential server $i$
  when its queue length is $j$.
\item $G$ is the normalising constant of the network
  \begin{equation}
    \label{eq:G}
    G = \sum_{\pmb{n} \in S} \prod_{i=1}^M
    \frac{e_i^{n_i}}{\prod_{j=1}^{n_i} \mu_i(j)}
  \end{equation} 
  not easy to compute (see later)
\item Prove the result by using the network's steady state balance
  equations: 
  \begin{eqnarray*}
    \text{Total flux out of state } \pmb{n} & = & p(\pmb{n})
    \sum_{i=1}^M \mu_i(n_i)  \epsilon(n_i) \\
    & = & \text{Total flux into state }\pmb{n} \text{ (summing over
    transitions } \pmb{n}_i^j \rightarrow \pmb{n})\\
    &=& \sum_{i=1}^M \sum_{j=1}^M p(\pmb{n}_i^j)
    \epsilon(n_i) \mu_j( (\pmb{n}_i^j)_j )q_{ji}
  \end{eqnarray*}
  where $\epsilon(n) = \begin{cases} 0 \text{ if } n=0 \\ 1  \text{ if
  } n >0 \end{cases}$ \\
$\pmb{n}_i^j = \begin{cases} (n_1, \dots , n_i - 1, \dots , n_j+1,
  \dots , n_M) \text{ if } i \neq j \\ \pmb{n}_i^i = \pmb{n} \text{ otherwise} \end{cases}$ \\
note that 
$(\pmb{n}_i^j)_j = \begin{cases} n_j + 1 \text{ if } i \neq j \\ n_i
  \text{ otherwise} \end{cases}$
\item Try 
  \begin{equation*}
    p(n_1, \dots , n_M) = \frac{1}{G} \prod_{i=1}^M
    \frac{e_i^{n_i}}{\displaystyle{\prod_{j=1}^{n_i}} \mu_i(j)} \text{ where }
    \sum_{i=1}^M n_i = K.
  \end{equation*} 
 Then require
  \begin{eqnarray*}
     \frac{1}{G} \prod_{i=1}^M \frac{e_i^{n_i}}{\displaystyle{\prod_{j=1}^{n_i}} \mu_i(j)}
    \sum_i \mu_i(n_i) \epsilon(n_i)& = & \frac{1}{G} \prod_{i=1}^M \frac{e_i^{n_i}}{\displaystyle{\prod_{j=1}^{n_i}}
    \mu_i(j)}  \sum_{i,j:i= j} \mu_i(n_i) \epsilon(n_i) q_{ji} \\ 
  && + \frac{1}{G} \prod_{i=1}^M \frac{e_i^{n_i}}{\displaystyle{\prod_{j=1}^{n_i}}
    \mu_i(j)}  \times\\
&&\sum_{i,j: i\neq j} \frac{e_j}{e_i} \frac{\mu_i(n_i)}{\mu_j(n_j+1)} \epsilon(n_i) \mu_j(n_j+1)q_{ji}
  \end{eqnarray*}
  i.e.
  \begin{eqnarray*}
    \sum_i \mu_i(n_i) \epsilon(n_i) =  \sum_i \mu_i(n_i) \epsilon(n_i)
    \{ \frac{\sum_j e_j q_{ji}}{e_i}\}
  \end{eqnarray*}
  OK if $\pmb{e}$ satisfies the traffic equations.
\item Note that if $\pmb{e}' = c \pmb{e}$ is another solution to the traffic equations, the corresponding
  probabilities  $p'(\pmb{n})$ and $G'$ are
  \begin{eqnarray*}
    p'(\pmb{n}) = \frac{1}{G'} G c^{\sum n_i} p(\pmb{n}) = \frac{Gc^K}{G'} p(\pmb{n})
  \end{eqnarray*}
  \begin{eqnarray*}
    G' = c^K G = \sum_{\pmb{n} \in S} G c^{\sum n_i} p(\pmb{n})
  \end{eqnarray*}
and therefore $p'(\pmb{n}) = p(\pmb{n})$. This confirms the
arbitrariness of $\pmb{e}$ up  to a multiplicative factor.
\end{itemize}


     %
     %


\subsection{Computation of the normalising constant}
\label{sec:G}

\begin{itemize}
\item We consider only the case of servers with constant service rates
  to get an efficient algorithm. 
\item There are also algorithms for networks with servers having state
  dependent service rates, e.g. the convolution algorithm. 
\item These are less
  efficient but still important, since the alternative {\em MVA algorithm}
  also requires constant rates.
\item Define $G = g(K,M)$ where
$$ 
g(n,m) = \sum_{\underline{n} \in S(n,m)} \quad \prod_{i=1}^m x_i^{n_i}
$$
where $S(n,m) = \{(n_1, \dots , n_m) | n_i \geq 0, \sum_{i=1}^m n_i =n \}$
and $x_i = \frac{e_i}{\mu_i}$ $(1\leq i \leq m)$.
\begin{itemize}
\item state space for subnetwork of nodes $1,2,\dots,m$ and population $n$.
\end{itemize}
\item For $n,m > 0$
  \begin{eqnarray*}
    g(n,m) &=& \sum_{\underline{n} \in S(n,m), n_m =0} \quad \prod_{i=1}^m
    x_i^{n_i} + \sum_{\underline{n} \in S(n,m), n_m > 0}  \quad \prod_{i=1}^m
    x_i^{n_i} \\
&=& \sum_{\underline{n} \in S(n,m-1)} \quad  \prod_{i=1}^{m-1} x_i^{n_i} +
%    x_m \\ &&\sum_{k_i = n_i (i \neq m), k_m = n_m -1, \underline{n} \in
    \quad x_m 
\sum_{\substack{k_i = n_i (i \neq m)\\ k_m = n_m -1\\ \underline{n} \in
    S(n,m)}} 
\quad  \prod_{i=1}^m x_i^{k_i} \\
&=& g(n,m-1) + x_m g(n-1,m) 
  \end{eqnarray*}
because $\{\underline{k} | k_i \geq 0; \displaystyle{\sum_{i=1}^m k_i} = n-1\} =
S(n-1,m)$.
\item Boundary conditions:
  \begin{eqnarray*}
    g(0,m) & = & 1 \text{ for } m >0 \\
    g(n,0) & = & 0  \text{ for } n \geq 0 \\
  \end{eqnarray*}
\item Note 
$$
g(0,m) = \sum_{\underline{n} = (0, \dots , 0)} \prod_{i=1}^m
    x_i^{0} = 1
$$
and 
$$
g(n,1) =  x_1
    g(n-1,1) = x_1^n
$$

\end{itemize}
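The recurrence $g(n,m) = g(n,m-1) + x_m\, g(n-1,m)$ with these boundary conditions is Buzen's convolution algorithm; a sketch of the usual one-dimensional implementation (the $x_i$ values in the example are illustrative):

```python
def normalising_constant(x, K):
    """Compute G = g(K, M) from the recurrence
    g(n, m) = g(n, m-1) + x_m * g(n-1, m),  g(0, m) = 1,  g(n, 0) = 0,
    using a single column of length K+1 updated server by server."""
    g = [1.0] + [0.0] * K        # column for m = 0 (g[0] = 1 gives g(0, m) = 1)
    for xm in x:                 # fold in server m = 1, ..., M
        for n in range(1, K + 1):
            g[n] += xm * g[n - 1]
    return g[K]

# e.g. two servers with relative loads x = (1, 2) and population K = 2:
# G = x_1^2 + x_1*x_2 + x_2^2 = 1 + 2 + 4 = 7
print(normalising_constant([1.0, 2.0], 2))   # 7.0
```

The cost is $O(MK)$ operations and $O(K)$ storage, versus the $\binom{K+M-1}{M-1}$ terms of the direct sum.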


     %
     %


\subsection{Marginal queue length probabilities and performance
  measures}


\begin{itemize}
\item Although $p(\underline{k}) \propto \prod p_i(k_i)$ it is {\em
    not} the case that $P(N_i = k_i) = p_i(k_i)$, as in the open
  networks. The use of M/M/1 factors is just a convenient
  mathematical device; there is no probabilistic interpretation. 
\item Probability that a server is idle ($= 1- \text{utilisation}$) 
$$ P(N_M = 0) = \frac{1}{g(K,M)}  \sum_{\underline{n} \in S(K,M), n_M
  =0}  \quad \prod_{i=1}^{M-1}     x_i^{n_i} = \frac{g(K,M-1)}{g(K,M)}.$$ In
general 
$$ 
P(N_i = 0) = \frac{G_i(K)}{G(K)} \text{ for } 1 \leq i \leq M
$$
where $G(K) = g(K,M)$ and $G_i(k)$ is the normalising constant for the
same network with the server $i$ removed and population $k$
$$
G_i(k) =\sum_{\underline{n} \in S(k,M-1)}  \quad \prod_{j=1}^{M-1} y_j^{n_j} 
$$
where 
$$
y_j = \begin{cases} 
  x_j \text{ for } 1 \leq j < i\\
  x_{j+1} \text{ for } i \leq j \leq M-1\\
  \end{cases}
$$

\item utilisation of node $i$
$$ 
U_i(K) = 1 - \frac{G_i(K)}{G(K)}
$$
\end{itemize}

     %
     %


\subsection{Alternative Formulation: Cumulative Probabilities}
\begin{itemize}
\item For $1 \leq k \leq K$ and $1 \leq i \leq M$
  \begin{eqnarray*}
P(N_i \geq k) & = &\sum_{\underline{n} \in S(K,M), n_i \geq k}
 \quad \prod_{j=1}^{M}     \frac{x_j^{n_j}}{G(K)}   \\
& =& \frac{x_i^{k}}{G(K)}
\sum_{\substack{ m_i = n_i -k,\ n_i \geq k\\
    m_j=n_j\ (j\neq i)\\
    \underline{n} \in S(K,M)}}  \quad \prod_{j=1}^{M}   x_j^{m_j}\\
& =& \frac{x_i^{k}}{G(K)} \sum_{\underline{m} \in S(K-k,M)} \quad \prod_{j=1}^{M}
x_j^{m_j} \\
& =&  x_i^{k} \frac{G(K-k)}{G(K)}.
  \end{eqnarray*}
Therefore the utilisation is given by 
$$
U_i = x_i \frac{G(K-1)}{G(K)}
$$

\item Equating the two expressions for $U_i$ yields the recurrence
  relation for $g(k,m)$ derived previously
\item Throughput of server $i$,
$$
T_i(k) = \mu_iU_i(k) = e_i  \frac{G(K-1)}{G(K)}
$$
proportional to visitation rate as expected.
\item Queue length distribution at server $i$ is $P(N_i=k) = p_i(k)$ for $0 \leq k \leq K$ and $1 \leq i \leq M$
  \begin{eqnarray*}
    p_i(k)  & = & P(N_i \geq k) - P(N_i \geq k+1)\\
    & = & x_i^k \frac{G(K-k) - x_i G(K-k-1)}{G(K)}.
  \end{eqnarray*}
where $G(-1) = 0$ by definition.
\item Notice that the previous formulation gives a more concise
  expression for $p_i(k)$
  \begin{eqnarray*}
    p_i(k) & =& \frac{1}{G(K)} \sum_{\underline{n} \in S(K,M), n_i =
    k}  \quad \prod _{j=1}^M x_j^{n_j} \\
  &=& \frac{x_i^k}{G(K)}\sum_{\underline{n} \in S(K-k,M), n_i =
    0}  \quad \prod _{j=1}^M x_j^{n_j} \\
  &=& x_i^k \frac{G_i(K-k)}{G(K)}
  \end{eqnarray*}
for $0 \leq k \leq K$ and $1 \leq i \leq M$. This is a generalisation
of the result obtained for $U_i(k)$.
\item Mean queue length at server $i$, $1 \leq i \leq M$, is $L_i(K)$
  \begin{eqnarray*}
    L_i(K) & = & \sum_{k=1}^K k P(N_i =k) \\
    & = &  \sum_{k=1}^K P(N_i \geq k)  \\
    & = & \frac{\sum_{k=1}^K x_i^k G(K-k)}{G(K)}.
  \end{eqnarray*}
\end{itemize}
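The formulae $U_i = x_i G(K-1)/G(K)$ and $L_i(K) = \sum_{k=1}^K x_i^k G(K-k)/G(K)$ need only the column $G(0), \dots, G(K)$, which the convolution recurrence produces anyway; a sketch with hypothetical relative loads $x_i$:

```python
def g_column(x, K):
    """g(n, M) for n = 0..K via g(n,m) = g(n,m-1) + x_m * g(n-1,m)."""
    g = [1.0] + [0.0] * K
    for xm in x:
        for n in range(1, K + 1):
            g[n] += xm * g[n - 1]
    return g

x, K = [0.8, 0.5, 0.4], 4        # hypothetical relative loads and population
G = g_column(x, K)               # G[n] = g(n, M) = G(n)

U = [xi * G[K - 1] / G[K] for xi in x]                       # U_i = x_i G(K-1)/G(K)
L = [sum(xi**k * G[K - k] for k in range(1, K + 1)) / G[K]   # L_i(K)
     for xi in x]
print(U, L)                      # the L_i sum to the population K
```

The check $\sum_i L_i(K) = K$ is a useful test of any implementation, since all $K$ tasks are always somewhere in the network.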


     %
     %



\subsection{Equivalent open networks and the use of mean value
    analysis}
  \begin{itemize}
  \item Consider an irreducible network -- one in which every arc is traversed
    within finite time from any given time with probability 1 
 \begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.5cm]{pictures/equivalent-open-network.eps}}
  \end{center}
  \caption{ Equivalent open network. }
  \label{equivalent-open-network}
\end{figure}
\item 
   i.e. a network represented by an irreducible positive recurrent
    Markov process (finite state space)
  \item we introduce a node, $0$, in
    one of the arcs and replace arc $a$ by arc $a_1$, node $0$ and arc
    $a_2$ 
    \begin{itemize}
    \item Source of $a_1$ is the source of $a$
    \item destination of $a_2$ is the destination of $a$
    \end{itemize}

  \item Whenever a task arrives at node $0$ (along arc $a_1$), it
    departs from the network and is immediately replaced by a
    stochastically identical task which leaves node $0$ along arc $a_2$
    
  \item Define the network's throughput, $T$, to be the average rate
    at which tasks pass along arc $a$ in the steady state. 
    \begin{itemize}
    \item i.e. $T$ is 
    mean number of tasks traversing $a$ in unit time.
  \item One can choose
    any arc in an irreducible network.
    \end{itemize}

  \end{itemize}

     %
     %


\subsection{Visitation rates and application of Little's result}
\begin{itemize}
\item Let the visitation rate be $v_i$ and the average arrival rate be
  $\lambda_i$ for node $i$, $1 \leq i \leq M$; then $\lambda_i = T
  v_i$, where $T$, the throughput, is the external arrival rate of the
  equivalent open network. 
\item The set $\{ v_i | 1
  \leq i \leq M\}$ satisfies the traffic equations, as we could have
  derived directly by a similar argument. 
\item Suppose arc $a$ connects node $\alpha$ to node $\beta$ in the
  closed network $1 \leq \alpha , \beta \leq M$, then $v_0 = v_\alpha
  q_{\alpha \beta}$ because all traffic from $\alpha$ to $\beta$ goes
  through node $0$ in the open network. 
\item But every task enters node $0$
  exactly once, hence $v_\alpha = \frac{1}{q_{\alpha \beta}}$ since
    $v_0 =1$. This determines $\{ v_i | 1
  \leq i \leq M\}$ uniquely.
\item Little's result now yields
$$ L_i = \lambda_i W_i = T v_i W_i $$
$$ \sum_{i=1}^M L_i = K = T \sum_{i=1}^M v_iW_i$$ since the sum of all
queue lengths is exactly $K$ in the closed network $$\implies \quad T = \frac{K}{\sum_{i=1}^M v_iW_i}$$
\end{itemize}


     %
     %

\subsection{Mean waiting times}

\begin{itemize}
\item $$W_i = \frac{1}{\mu_i} [ 1+Y_i]$$ where the first term is
  the arriving task's mean service time and $Y_i$ is the mean number
  of tasks seen by new arrivals at server $i$. 
\item For an IS
  (delay) server $W_i = \frac{1}{\mu_i}$, otherwise \dots
\item Do {\em not} have the random observer property
  \begin{itemize}
  \item number of tasks seen on arrival does not have the same steady
    state distribution as the queue length since $K$ tasks are seen
    with probability $0$. (arrival cannot ``see itself'')
  \item do not have $Y_i = L_i$ as in open networks
  \end{itemize}
\item Do have the analogous Task (or Job) Observer Property: \\
The state of a closed queueing network in equilibrium seen by a new
arrival at any node has the same distribution as that of the state of
the same network in equilibrium with the arriving task removed
(i.e. with population $K-1$)
\begin{itemize}
\item arriving task behaves as a random observer in a network with
  population reduced by one
\item intuitively appealing since the ``one'' is the task itself
\item but requires a lengthy proof (omitted)
\end{itemize}
\end{itemize}

     %
     %


\subsection{Recurrence relations for throughput, mean waiting times
  and  mean queue length}
\begin{itemize}
\item Task observer property yields
  $$ Y_i(K) = L_i(K-1)$$ for $1\leq i \leq M, K >0$.
\item Hence we obtain the recurrence relations
  \begin{eqnarray*}
    W_i(K) & = &\frac{1}{\mu_i} [1+ L_i(K-1)] \\
    T(K) & = & \frac{K}{\sum_{i=1}^M v_i W_i(K)} \\
    L_i(K) & = & v_i T(K) W_i(K)
  \end{eqnarray*}
for $1\leq i \leq M, K >0$ and the initial condition $L_i(0) = 0$
 \item $T(K),W_i(K), L_i(K)$ easily computed by a simple iteration
  \begin{eqnarray*}
    \{L_i(K-1)\} \to && \underbrace{\{ W_i(K)\} \to T(K)} \\
    &&\to \{ L_i(K)\} \to \dots
  \end{eqnarray*}
\item $2M+1$ quantities computed each time round the loop
\end{itemize}
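These recurrences give the standard exact MVA iteration; a minimal sketch, with hypothetical visitation and service rates in the usage example below.

```python
def mva(v, mu, K):
    """Exact MVA:  W_i(k) = (1 + L_i(k-1)) / mu_i,
    T(k) = k / sum_i v_i W_i(k),  L_i(k) = v_i T(k) W_i(k)."""
    M = len(v)
    L = [0.0] * M                                   # initial condition L_i(0) = 0
    for k in range(1, K + 1):
        W = [(1 + L[i]) / mu[i] for i in range(M)]
        T = k / sum(v[i] * W[i] for i in range(M))
        L = [v[i] * T * W[i] for i in range(M)]
    return T, W, L

# illustrative: v = (1, 0.5), mu = (2, 1), so D_i = 0.5 at both nodes
T, W, L = mva([1.0, 0.5], [2.0, 1.0], 3)
print(T, L)     # the L_i sum to the population K = 3
```

For this balanced two-node example the iteration gives $T(3) = 1.5$, which respects the bound $T(K) \leq 1/\max_i D_i = 2$.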

     %
     %


\subsection{Alternative formulation}

\begin{itemize}
\item Total average time spent at node $i$ when population is $K$
$$B_i(K) = v_i W_i (K)$$ for $1 \leq i \leq M$
\item Total average service time (demand) a task requires from node
  $i$ $$D_i = \frac{v_i}{\mu_i}$$ for $1 \leq i \leq M$ independent of
  population $K$. 
\item Therefore
  \begin{eqnarray*}
    B_i(K) & = & D_i[1+L_i(K-1)] \\
    T(K) & = & \frac{K}{\sum_{i=1}^M B_i(K)} \\
    L_i(K) & = & T(K) B_i(K)
  \end{eqnarray*}
for $1 \leq i \leq M$
\item Total average time in network spent by a task $$W(K) =
  \sum_{i=1}^M v_i W_i(K) = \sum_{i=1}^M B_i(K) = \frac{K}{T(K)}$$ as
  expected by Little's law
\item Utilisation of node $i$
  $$U_i(K) = \frac{\lambda_i}{\mu_i} = T(K) \frac{v_i}{\mu_i} =
  T(K)D_i \leq 1$$ which implies $$T(K)  \leq \min_{1\leq i \leq M}
  \frac{1}{D_i}$$ so the maximum throughput is dictated by
  a bottleneck server (or servers) -- that (those) with maximum $D_i$
\end{itemize}

     %
     %


\subsection{A faster approximate algorithm}

\begin{itemize}
\item Algorithms given above need to evaluate $2M+1$ quantities for
  each population between $1$ and $K$
\item Would be faster if we could relate $Y_i(K)$ to $L_i(K)$ rather
  than $L_i(K-1)$, which implies that we do not need to worry about
  populations less than $K$
\item $$Y_i(K) = \frac{K-1}{K} L_i(K)$$ is a good approximation, exact
  for $K=1$ and correct asymptotic behaviour as $K \to \infty$
  (exercise)
\item then $$B_i(K) = D_i \Big[1+ \frac{K-1}{K} T(K) B_i(K)\Big]$$
\item $$B_i(K) = \frac{ D_i}{1-\frac{K-1}{K} D_iT(K)}$$
\item Thus we obtain the fixed point equation
$$T(K) = f(T(K))$$ where $$f(x) = \frac{K}{\sum_{i=1}^M \frac{
    D_i}{1-\frac{K-1}{K} D_i x}}$$ there are many numerical methods to
solve such equations
\end{itemize}
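One simple numerical method is damped successive substitution (plain substitution $T \leftarrow f(T)$ can oscillate); the demands below are illustrative, and the guard on the denominator is a sketch-level safeguard, not part of the method as stated above.

```python
def approx_throughput(D, K, iters=200):
    """Damped successive substitution for T = f(T), where
    f(x) = K / sum_i D_i / (1 - ((K-1)/K) * D_i * x).
    D[i] = total service demand at node i."""
    T = 0.0
    for _ in range(iters):
        denom = sum(Di / max(1 - (K - 1) / K * Di * T, 1e-12) for Di in D)
        T = 0.5 * (T + K / denom)   # damping: plain substitution can oscillate
    return T

print(approx_throughput([0.5, 0.5], 3))   # converges to 1.5
```

For $K=1$ the approximation is exact by construction: the $(K-1)/K$ factor vanishes and $T(1) = 1/\sum_i D_i$.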

     %
     %


\subsection{Example: Multiprogramming computer system}



\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/multi-programming-computer.eps}}
  \end{center}
  \caption{ A multiprogrammed computer. }
  \label{multi-programming-computer}
\end{figure}




\begin{itemize}
\item Insert node $0$ in the route from the CPU back to itself; the {\it
    throughput} is then the rate at which tasks are interrupted at the
  CPU node. This choice of arc is arbitrary.
\item visitation rates 
$$v_1 = a_1 v_1 + \sum_{j=2}^M v_j$$
$$v_i = a_i v_1  \text{ for } i \geq 2$$
$$v_1 = \frac{1}{a_1} \text{ with choice of position of node 0}$$
therefore
$$v_i = \frac{a_i}{a_1} \text{ for } i \geq 2$$

\item Sanity check: is the first equation satisfied? (Yes, since
  $\sum_{j=1}^M a_j = 1$.)
$$D_i = \frac{a_i}{a_1 \mu_i} \text{ for } i \geq 2$$
$$D_1 = \frac{1}{a_1 \mu_1}$$
therefore $T(K)$ follows by solving recurrence relations
\item total CPU throughput $ = \frac{T(K)}{a_1}$ (fraction $a_1$ recycles)

\end{itemize}

     %
     %


\subsection{Application: A batch system with virtual memory}
\begin{itemize}

\item Most medium-large computer systems use virtual memory
\item It is well-known that above a certain degree of multiprogramming
  performance deteriorates rapidly
\item Important questions are
  \begin{itemize}
  \item How do the throughput and turnaround time vary with the
    degree of multiprogramming?
  \item What is the threshold below which to keep the degree of
    multiprogramming? 
  \end{itemize}
\end{itemize}

\subsubsection{Representation of paging}
\begin{itemize}
\item Suppose node 2 is the paging device (e.g. fast disk)
  \begin{itemize}
  \item assume dedicated to paging
  \item easy to include non-paging I/O also
  \end{itemize}
\item We aim to determine the service demand of a task at node 2,
  $D_2$, from tasks' average paging behaviour
  \begin{itemize}
  \item use results from '70s working set theory 
  \item consider the page fault rate for varying amounts of main store
    allocated to a task
  \end{itemize}
\item Let the population be $K$
\item paging behaviour will be represented by a {\em lifetime
    function}:
  \begin{itemize}
  \item $h(x) = $ average CPU time between consecutive page faults
    when a task has  $x$ pages of memory
  \item $h(0) \approx 0$ (the first instruction causes a fault)
  \item $h(x) = C$, the total CPU time of a task for $x\geq$ some
    constant value $m$, where $m$ is the ``size'' of the task
  \item $h$ is an {\em increasing function}: the more memory a task
    has the less frequent will be its page faults on average.
  \item Two possible types of lifetime functions:

  
  

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/lifetimes.eps}}
  \end{center}
  \caption{ Lifetime functions }
  \label{lifetimea}
\end{figure}

% \begin{figure}[tbp]
%   \hfill
%   \begin{center}
%     \fbox{\includegraphics[height=3.5cm]{pictures/lifetime2.eps}}
%   \end{center}
%   \caption{ Lifetime function b. }
%   \label{lifetimeb}
% \end{figure}
  
 
    
    
  \item For (a) $h(x) = \begin{cases} ax^b \text{ for }x\leq m\\ am^b = C \text{ for } x \geq m \end{cases}$, $b>1$
  \item For (b) $h(x) = \frac{C}{1+(a/x)^2}$
  \item Note that  for (b) there is no $m$ s.t. $h(m)=C$
  \end{itemize}
\item Let each task have $P/K$ pages of memory, where $P$ is the total
  main memory size in pages
  \begin{itemize}
  \item average CPU time between faults $= h(P/K)$
  \item average number of faults during life of task $=
    \frac{D_1}{h(P/K)}$
  \item average time at node 2 per fault $ = \frac{1}{\mu_2}$
  \end{itemize}
\item Therefore average paging time of a task (from node 2) 
  \begin{itemize}
  \item $D_2 = H(K) = \frac{D_1}{\mu_2h(P/K)}$
  \item $D_2 = d_2 + H(K)$ if average non-paging requirement of a task
    from node 2 is $d_2$
  \end{itemize}
\item As $K \to \infty$, $T(K) \to 0$  since $T(K) \leq \min_i \frac{1}{D_i}$ and $D_2 \to \infty$ 
\end{itemize}

\subsection{Solution}
\begin{itemize}
\item We again use the alternative form of the MVA algorithm
\item For population $K$ and $1\leq k \leq K$ 
  \begin{eqnarray*}
    B_i(k) & = & D_i(K) [ 1 + L_i(k-1)] \quad (1\leq i \leq M) \\
    T(k) & = & \frac{k}{\sum_{i=1}^MB_i(k)} \\
    L_i(k) & = & T(k) B_i(k)  \quad (1\leq i \leq M)
  \end{eqnarray*}
\item Notice that only $D_2$ is a function of $K$; $D_1$ and $D_3$ are
  constant
\item Computation for a range of populations requires 2 nested loops:
  \begin{itemize}
  \item outer loop for $K=1,2, \dots , K_{max}$
  \item inner loop for $k=1,2, \dots , K$
  \item need new value for $H(K)$ each time round the outer loop
  \end{itemize}
\end{itemize}
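The two nested loops can be sketched as follows. All parameters (demands, memory size, paging rate) and the power-law lifetime function are illustrative assumptions in the spirit of this section, not prescribed values.

```python
# Sketch of the 2-level loop: outer loop over populations K (recomputing
# the paging demand H(K) each time), inner MVA loop.  All parameters
# are illustrative assumptions.

D1, D3 = 10.0, 1.0        # CPU and non-paging disk demands (seconds)
d2, mu2 = 0.1, 100.0      # non-paging demand and rate of the paging device
P = 20000                 # total main memory (pages)

def h(x):                 # power-law lifetime function, capped at D1
    return D1 * min(x / 10000, 1.0) ** 3.5

def throughput(K):
    D = [D1, d2 + D1 / (mu2 * h(P / K)), D3]   # D2 = d2 + H(K)
    L = [0.0] * 3
    for k in range(1, K + 1):                  # inner MVA loop
        B = [D[i] * (1 + L[i]) for i in range(3)]
        T = k / sum(B)
        L = [T * B[i] for i in range(3)]
    return T

T = {K: throughput(K) for K in range(1, 31)}   # outer loop
```

Plotting `T` against `K` reproduces the characteristic thrashing shape: throughput rises at first, then collapses once $D_2$ dominates.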
\subsection{Choice of lifetime function}
\begin{itemize}
\item We assume (arbitrarily) the lifetime function
$$
h(x) = \begin{cases}
  D_1(\frac{x}{10000})^{3.5}, \quad x\leq 10000 \\
  D_1 \text{ otherwise}
       \end{cases}
$$
for all tasks; see Figure~\ref{lifetime-example}.
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.5cm]{diags/lifetime1chart.eps}}
  \end{center}
  \caption{ Lifetime function example. }
  \label{lifetime-example}
\end{figure}


\end{itemize}

\subsection{Thrashing curves}


\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/batchpagingWchart.eps}}
  \end{center}
  \caption{ Mean job processing time vs. number of jobs.}
  \label{mean-job-processing-time}
\end{figure}

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/batchpagingTchart.eps}}
  \end{center}
  \caption{Throughput vs. number of Jobs.  }
  \label{throughput-v-number-of-jobs}
\end{figure}



     %
     %



\subsection{Decomposition}

\begin{itemize}
\item Separable queueing models are prone to break down in some
  situations:
  \begin{itemize}
  \item There is usually a good model for small parts of any network
  \item Composing the bits can  be hard unless the criteria for
    separability are (reasonably) met
  \end{itemize}
\item Ideally, we need a method to:
  \begin{itemize}
  \item Reduce complexity (breaking a large problem into smaller
    manageable ones)
  \item Provide approximate solutions where the necessary assumptions
    do not hold
  \end{itemize}
\item The most powerful general method is {\em decomposition}
  (sometimes called {\em aggregation})
\item The idea is to replace a subnetwork of a (complex) system with a
  single {\em flow-equivalent server} (FES)




\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=4cm]{diags/decomposition.eps}}
  \end{center}
  \caption{ Flow equivalent server. }
  \label{jobtime-job}
\end{figure}


\item We exploit the fact that servers in separable networks may have
  {\em state dependent} service rates, i.e. $\mu_n$ for queue length $n$
\item To find the required $\mu_n$, we compute the ``throughput'',
  $\tau_n$, of the subnetwork when the population is $n$ and set 
$$\mu_n = \tau_n, \quad n = 0,1, \dots$$
\item The throughput can be determined by {\em any} method, e.g. an
  explicit Markov model, standard queueing network algorithm
  (e.g. MVA), or even simulation
\item The technique is analogous to Norton's Theorem in electrical
  circuit theory.
\end{itemize}
\subsection{Method}
\begin{enumerate}
\item Identify the subnetwork to be aggregated

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.5cm]{pictures/fes-step1.eps}}
  \end{center}
  \caption{ Flow equivalent server - step 1. }
  \label{fes-step1}
\end{figure}



\item Short-circuit the rest of the network (i.e. use the same
  visitation rates as in the original subnetwork)


\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.5cm]{pictures/fes-step2.eps}}
  \end{center}
  \caption{ Flow equivalent server - step 2. }
  \label{fes-step2}
\end{figure}


\item Compute the throughput $\tau_n$, on the short-circuit arc for
  all populations $0,1,\dots$

\begin{figure}[tbp]
  \hfill
  \begin{center}
%    \fbox{\includegraphics[height=2.5cm]{pictures/fes-step3.eps}}
    {\scalebox{0.6}{\includegraphics[height=2.5cm]{diags/aggregation3.eps}}}
  \end{center}
  \caption{ Flow equivalent server - step 3. }
  \label{fes-step3}
\end{figure}



\item Define a single FES with service rate 
  $\mu_n = \tau_n, n=0,1,\dots$
\item Replace the subnetwork with the FES and re-solve
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=2.5cm]{pictures/fes-step5.eps}}
  \end{center}
  \caption{ Flow equivalent server - step 5. }
  \label{fes-step5}
\end{figure}



\item If necessary repeat the process (i.e. hierarchically)
\item Note: We can either aggregate the tricky part or aggregate the
  easy part leaving a smaller tricky network behind!
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=5cm]{pictures/fes-trick.eps}}
  \end{center}
  \caption{ Flow equivalent server - trick. }
  \label{fes-trick}
\end{figure}


\end{enumerate}
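Steps 2--4 can be sketched in a few lines: run MVA on the short-circuited subnetwork for each population $n$ and take $\mu_n = \tau_n$. The subnetwork demands below are illustrative assumptions.

```python
# FES construction sketch (steps 2-4): short-circuit the subnetwork,
# compute its throughput tau_n by MVA for each population n, and use
# mu_n = tau_n as the state-dependent rate of the flow-equivalent
# server.  The demands are illustrative.

def mva_throughputs(D, n_max):
    """Throughputs of a closed separable subnetwork for populations 0..n_max."""
    M = len(D)
    L = [0.0] * M
    taus = [0.0]                   # tau_0 = 0: no jobs, no flow
    for n in range(1, n_max + 1):
        B = [D[i] * (1 + L[i]) for i in range(M)]
        T = n / sum(B)
        taus.append(T)
        L = [T * B[i] for i in range(M)]
    return taus

mu = mva_throughputs([0.02, 0.03, 0.05], n_max=8)  # FES rates mu_n = tau_n
```

The FES rates increase with population and approach the bottleneck bound $1/\max_i D_i$, as expected.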



     %
     %


\subsection{Application: A multiprogramming system with virtual
  memory}
\begin{itemize}
\item Let us now revisit the multiprogrammed virtual memory system,
  this time with terminals rather than batch processing:
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/mpvm-system.eps}}
  \end{center}
  \caption{Multiprogramming system with virtual memory  }
  \label{mpvm-system}
\end{figure}

\item The average think time of each user is 30 seconds
\item Q: How does the system throughput and response time vary with
  the number of terminals, $K$, and at what point does the system
  start to thrash?
\item As users submit jobs, they increase the population at the
  computer system and consume memory
\item The problem is that the performance of the computer subsystem
  depends on the number of jobs there
\item But the number of jobs at the subsystem is governed by the
  number of terminals
\item Specifically, the Disk 1 service rate depends on the population
  of the entire subnetwork: \underline{no simple (product-form) solution}
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/mpvm-system-step1.eps}}
  \end{center}
  \caption{Multiprogramming system with virtual memory  - step 1}
  \label{mpvm-system-step1}
\end{figure}


\item In the batch model we previously computed the throughput,
  $\tau_n$, when there were $n$ batch jobs being processed
\item We can thus aggregate the server subsystem into a {\em
    flow-equivalent server} whose service rate at population $n$ is 
  $$\mu_n = \tau_n$$
\item That is
 \begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{pictures/mpvm-system-fes.eps}}
  \end{center}
  \caption{Multiprogramming system with virtual memory  - FES}
  \label{mpvm-system-fes}
\end{figure}

\item This is an approximation, because the original network lacks a
  product-form solution

\item The resulting reduced network is a simple M/M/1 queue with state
  dependent arrival and service rates:
  \begin{eqnarray*}
    \lambda_n & = & (K-n)\lambda\\
    \mu_n & = & \tau_n
  \end{eqnarray*}
  where $0\leq n \leq K$ and where $\lambda = 1/30$ is the submission
  rate for a single terminal in the ``think'' state
\item Let $p_n$ be the equilibrium probability that there are $n$ jobs
  at the subsystem (i.e. $K-n$ users in ``think'' mode)
\item The magic formula for the M/M/1 queue gives
  \begin{eqnarray*}
    p_n & = &  \frac{(K-n+1)\lambda}{\mu_n} \frac{(K-n+2)\lambda}{\mu_{n-1}}
    \dots \frac{K\lambda}{\mu_1} p_0 \\
    & = & \lambda^n \frac{K!}{(K-n)!}\Big(\prod_{i=1}^{n}\frac{1}{\mu_i}\Big) p_0
  \end{eqnarray*}
  with the usual constraint that $$p_0 + p_1 + \dots + p_K = 1$$
\item The algorithm requires one loop to compute $p_0$ and a further
  loop to find the $p_n$, $1\leq n \leq K$
\item We are interested here in the throughput and mean response time
  for a given value of $K$
\item The throughput is:
  $$\tau = p_0\mu_0 + p_1\mu_1 + \dots + p_K\mu_K$$
\item We next need to find the mean population of the subsystem:
  $$L = 0 \times p_0 + 1 \times p_1 + \dots + K \times p_K$$
\item The mean waiting time then follows from Little's law
  $$W = L/\tau$$
\item With the same paging model as the batch system earlier, this is
  what happens to $\tau$ as we increase $K$

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/multitermTchart.eps}}
  \end{center}
  \caption{ Throughput vs. number of terminals. }
  \label{mpvm-system-tau-v-K}
\end{figure}


\item The saturation point is clear! Beyond this the mean population
  of the computer subsystem is such that thrashing occurs
\item Here is the behaviour of the mean waiting time as we increase
  $K$
\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/multitermWchart.eps}}
  \end{center}
  \caption{ Mean response time  vs. number of terminals. }
  \label{mpvm-system-W-v-K}
\end{figure}

\item And finally (for the record), here is $L$:

\begin{figure}[tbp]
  \hfill
  \begin{center}
    \fbox{\includegraphics[height=3.5cm]{diags/multitermLchart.eps}}
  \end{center}
  \caption{ Central server population  vs. number of terminals. }
  \label{mpvm-system-L-v-K}
\end{figure}

 
\end{itemize}
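The whole reduced model can be sketched as follows; the $\tau_n$ values are illustrative placeholders for the batch-model throughputs, and $\lambda = 1/30$ as above.

```python
# Birth-death solution of the reduced terminal/FES model:
# lambda_n = (K - n) lambda, mu_n = tau_n.  The tau_n values are
# illustrative placeholders for the batch-model throughputs.

def terminal_model(tau, lam):
    K = len(tau) - 1
    p = [1.0]                                  # unnormalised, p_0 = 1
    for n in range(1, K + 1):                  # the "magic formula"
        p.append(p[-1] * (K - n + 1) * lam / tau[n])
    Z = sum(p)                                 # normalising constant
    p = [x / Z for x in p]
    T = sum(p[n] * tau[n] for n in range(1, K + 1))  # throughput
    L = sum(n * p[n] for n in range(K + 1))          # mean population
    return T, L, L / T                               # W by Little's law

tau = [0.0, 10.0, 15.0, 18.0, 19.0, 19.5]      # tau_0 = 0, illustrative
T, L, W = terminal_model(tau, lam=1/30)
```

Sweeping `K` (extending `tau` accordingly) reproduces the throughput and response-time curves in the figures.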




     %
     %


\section{The Markov Modulated  Poisson Process (MMPP)}
\subsection{Introduction}
\begin{itemize}
\bigskip \item The Poisson process is a good model for many arrival streams
\begin{itemize}
\item more short interarrival times than long
\item superposition of many sparse independent renewal processes is asymptotically
Poisson
\item right intuitive properties given minimal information (i.e.
arrival {\it rate})
\end{itemize}
\bigskip \item But cannot model many characteristics seen in modern networks
\begin{itemize}
\item multiple traffic types with known switching pattern between types
\item correlated traffic, i.e. non-zero correlation between
interarrival times
\item self-similarity
\end{itemize}
\bigskip \item MMPP accounts for some of these limitations
\begin{itemize}
\item models stochastically changing modes with different rates
\item has non-zero autocorrelation which can be calculated
\item integrates well into queueing models -- the MMPP/M/1 queue
\end{itemize}
\end{itemize}


\subsection{Definition}
\begin{itemize}
\bigskip \item The MMPP is a Poisson process with rate that depends on the state
(called {\it phase}) of an independent continuous time finite Markov chain $X(t)$
\begin{itemize}
\item $N$ is the number of phases
\item $\lambda_k$ is the rate of the Poisson process in phase $k$
\item so the probability of an arrival in a small interval of length $h$ in
phase $k$ is $$\lambda_k h + o(h)$$
\end{itemize}
\bigskip \item The modulating Markov chain $X(t)$ is assumed irreducible and so
has a steady state
\begin{itemize}
\item let the generator matrix (instantaneous transition rate matrix) be $Q$
\item let the equilibrium probability of being in state $k$ be $$\pi_k = \lim_{t
\rightarrow \infty} P(X(t)=k)$$
\end{itemize}
\bigskip \item Average arrival rate in steady state is then $${\overline \lambda}
= \sum_{k=1}^N \pi_k \lambda_k $$
\end{itemize}


\subsection{State at Arrival Instants}
\begin{itemize}
\bigskip \item Important for response time distributions
\bigskip \item Arrivals do {\it
not} have the Random Observer Property
\bigskip \item In a long period of time $T$ at equilibrium
\begin{itemize}
\item expected time spent in state $k$ is $\pi_kT$
\item expected number of arrivals that see state $k$ = $\lambda_k\pi_kT$
\item total number of arrivals has mean ${\overline \lambda} T$
\end{itemize}
\bigskip \item So probability that an arrival sees state $k$ in steady state
is $$\pi'_k = \frac{\lambda_k\pi_k}{{\overline \lambda}} $$
\end{itemize}
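A minimal numerical sketch of these definitions, assuming an illustrative 2-phase modulating chain (the generator and rates below are made-up values):

```python
# Equilibrium phase probabilities pi (pi Q = 0, sum pi = 1), the mean
# arrival rate, and the arrival-seen distribution pi'.  The 2-phase
# generator and rates are illustrative.
import numpy as np

Q = np.array([[-0.1,  0.1],
              [ 0.4, -0.4]])      # generator of the modulating chain X(t)
lam = np.array([5.0, 1.0])        # Poisson rate in each phase

A = np.vstack([Q.T, np.ones(2)])  # pi Q = 0 plus normalisation
b = np.array([0.0, 0.0, 1.0])
pi = np.linalg.lstsq(A, b, rcond=None)[0]

lam_bar = pi @ lam                # average arrival rate
pi_arr = pi * lam / lam_bar       # probability an arrival sees phase k
```

Note that the fast phase is over-represented at arrival instants, exactly the failure of the Random Observer Property described above.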

\subsection{The MMPP/M/1 queue}
\begin{itemize}
\bigskip \item Suppose the queue has MMPP arrival process as defined above
and server with exponential service times of constant mean $\mu^{-1}$
\bigskip \item Simple CTMC
\begin{itemize}
\item easy to write down the balance equations
\item apply steady state theorem
\item conditions harder to derive rigorously but expect that the queue has a
steady state iff $${\overline \lambda} < \mu$$
\end{itemize}
\bigskip \item The CTMC has a 2-dimensional state space
\begin{itemize}
\item phase of the MMPP horizontally (say)
\item queue length vertically
\item infinite lattice strip for unbounded queue
\end{itemize}
\bigskip \item Let the steady state probability for phase $k$ and queue length
$j$ be denoted by $\pi_{jk}$ and let the vector ${\bf v}_j =
(\pi_{j1},\pi_{j2},\ldots,\pi_{jN})$
\bigskip \item Balance equations are, for queue length $j+1$ and phase $i$ (taking $j \geq 0$)
\begin{eqnarray*}
\pi_{j+1,i}\left(  \lambda_{i} + \mu + \sum_{k \neq i} q_{ik}
\right) =  \sum_{k \neq i}  \pi_{j+1,k} q_{ki}
+\lambda_i \pi_{j,i}
+\mu \pi_{j+2,i}
\end{eqnarray*}
\end{itemize}

\subsection{The MMPP/M/1 queue (continued)}
\begin{itemize}
\bigskip \item This may be written in matrix form as
$${\bf v}_j \Lambda + {\bf v}_{j+1}(Q - \Lambda - \mu {I}) + \mu {\bf
v}_{j+2} I = {\bf 0}$$ where $I$ is the unit $N \times N$  matrix and $\Lambda$
is the diagonal matrix with the arrival rates $\lambda_k$ down the diagonal, i.e.
$\Lambda =
\mbox{diag}(\lambda_1,\ldots,\lambda_N)$
\bigskip \item For states $(i,0)$ we get $$
{\bf v}_{0}(Q - \Lambda) + \mu {\bf
v}_{1} I = {\bf 0}$$
\bigskip \item The normalising equation is $$
\sum_{j=0}^\infty {\bf v}_{j}\cdot(1,1,\ldots,1) = 1$$
\item These equations give the unique equilibrium probabilities if ${\overline
\lambda} <
\mu$
\end{itemize}
\subsection{Exact solution methods}
\begin{itemize}
\bigskip \item We have a matrix recurrence relation of the form
$${\bf v}_j Q_0 + {\bf v}_{j+1} Q_1 + {\bf
v}_{j+2} Q_2 = {\bf 0}$$
for $j \geq 0$ and constant matrices $Q_0, Q_1, Q_2$
\bigskip \item {\bf Spectral Analysis} (Chakka and Mitrani; see Mitrani's book,
last chapter)
\begin{itemize}
\item find the eigenvalues $\xi_k$ and eigenvectors ${\bf e}_k$ of a certain
$N$-dimensional matrix given by $Q$ and $(\lambda_1,\ldots,\lambda_N)$
\item the solution can then be shown to be, for a queue of unbounded capacity, $${\bf v}_j
= \sum_{k=1}^N  a_k \xi_k^j {\bf e}_k$$ where the  $a_k$ are constants determined
by the equations at queue length 0 and normalisation
\item more complex result for queues of finite capacity
\end{itemize}
\bigskip \item {\bf Matrix Geometric Method}
\begin{itemize}
\item Closely related to Spectral Analysis but a completely different approach
\item Try a solution of the form ${\bf v}_j = {\bf u}M^j$ for some matrix $M$
and vector ${\bf u}$
\item Then solve $$Q_0 + MQ_1 + M^2Q_2=0$$ for $M$
\item Determine ${\bf u}$ from the boundary equations and normalisation
\end{itemize}
\end{itemize}
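As a sketch of the matrix-geometric idea, the quadratic matrix equation can be solved by successive substitution starting from the zero matrix, which converges to the minimal solution when the queue is stable. This is one standard iteration, not the only one; the 2-phase MMPP parameters are illustrative, with $\mu = 6$ exceeding the mean rate $4.2$.

```python
# Successive substitution for the matrix-geometric equation
# Q0 + M Q1 + M^2 Q2 = 0, starting from M = 0 (converges to the
# minimal solution in the stable case).  Parameters are illustrative.
import numpy as np

Q = np.array([[-0.1,  0.1],
              [ 0.4, -0.4]])     # modulating-chain generator
Lam = np.diag([5.0, 1.0])        # per-phase arrival rates
mu = 6.0                         # service rate, > mean arrival rate 4.2
I = np.eye(2)
Q0, Q1, Q2 = Lam, Q - Lam - mu * I, mu * I

M = np.zeros((2, 2))
Q1_inv = np.linalg.inv(Q1)
for _ in range(2000):
    M = -(Q0 + M @ M @ Q2) @ Q1_inv     # rearranged fixed-point form

residual = np.linalg.norm(Q0 + M @ Q1 + M @ M @ Q2)
```

Stability shows up as the spectral radius of $M$ being below 1, so that ${\bf v}_j = {\bf u}M^j$ decays geometrically in $j$.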
\subsection{Approximate solution methods}
\begin{itemize}
\bigskip \item It often happens that phase changes are rare in comparison to
arrivals and departures
\begin{itemize}
\item rates in the $Q$-matrix are $\ll \lambda_1,\ldots,\lambda_N, \mu$
\item e.g. phase changes represent change of traffic type, such as data, video,
voice
\end{itemize}
\bigskip \item It is then often a good approximation to use (approximate)
decomposition, cf. queueing networks
\bigskip \item For each phase $k$, solve the M/M/1 queue assuming the arrival
rate is constant,
$\lambda_k$, giving equilibrium probability $p_j^{(k)}$ for queue length $j$, $1\leq k\leq N$
\begin{itemize}
\item i.e. solve all the `columns' in the 2-D state space
\item can only work if every $\lambda_k < \mu$
\end{itemize}
\bigskip \item Then solve the phase process for the equilibrium probability $r_k$
of being in phase $k$, i.e. solve $${\bf r} Q = {\bf 0}$$ subject to $\sum_{k=1}^N r_k = 1$
\bigskip \item Approximate the solution by
$$ \pi_{jk} = p^{(k)}_j r_k
$$
\end{itemize}
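A minimal sketch of the decomposition approximation, assuming an illustrative 2-phase MMPP whose phase-change rates are much smaller than the arrival and service rates, as the method requires:

```python
# Decomposition approximation for the MMPP/M/1 queue: per-phase M/M/1
# marginals weighted by the phase distribution r.  Parameters are
# illustrative, with slow phase changes.
import numpy as np

Q = np.array([[-0.001,  0.001],
              [ 0.004, -0.004]])        # slow modulating chain
lam = np.array([5.0, 1.0])
mu = 6.0
rho = lam / mu                          # both < 1, as the method requires

A = np.vstack([Q.T, np.ones(2)])        # r Q = 0, sum r = 1
r = np.linalg.lstsq(A, np.array([0.0, 0.0, 1.0]), rcond=None)[0]

def pi_approx(j, k):
    """Approximate P(queue length j, phase k) = p_j^(k) r_k."""
    return (1 - rho[k]) * rho[k] ** j * r[k]
```

Each `j`-column is an ordinary M/M/1 geometric distribution, so the approximation sums to 1 by construction.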
\subsection{Fitting MMPPs}
\begin{itemize}
\bigskip \item MMPP can be fitted to observed data by matching:
\begin{itemize}
\item average arrival rate ${\overline \lambda}$ (or mean interarrival time)
\item higher moments of interarrival time (variance etc.)
\item autocorrelation function (ACF)
\item Hurst parameter, a measure of self-similarity
\end{itemize}
\bigskip \item Matching moments is easy but often gives a poor match on
correlation, the main reason for using the MMPP in the first place!
\bigskip \item Can use ACF to model correlated traffic directly
\begin{itemize}
\item next slide
\item computationally difficult and expensive
\item possibly use spectral methods, cf. Fourier transforms
\end{itemize}
\bigskip \item Hurst parameter gives a good representation of `how correlated'
traffic is, but only contributes one parameter in the matching process
\begin{itemize}
\item  Better used to validate another matching process
\end{itemize}

\end{itemize}

\subsection{Autocorrelation function of the MMPP}
\begin{itemize}
\bigskip \item The autocorrelation function at lag $k\ge 1$ for the above MMPP
is
\begin{eqnarray*}
\rho_k = \frac{\pi^* R^{-2}\Lambda[\{I - {\bf e} \pi^* R^{-1}
\Lambda\}\{R^{-1}\Lambda\}^{k-1}]R^{-2}\Lambda {\bf e}}{2\pi^*
R^{-3} \Lambda {\bf e} - ({\pi^* R^{-2} \Lambda {\bf
e}})^{2}}\label{eq:acfm}
\end{eqnarray*}
where the matrix $R = \Lambda - Q$,  ${\bf e} = (1,1,\ldots,1)$, $\boldsymbol{\lambda} = (\lambda_1,\ldots,\lambda_N)$ and $\pi^* = \frac{\pi\Lambda}{\pi \cdot \boldsymbol{\lambda}}$.
\bigskip \item Autocorrelation function is easily calculated by the above
formula, {\it given} the parameters of the MMPP
\bigskip \item But the converse, to determine the parameters from the ACF, is
hard!
\begin{itemize}
\item gives non-linear equations in several variables (the sought-after
parameters)
\item fixed point methods of solution tend to be unstable
\item research area
\end{itemize}
\end{itemize}























     %
     %






\begin{thebibliography}{99}

\bibitem[IM]{Mitrani}Isi Mitrani, {\em Probabilistic Modelling}, CUP
  1998, ISBN 0 521 58530 9
\end{thebibliography}
\end{document}



