Extensions of the Single Server Efficiency Model
For the full paper, which includes all proofs, click here.
Abstract
Editor’s note: this paper comprises the third chapter of the PhD dissertation by Rachel Traylor. Visit here and here to see chapters one and two, respectively. Herein we further generalize the single server model presented in [3]. In particular, we consider a multichannel server under the cases of both singular and clustered tasks. In the instance of singular tasks, we present a load balancing allocation scheme and obtain a stochastic breakdown rate process, as well as derive the conditional survival function as a result. The model of a multichannel server taking in clustered tasks gives rise to two possibilities, namely, independent and correlated channels. We derive survival functions for both of these scenarios.
Load Balancing Allocation for a Multichannel Server
Model Description
Previously, we had assumed that a web server functions as a single queue that attempts to process jobs as soon as they arrive. These jobs originally brought a constant stress \eta to the server, with the system stress reducing by \eta at the completion of each job.
Now, suppose we have a server partitioned into K channels. Denote each channel as Q_k, k = 1,\ldots,K. Jobs arrive via a nonhomogeneous Poisson process with rate \lambda(t). Upon arrival, each job falls (or is routed) to the channel with the shortest queue length. If multiple channels share the shortest queue length, the job enters one of them with equal probability.
We retain the previous notation r_0(t) for the baseline breakdown rate, or hazard function, of an idle system. We also retain the assumption that the arrival times \mathbf{T} are independent, and the service times \mathfrak{W} are i.i.d. with distribution G_W(w). We assume that all channels serve jobs simultaneously, i.e., a job can be completed from any queue at any time. We do not require load balancing for service; in other words, any queue can empty while others remain backlogged. We also retain the FIFO service policy for each queue.
Since we have now “balanced,” or distributed, the load of jobs in the server, not all jobs will cause additional stress to the system. Suppose all jobs bring the same constant stress \eta upon arrival. Under load balancing, we will define the additional stress to the system as \eta\max_kQ_k. Figure 1 shows an example server with current stress of 4\eta.
Figure 1. Partitioned Server with Load Balancing.
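The routing rule and the stress computation are simple enough to sketch in code. The following Python snippet is illustrative only: the function names and the K = 2 example are ours, not part of the model. It routes arrivals to a shortest queue with uniform tie-breaking and computes the added stress \eta\max_{k}Q_{k}:

```python
import random

def route(queues):
    """Return the index of a shortest queue, breaking ties uniformly at random."""
    shortest = min(queues)
    candidates = [k for k, q in enumerate(queues) if q == shortest]
    return random.choice(candidates)

def added_stress(queues, eta):
    """Workload stress under load balancing: eta * max_k Q_k."""
    return eta * max(queues)

# Hypothetical run: K = 2 channels, three arrivals, no departures.
queues = [0, 0]
for _ in range(3):
    queues[route(queues)] += 1
print(queues, added_stress(queues, eta=1.0))  # queues end at lengths 1 and 2; stress 2.0
```

With three arrivals and no departures, the queues always end at lengths 1 and 2 regardless of tie-breaking, so the added stress is 2\eta, matching Example 1 below.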
Examples
Due to the dynamic nature of arrival times, allocation to queues, and service times, we have many possible configurations of jobs at any point in time. Therefore, the allocation scheme adds an additional layer of variation to the service times and order of service. The placement of jobs in the various queues (and thus the order of service and service times) is wholly dependent on all arrival times and service times of the prior arrivals. The following examples illustrate the effect on the workload stress added to the system in various scenarios.
Figure 2(a). Example 1 Load Balancing Queue Configuration
Figure 2(b). Breakdown Rate Process Trajectory
Example 1. Suppose for simplicity we have 2 channels. Suppose at the time of observation of the system, 3 jobs have arrived and none have finished. WLOG, suppose job 3 fell into Q_{1}. See Figure 2(a). The stress to the system at t=t_{\text{obs}} is r_{0}(t_{\text{obs}}) + 2\eta, as shown in Figure 2(b).
Note in Example 1 that Job 2 does not add any additional stress to the system. Job 1 sees an empty queue upon arrival, and \max_{k}Q_{k} = 1 when it falls into any particular queue. Job 2 arrives while Job 1 is still being processed, and thus the placement of Job 1 forces Job 2 into the empty channel. Since \max_{k}Q_{k} is still 1, the stress to the system doesn't change. Job 3 arrives while Jobs 1 and 2 are in service, and thus its choice of queue is irrelevant due to the configuration of the two queues at T_{3}. Regardless of which queue Job 3 falls into, \max_{k}Q_{k} = 2. Thus the arrival of Job 3 increases the breakdown rate by \eta again.
The next example shows the change in system stress from Example 1 when Job 1 has finished processing before T_{3}.
Figure 3(a). Example 2 Load Balancing Queue Configuration
Figure 3(b). Example 2 Breakdown Rate Process Trajectory
Example 2. Consider the same two-channel system from Example 1. However, now suppose WLOG that T_{1}+W_{1} < T_{3}. In other words, service for Job 1 was completed before Job 3 arrived. Hence Job 3 will fall into the opposite queue as Job 2. The stress to the system at the time of observation would be r_{0}(t) + \eta. See Figures 3(a) and 3(b).
In this scenario, the workload due to Job 3 does not contribute any additional stress to the server. Also observe that upon completion of Job 1, the workload stress to the server does not decrease, as Job 2 still resides in the system and is being served.
Contrast this behavior with the breakdown rate process given in [3]. In the single-channel, single-server model described in both [1] and [3], each job adds stress to the server upon arrival. Under the load balancing allocation scheme, the additional stress to the server depends on the arrival and service times of all prior jobs. From a stochastic perspective, this breakdown rate process has full memory.
The examples above illustrate that \max_{k}Q_{k} depends on the intersections of the intervals I_{j} = [T_{j}, T_{j} + W_{j}], j = 1,\ldots,N(t). The next section details the methodology to obtain the configuration of jobs in the server at time t by decomposition of \bigcup_{j=1}^{N(t)}I_{j} into disjoint atoms, and derives the stochastic breakdown rate process under the load balancing allocation scheme.
Breakdown Rate Process and Conditional Survival Function
Let \epsilon = (\epsilon_{1},\ldots,\epsilon_{N(t)}) be an N(t)-tuple whose components \epsilon_{j}\in\{\emptyset, c\}, where \emptyset denotes the empty set, and c denotes the complement of the set. Let E = \{\epsilon \mathrel{:} \epsilon_{j}\in\{\emptyset, c\}\} denote the set of all possible \epsilon, excepting \epsilon = (c,\ldots,c). Then by Lemma 1 (see Appendix),
\bigcup_{j=1}^{N(t)}I_{j} = \bigcup_{\epsilon \in E}\bigcap_{j=1}^{N(t)}I_{j}^{\epsilon_{j}}

Remark. \bigcap_{j=1}^{N(t)}I_{j}^{\epsilon_{j}} indicates which jobs are still in the server at time t. The union is disjoint; thus only one \epsilon will describe the server configuration at any given time t. For example, if 3 jobs have arrived to the server at time t_{\text{obs}}, |E| = 2^{3} - 1 = 7. These may be enumerated:

 I_{1} \cap I_{2} \cap I_{3}
 I_{1} \cap I_{2} \cap I_{3}^{c}
 I_{1} \cap I_{2}^{c} \cap I_{3}
 I_{1} \cap I_{2}^{c} \cap I_{3}^{c}
 I_{1}^{c} \cap I_{2} \cap I_{3}
 I_{1}^{c} \cap I_{2} \cap I_{3}^{c}
 I_{1}^{c} \cap I_{2}^{c} \cap I_{3}
As an illustration, refer to Example 1. All three jobs are in the system at t = t_{\text{obs}} (that is, none have completed service), and thus t_{\text{obs}} \in I_{1} \cap I_{2} \cap I_{3}. Expanding, t_{\text{obs}} \in [T_{1}, T_{1}+W_{1}], [T_{2}, T_{2}+W_{2}], and [T_{3}, T_{3}+W_{3}].
Compare this case with Example 2. There, three jobs have arrived by t = t_{\text{obs}}, but Job 1 has finished by t_{\text{obs}}. Thus t_{\text{obs}} \not\in I_{1}, but since Jobs 2 and 3 are still in the system, t_{\text{obs}} \in I_{2} \cap I_{3}. Thus t_{\text{obs}} \in I_{1}^{c} \cap I_{2} \cap I_{3}.
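The atom containing t_{\text{obs}} can be found mechanically. A minimal Python sketch follows; the interval endpoints are hypothetical numbers chosen only to reproduce Example 2's ordering, with the marker 'in' playing the role of the \emptyset exponent and 'c' the complement:

```python
from itertools import product

def atom_of(t, intervals):
    """Return the marker tuple ('in' for I_j, 'c' for its complement)
    naming the disjoint atom of the decomposition that contains time t."""
    return tuple('in' if L <= t <= R else 'c' for (L, R) in intervals)

# Example 2's configuration: Job 1 finished before t_obs; Jobs 2 and 3 in service.
intervals = [(0.0, 1.0), (0.5, 3.0), (1.5, 4.0)]   # I_1, I_2, I_3 (hypothetical)
t_obs = 2.0
print(atom_of(t_obs, intervals))                    # ('c', 'in', 'in'), i.e. I_1^c ∩ I_2 ∩ I_3

# All admissible tuples: everything except (c, c, c).
E = [eps for eps in product(['in', 'c'], repeat=3) if eps != ('c', 'c', 'c')]
print(len(E))                                       # 7
```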
Now, since the additional workload stress to the server is a multiple of \eta, it remains to derive the appropriate multiplier that accounts for the number of jobs contributing additional stress to the system.
Let n = \sum_{j=1}^{N(t)}\mathbf{1}(\epsilon_{j} = \emptyset) count the components of a particular \epsilon equal to \emptyset, and let \alpha_{\epsilon} be the multiplier that gives the number of jobs contributing stress \eta to the system. Under [1] and the generalization in [3], every uncompleted job in the system contributes stress, and thus \alpha_{\epsilon} = n.
Under the load balancing scheme, \alpha_{\epsilon} = \lfloor\frac{n+1}{K}\rfloor, where K is the number of channels in the server. This is due to the allocation scheme's attempt to distribute jobs evenly across channels. Thus, for Example 1, n=3 and K=2, meaning \alpha_{\epsilon} = \lfloor\frac{3+1}{2}\rfloor = 2, as illustrated in Figure 2(b); for Example 2, n=2, and \alpha_{\epsilon} = \lfloor\frac{2+1}{2}\rfloor = 1, as in Figure 3(b).
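A quick computational check of the multiplier against Examples 1 and 2 (the function name is ours):

```python
def alpha(n, K):
    """Number of stress-contributing jobs under load balancing:
    floor((n + 1) / K), where n jobs remain in a K-channel server."""
    return (n + 1) // K

print(alpha(3, 2))  # Example 1: n = 3, K = 2  ->  2
print(alpha(2, 2))  # Example 2: n = 2, K = 2  ->  1
```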
Then, the stochastic breakdown rate process under the load balancing allocation scheme is given by
\mathcal{B}(t) = r_{0}(t) + \eta\sum_{\epsilon \in E}\alpha_{\epsilon}\mathbf{1}_{I_{1}^{\epsilon_{1}}\cap I_{2}^{\epsilon_{2}}\cap\cdots\cap I_{N(t)}^{\epsilon_{N(t)}}}(t)
Under this expression, only one indicator function will be nonzero at any given point in time, since all atoms are disjoint. Now, I_{1}^{\epsilon_{1}} \cap I_{2}^{\epsilon_{2}} \cap \ldots \cap I_{N(t)}^{\epsilon_{N(t)}} may be expressed as one interval [L_{\epsilon},R_{\epsilon}], where
\begin{aligned}L_{\epsilon}&=\max\left(\{T_{j}\mathrel{:}\epsilon_{j} = \emptyset\}_{j=1}^{N(t)}\right);\\R_{\epsilon}&=\min\left(\{T_{j} + W_{j}\mathrel{:}\epsilon_{j} = \emptyset \}_{j=1}^{N(t)},\{T_{j}\mathrel{:}\epsilon_{j} = c\}_{j=1}^{N(t)}\right)\end{aligned}

Thus, for a server with K channels under a load balancing routing scheme with all jobs bringing constant stress \eta, the breakdown rate process \mathcal{B}(t) may be expressed as
\mathcal{B}(t) = r_{0}(t) + \eta\sum_{\epsilon \in E}\alpha_{\epsilon}\mathbf{1}_{[L_{\epsilon}, R_{\epsilon}]}(t)\qquad(*)

Thus, the conditional survival function under the load balancing scheme is given by
\begin{aligned}S_{Y\mid\mathfrak{T},\mathfrak{W}, N(t)}(t\mid\mathfrak{t},\mathfrak{w}, n) &= e^{-\int_{0}^{t}\mathcal{B}(s)ds}\\&=\bar{F}_{0}(t)\exp\left(-\eta\int_{0}^{t}\sum_{\epsilon \in E}\alpha_{\epsilon}\mathbf{1}_{[L_{\epsilon}, R_{\epsilon}]}(s)ds\right)\\&=\bar{F}_{0}(t)\exp\left(-\eta\sum_{\epsilon\in E}\alpha_{\epsilon}\left(\min(t,R_{\epsilon})-L_{\epsilon}\right)\right) \end{aligned}
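Because the atoms are disjoint, exactly one indicator is active at any time, so \mathcal{B}(t) can equivalently be evaluated pointwise as r_{0}(t) + \eta\lfloor(n(t)+1)/K\rfloor, where n(t) counts the jobs whose service intervals contain t. The sketch below uses hypothetical job data and a hypothetical baseline rate (neither is from the paper) to evaluate \mathcal{B} this way and integrate it numerically, approximating the conditional survival function:

```python
import math

def breakdown_rate(t, jobs, K, eta, r0):
    """B(t) = r0(t) + eta * floor((n(t) + 1) / K), where n(t) is the number of
    jobs whose service interval [T_j, T_j + W_j] contains t.  Pointwise
    equivalent to the disjoint-atom sum, since exactly one atom contains t."""
    n = sum(1 for (T, W) in jobs if T <= t <= T + W)
    return r0(t) + eta * ((n + 1) // K)

def survival(t, jobs, K, eta, r0, steps=20000):
    """Conditional survival S(t) = exp(-int_0^t B(s) ds), via the midpoint rule."""
    h = t / steps
    total = sum(breakdown_rate((i + 0.5) * h, jobs, K, eta, r0) for i in range(steps))
    return math.exp(-h * total)

# Hypothetical trajectory: constant baseline rate, two channels, three jobs.
r0 = lambda s: 0.1
jobs = [(0.0, 1.0), (0.2, 2.0), (1.5, 2.0)]  # (T_j, W_j) pairs, arbitrary numbers
print(survival(2.0, jobs, K=2, eta=0.5, r0=r0))
```

Since \mathcal{B}(t) \geq r_{0}(t) > 0 here, the computed survival probability is strictly decreasing in t, as expected.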
Remarks
Finding the survival function in the single-channel environment relied on the independence of the set of arrival times and service times. From (*), that independence is clearly lost. As noted before, the random breakdown rate process has full memory, and thus is completely dependent upon the entire trajectory up to t = t_{\text{obs}}.
Clustered Tasks in a Multichannel Server
Figure 4. Clustered Tasks in a Multichannel Server
The previous multichannel server model in Section 1 implicitly assumed each job comes with one task, and all channels are identical in their ability to serve any task brought by a job. A classic illustration is a block of registers at a retail establishment. Each customer will survey the length of the various queues at each register before choosing the shortest queue. Viewing each of these separate registers as a channel in a single server under these conditions gave rise to the load balancing allocation model detailed in the previous section. This section presents a different interpretation of a multichannel, singleserver model.
Suppose a server has multiple channels Q_{1},\ldots,Q_{K}, but each channel serves a different type of task. A customer arrives to the server and may select any number from 0 to K tasks for the server to perform. Said customer will select each possible task j with probability p_{j}. Figure 4 illustrates an example of such a situation in which three customers visit the server and each customer picks a different number and set of tasks at random. A customer is considered fully serviced (i.e. the job is complete) upon completion of the last task belonging to that particular customer.
Model Assumptions
The following mathematical assumptions are made for the multichannel server with clustered tasks:
 Customers arrive to the server with K channels via a nonhomogeneous Poisson process (NHPP) with intensity \lambda(t).
 The breakdown rate of the idle server is given by r_{0}(t).
 Each channel corresponds to a different task the server can perform.
 The selection of each task is a Bernoulli random variable with probability p_{k}. Thus the number of tasks selected by each customer is a binomial random variable.
 The workload stress to the server is a constant multiple \eta of the number of tasks requested by the customer, i.e. the additional stress is given by \eta N, where N is the number of tasks requested.
 The PDF of each channel’s service time is given by g_{i}(w), i = 1,\ldots ,K. Since the customer’s service is not complete until all requested tasks have finished, the service life distribution for the customers is given by \max_{i}G_{i}(w).
Under these assumptions, this model is a special interpretation of the random stress environment developed in [3]. In this case, the random workload stress is \eta N, where N is a binomial random variable, and the service life distribution G_{W}(w) = \max_{i}G_{i}(w), which may be easily obtained through the mathematical properties of order statistics. Two variations are considered in the next section: independent channels and correlated channels.
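As a concrete illustration of assumptions (4)–(6), the following sketch simulates a single customer at a clustered-task server. The selection probabilities and the exponential service draw are hypothetical choices for the example, not part of the model:

```python
import random

def serve_customer(p, service_draw, eta):
    """One customer at a K-channel clustered-task server: each task k is
    requested independently with probability p[k]; the added stress is
    eta * N, and the customer leaves when the slowest requested task finishes."""
    requested = [k for k in range(len(p)) if random.random() < p[k]]
    N = len(requested)
    stress = eta * N
    finish = max((service_draw(k) for k in requested), default=0.0)
    return N, stress, finish

# Hypothetical example: K = 3 tasks, equal selection probability,
# exponential(1) service times on every channel.
random.seed(7)
p = [0.5, 0.5, 0.5]
N, stress, finish = serve_customer(p, service_draw=lambda k: random.expovariate(1.0), eta=2.0)
print(N, stress, finish)
```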
Independent Channels in a Clustered Task Server
Suppose the selection probabilities for each task in the server are identical, that is, p_{1}=p_{2}=\ldots=p_{K}=p. Then N\sim\text{Bin}(K,p). Using Theorem 3 in [3], the survival function of the multichannel server is given in the following theorem.
Theorem 1 (Survival Function of Multichannel Server with Clustered Tasks and Independent Channels). Suppose model assumptions (1)–(6) above are satisfied. In addition, assume p_{1}=p_{2}=\ldots=p_{K}=p. Then the survival function of the server is given by
\begin{aligned}S_{Y}(t)&=\bar{F}_{0}(t)\exp\left(-K\eta\left[e^{-\eta t}\left(1-p+pe^{-\eta t}\right)^{K-1}-p(1-p)^{K-1}\right]\right.\\&\qquad\left.\times\int_{0}^{t}m(t-w)\bar{G}_{W}(w)dw\right)\end{aligned}
where m(x) = \int_{0}^{x}\lambda(s)ds, \bar{F}_{0}(t) = e^{-\int_{0}^{t}r_{0}(s)ds}, \bar{G}_{W}(w) = 1-G_{W}(w), and G_{W}(w) = \max_{i}G_{i}(w).
Correlated Channels in a Cluster Server
Now suppose the server tasks are correlated, in that the selection of one particular task may affect the selection of any or all of the other tasks. Thus the channels are a sequence of dependent Bernoulli random variables. The construction of dependent Bernoulli random variables is given in [2], and a summary is given.
Dependent Bernoulli Random Variables and the Generalized Binomial Distribution
Korzeniowski [2] constructs a sequence of dependent Bernoulli random variables using a binary tree that distributes probability mass over dyadic partitions of [0,1]. Let 0\leq\delta\leq1, 0<p<1, and q=1-p. Then define the following quantities:
\begin{aligned}q^{+}\mathrel{:=}q+\delta p&\qquad p^{+}\mathrel{:=}p+\delta q\\q^{-}\mathrel{:=}q(1-\delta)&\qquad p^{-}\mathrel{:=}p(1-\delta)\end{aligned}
The quantities above satisfy the following conditions:
\begin{aligned}q^{+}+p^{-}=q^{-}&+p^{+}=q+p=1\\qq^{+}+pq^{-}=q&,\quad qp^{-}+pp^{+}=p\end{aligned}
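These identities are easy to verify numerically. A minimal sketch (the function name is ours):

```python
def dependent_probs(p, delta):
    """Korzeniowski's adjusted probabilities for dependency coefficient delta."""
    q = 1 - p
    return {'q+': q + delta * p, 'p+': p + delta * q,
            'q-': q * (1 - delta), 'p-': p * (1 - delta)}

p, delta = 0.25, 0.5
d, q = dependent_probs(p, delta), 1 - p
assert abs(d['q+'] + d['p-'] - 1) < 1e-12          # q+ + p- = 1
assert abs(d['q-'] + d['p+'] - 1) < 1e-12          # q- + p+ = 1
assert abs(q * d['q+'] + p * d['q-'] - q) < 1e-12  # marginal failure probability stays q
assert abs(q * d['p-'] + p * d['p+'] - p) < 1e-12  # marginal success probability stays p
```

The last two identities are what make the construction work: despite the dependency, each variable in the sequence remains marginally Bernoulli(p).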
Figure 5. Construction of Dependent Bernoulli Random Variables
Figure 5 shows the construction and the dependencies it induces. The following examples using coin flips illustrate the effect of the dependency coefficient \delta:
Example 1 (\delta=1). For \delta = 1, q^{+} = q+p=1, q^{-} = 0, p^{+} = p+q = 1, and p^{-} = 0. Suppose the first coin flip gives \epsilon_{1} = 1. Then every successive \epsilon_{i} will also be 1; similarly if \epsilon_{1} = 0. Thus the result of the first coin flip completely determines the outcomes of all the rest.
Example 2 (\delta = 0). For \delta = 0, q^{+} = q^{-} = q, and p^{+} = p^{-} = p. Thus the first coin flip (and each subsequent one) has no effect on the ones that follow.
Example 3 (\delta = \frac{1}{4}). Suppose p=q=\frac{1}{2}. Then p^{+} = q^{+} = \frac{5}{8}, and p^{-} = q^{-} = \frac{3}{8}. Then the subsequent outcomes \epsilon_{i}, i \geq 2, are more likely to match the outcome of \epsilon_{1} than not.
Now suppose p = \frac{1}{4}, q = \frac{3}{4}. Then p^{+} = \frac{7}{16}, p^{-} = \frac{3}{16}, q^{+} = \frac{13}{16}, and q^{-} = \frac{9}{16}. In this example of an unfair coin, the dependency coefficient \delta still skews the results following the first coin flip in favor of the outcome of \epsilon_{1}. However, the dependency heightens the effect of \epsilon_{1} = 0 on subsequent flips, and it cannot overcome the discrepancy between the probabilities of success and failure: subsequent flips still favor 0 even after \epsilon_{1} = 1.
Using these dependent Bernoulli random variables, [2] presents a Generalized Binomial Distribution for identically distributed but dependent Bernoulli random variables.
Generalized Binomial Distribution
Let X = \sum_{i=1}^{n}\epsilon_{i}, where \epsilon_{i}, i = 1,\ldots ,n are identically distributed Bernoulli random variables with probability of success p and dependency coefficient \delta. Then
P(X=k) = q\tbinom{n-1}{k}(p^{-})^{k}(q^{+})^{n-1-k} + p\tbinom{n-1}{k-1}(p^{+})^{k-1}(q^{-})^{n-1-(k-1)}
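A direct implementation of this pmf (the function name is ours) makes two sanity checks easy to run: the probabilities sum to 1 because p^{-}+q^{+} = p^{+}+q^{-} = 1, and \delta = 0 recovers the ordinary binomial distribution:

```python
from math import comb

def gen_binomial_pmf(k, n, p, delta):
    """P(X = k) for X the sum of n identically distributed Bernoulli(p)
    variables with dependency coefficient delta (Korzeniowski's construction)."""
    q = 1 - p
    qp, pp = q + delta * p, p + delta * q      # q+, p+
    qm, pm = q * (1 - delta), p * (1 - delta)  # q-, p-
    first = q * comb(n - 1, k) * pm**k * qp**(n - 1 - k) if k <= n - 1 else 0.0
    second = p * comb(n - 1, k - 1) * pp**(k - 1) * qm**(n - k) if k >= 1 else 0.0
    return first + second

n, p, delta = 6, 0.3, 0.4
print(sum(gen_binomial_pmf(k, n, p, delta) for k in range(n + 1)))  # ≈ 1.0
```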
Survival Function of Correlated Channels in a Cluster Server
Suppose the selection of tasks may be modeled by the dependent Bernoulli random variables given in the previous section. That is, suppose the customer selects Tasks 1–K in sequence, and the selection or rejection of Task 1 affects all subsequent tasks through a dependency coefficient \delta. From [2], the correlation between task selections \epsilon_{i} and \epsilon_{j} is given by
\rho = \text{Cor}(\epsilon_{i},\epsilon_{j})=\begin{cases}\delta,&i=1;j=2,\ldots,K\\\delta^{2},&i\neq j;i,j\geq 2\end{cases}
This illustrates the dependency of Tasks 2–K on the outcome of Task 1; while Tasks 2–K are still correlated with each other, that dependency is much weaker. In a similar fashion to the independent-channel server, the survival function is derived.
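Since in this construction the \epsilon_{i}, i \geq 2, are conditionally independent given \epsilon_{1} (taking success probability p^{+} after \epsilon_{1}=1 and p^{-} after \epsilon_{1}=0), the correlations above can be verified by direct computation rather than simulation (the function name is ours):

```python
def correlations(p, delta):
    """Exact Cor(eps_1, eps_j) and Cor(eps_i, eps_j) for i, j >= 2, i != j,
    under Korzeniowski's construction: the eps_i (i >= 2) are conditionally
    independent given eps_1, with success probability p+ after eps_1 = 1
    and p- after eps_1 = 0."""
    q = 1 - p
    pp, pm = p + delta * q, p * (1 - delta)    # p+, p-
    var = p * q                                # each eps_i is marginally Bernoulli(p)
    cov_1j = p * pp - p * p                    # E[eps_1 eps_j] - p^2
    cov_ij = p * pp**2 + q * pm**2 - p * p     # condition on eps_1
    return cov_1j / var, cov_ij / var

rho_1j, rho_ij = correlations(p=0.3, delta=0.6)
print(rho_1j, rho_ij)  # ≈ 0.6 and ≈ 0.36, i.e. delta and delta**2
```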
Theorem 2 (Survival Function of Multichannel Server with Clustered Tasks and Dependent Channels). Suppose model assumptions (1)–(6) are satisfied. In addition, suppose the selection of channels 1–K is determined by identically distributed Bernoulli random variables with dependency coefficient \delta as defined in [2]. Then the survival function of the server is given by
S_{Y}(t)=\bar{F}_{0}(t)\exp\left(-\eta\int_{0}^{t}m(t-w)\bar{G}_{W}(w)\mathscr{S}(w)dw\right)
where m(x) = \int_{0}^{x}\lambda(s)ds, and
\begin{aligned}\mathscr{S}(w)&=\sum_{n=0}^{K}e^{-\eta n w}\sum_{j=0}^{K-n-1}\tbinom{K-1}{n-1,\,j,\,K-1-n-j}p^{K-1-j}(1-p)^{j+1}\delta^{K-1-n-j}(1-\delta)^{n}\\&\quad+\sum_{n=0}^{K}ne^{-\eta nw}\sum_{i=0}^{n-1}\tbinom{K-1}{K-1-n,\,i,\,n-1-i}p^{i+1}(1-p)^{K-n}\delta^{n-1-i}(1-\delta)^{K-n-i}\end{aligned}

Conclusion
The generalized model of a server under random workload proposed in [3] admits further expansion by way of relaxing the assumption that incoming tasks have exactly one queue to enter on arrival. In considering a server partitioned into several channels, a cost is incurred, namely that additional stress to the server is dependent upon arrival and service times of all previous jobs. However, even under these circumstances, we may obtain a breakdown rate process and satisfactory conditional survival function for the server, and the door is opened to further discussion. By examining the multichannel server, we consider the interrelations of the channels themselves, and derive survival functions to meet the case when the channels are independent as well as when they are correlated.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.