Optimizing Cloud Utilization via Switching Decisions
Eugene Feinberg
Applied Mathematics and Statistics
Stony Brook University
Stony Brook, NY 11794
efeinberg@notes.cc.sunysb.edu

Xiaoxuan Zhang
IBM T.J. Watson Research Center
Yorktown Heights, NY 10598
zhangxiaoxuan@live.com
ABSTRACT
This paper studies a control problem for optimally switching on and off cloud computing services modeled by an M/M/∞ queue with holding, running, and switching costs. The main result is that an average-optimal policy either always runs the system or is an (M, N)-policy defined by two thresholds M and N: the system is switched on at an arrival epoch when the system size reaches N, and it is switched off at a departure epoch when the system size decreases to M. We compare the optimal (M, N)-policy with the classical (0, N)-policy and show that the latter may not be optimal.
Keywords
Cloud Computing, M/M/∞, Queueing Control, Markov Decision Process
1. INTRODUCTION
As cloud computing plays an increasingly important role in big data analytics, the performance evaluation and stochastic control of cloud computing centers have received growing attention in recent years. Cloud providers such as Amazon Elastic Compute Cloud [1], Google Compute Engine [5], HP Cloud Compute [9], IBM SmartCloud [10], and Microsoft Azure [16] offer computing and storage facilities for lease at various prices depending on the type of service. A major business requirement in cloud computing is the ability to support extensive scalability at an acceptable and competitive cost. Most cloud computing platforms are expected to support thousands of servers with reasonable performance [11], which is achieved by using peer-to-peer, parallel, and distributed techniques. Such facilities are highly scalable and capable of handling huge numbers of simultaneous user requests using queueing-based models [16]. The large financial and environmental costs associated with the high electricity consumption of running a data center are a major consideration for these cloud providers [6], so operating a cloud efficiently and in an environmentally friendly manner is an increasingly challenging problem. This paper studies the optimal decisions of a provider of cloud computing services, who needs to decide whether to pay for the cloud facility (switch the service on) or not.
Much research on resource allocation and stochastic control of cloud centers models cloud computing as a multi-server queueing system [12, 14]. However, a cloud center can have a huge number of server nodes, typically of the order of hundreds or thousands [6, 1]. Since n is large, a traditional M/M/n queueing model is not suitable in this case; thus, this paper studies an M/M/∞ queue.

This research was partially supported by NSF grant CMMI-0928490.
Copyright is held by author/owner(s).
In addition to cloud computing, another motivation comes from IT software maintenance. In [13], the software maintenance problem was studied as a queue formed by maintenance requests corresponding to software bugs experienced by customers. Once a customer is served and the underlying bug is fixed in a new software release or patch, the fix also serves other customers in the queue affected by the same bug. It was assumed in [13] that the number of customers leaving the queue at a service completion time has a binomial distribution, and the problem was modeled as an optimal switching problem for an M/G/1 queue in which a binomially distributed number of customers is served each time; the best policy was found among the policies that turn the system off when it is empty and turn it on when there are N or more customers in the system. Here we observe that, after an appropriate scaling, this problem and the optimal switching problem for an M/M/∞ queue have the same fluid approximations. Therefore, the results on the optimality of (M, N)-policies described below provide certain insights into the software maintenance problem in [13].
In this paper, we consider an M/M/∞ queue with Poisson arrivals and independent exponential service times. The number of servers is unlimited. The system can be switched on and off at any time. All occupied servers operate when the system is on, and all servers are off when the system is off. The costs include the linear holding cost h per unit time that a customer spends in the system, the start-up cost s1, the shut-down cost s0, and the running cost c per unit time. We assume that h > 0 and s0 + s1 > 0.

To simplify the initial analysis, we assume that the servers can be turned on and off only at time 0, customer arrival times, and customer departure times. These times are jump epochs for the process X(t), the number of customers in the system at time t. Let t0, t1, . . . be the sequence of jump epochs. The assumption that switching takes place only at jump epochs is not restrictive, and the optimal policies described in this paper remain optimal when the system can be turned on and off at any time.
The main result of this paper is that either the policy that always keeps the system on is average-optimal, or the so-called (M, N)-policy is average-optimal for some integers M and N with N > M ≥ 0. The (M, N)-policy switches the running system off when the number of customers in the system drops to M or below, and it turns the idle system on when the number of customers in the queue reaches or exceeds N. In particular, (0, N)-policies are known in the literature as N-policies. It is well known [8] that for an M/G/1 queue either the policy that always runs the server is average-optimal or the N-policy is average-optimal for some natural number N. We will show that N-policies may not be average-optimal for M/M/∞ queues.
We model the problem as a continuous-time Markov decision process (CTMDP) in Section 2. The transition rates in this CTMDP are unbounded, which is an additional complication. CTMDPs with unbounded transition rates were studied in [7]; however, it was assumed there that every stationary policy defines an ergodic continuous-time Markov chain. This condition does not hold for the problem we consider, because the policy that always keeps the system off defines a transient Markov chain. First, we study the expected total discounted costs in Section 3.1. Then, in Section 3.2, we investigate the discrete-time total-cost problem restricted to the policies that never turn the running system off. Section 3.3 investigates the reduction of the original model to a finite state space model and proves the existence of average-optimal policies. In Section 4 we prove the existence of stationary average-optimal policies and describe their structure. Section 5 provides conditions for the average-optimality of full-service policies and (M, N)-policies. Section 6 presents a linear programming method for computing average-optimal policies.
2. PROBLEM FORMULATION
We model the problem described above as a continuous-time Markov decision process (CTMDP). The state space is Z = N × {0, 1}, where N = {0, 1, . . .}. If the state of the system at decision epoch n is zn = (Xn, δn) ∈ Z, then the number of customers in the system is Xn and the status of the system is δn, with δn = 1 if the system is on and δn = 0 if the system is off. The action set is A = {0, 1}, where a = 0 means turning or keeping the system off, and a = 1 means turning or keeping the system on.
The transition rate from state z = (i, δ) with action a to state z′ = (j, a) is q(z′|z, a) = q(j|i, a), where

    q(j|i, a) = { λ,   if j = i + 1,
                  iµ,  if i > 0, a = 1, j = i − 1,      (1)
                  0,   otherwise.

At state z = (i, δ), define q(z, a) = q(i, a) = Σ_{z′∈Z} q(z′|z, a) = Σ_{j∈N} q(j|i, a), where z′ = (j, a). Now we define the cost function.
The cumulative cost C(t) over the interval [0, t] is

    C(t) = ∫_0^t (hX(u) + cδ(u)) du + Σ_{n=0}^{N(t)} s_{δ(tn)} |δ(tn) − δ(tn−)|,

where X(t) is the system size, δ(t) is the system status at time t, and N(t) is the number of jump epochs up to time t.
A stationary policy is defined by a mapping π : Z → A such that π(z) ∈ A for all z ∈ Z. For each initial state z0 = (i, δ) and each policy π, the policy defines a stochastic sequence {zn, an, tn, n = 0, 1, . . .}, where t0 = 0 and tn+1 ≥ tn. We denote by E^π_{z0} the expectation for this process. For any initial state z0 = (i, δ) and any policy π, the expected total discounted cost over the infinite horizon is
    V_α^π(i, δ) = E^π_{(i,δ)} ∫_0^∞ e^{−αt} dC(t)
                = E^π_{(i,δ)} [ ∫_0^∞ e^{−αt} (hX(t) + cδ(t)) dt + Σ_{n=0}^∞ e^{−αtn} |an − δn| s_{an} ].    (2)
Define the average cost per unit time as

    v^π(i, δ) = lim sup_{t→∞} t^{−1} E^π_{(i,δ)} C(t).

Let V_α(i, δ) = inf_π V_α^π(i, δ) and v(i, δ) = inf_π v^π(i, δ). A policy φ is called discount-optimal if V_α^φ(i, δ) = V_α(i, δ) for every initial state (i, δ), and it is called average-optimal if v^φ(i, δ) = v(i, δ) for every initial state (i, δ).
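The average-cost criterion is easy to probe numerically. The following minimal simulation sketch (ours, not part of the paper's analysis; all parameter values are illustrative assumptions) estimates v^π for an (M, N)-policy by simulating the controlled jump process:

```python
import random

def simulate_MN(M, N, lam=5.0, mu=1.0, h=1.0, c=4.0, s0=1.0, s1=1.0,
                horizon=10**5, seed=0):
    """Estimate the average cost per unit time of the (M, N)-policy by
    simulating the controlled M/M/infinity jump process."""
    rng = random.Random(seed)
    t, i, delta, cost = 0.0, 0, 0, 0.0
    while t < horizon:
        # Apply the (M, N)-policy at the current jump epoch.
        if delta == 1 and i <= M:
            delta, cost = 0, cost + s0       # shut the system down
        elif delta == 0 and i >= N:
            delta, cost = 1, cost + s1       # start the system up
        rate = lam + i * mu if delta == 1 else lam
        dt = rng.expovariate(rate)
        cost += (h * i + c * delta) * dt     # holding and running costs
        t += dt
        if delta == 1 and rng.random() < i * mu / rate:
            i -= 1                           # departure
        else:
            i += 1                           # arrival
    return cost / t

print(simulate_MN(M=1, N=4))
```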
3. DISCOUNTED COST CRITERION
3.1 Reduction to Discrete Time and Existence of Stationary Discount-Optimal Policies
At state (i, 1), the time until the next jump has an exponential distribution with intensity λ + iµ → ∞ as i → ∞. Since the jump rates are unbounded, it is impossible to present the problem as a discounted MDP with a discount factor smaller than 1. Thus, we present our problem as a negative MDP. By Theorem 5.6 in [4], the expected transition time until the next jump epoch is τ_α(z, a) = τ_α((i, δ), a) = 1/(α + q(z, a)), and the one-step cost is C_α((i, δ), a) = |a − δ| s_a + (hi + ac) τ_α((i, δ), a). We use C(z, a) and τ(z, a) to denote C_α(z, a) and τ_α(z, a), respectively, when α = 0. The value function V(z) satisfies the optimality equation

    V(z) = min_{a∈A(z)} { C_α(z, a) + Σ_{z′∈Z} [q(z′|z, a) / (α + q(z, a))] V(z′) },   z ∈ Z.    (3)
By [4], for an MDP with a countable state set Z, action sets A(z), transition rates q(z′|z, a), and nonnegative one-step costs C(z, a), a stationary policy φ is optimal if and only if for all z ∈ Z it satisfies

    V(z) = C_α(z, φ(z)) + Σ_{z′∈Z} [q(z′|z, φ(z)) / (α + q(z, φ(z)))] V(z′),

where V(z) is the infimum of the expected total costs starting from state z. We follow the conventions that q(−1|i, a) = 0, V_α(−1, δ) = 0, Σ_∅ = 0, and Π_∅ = 1.
Theorem 1. For any α > 0 the following statements hold:

(i) For i = 0, 1, . . .,

    V_α(i, δ) ≤ (1 − δ)s1 + hi/(µ + α) + hλ/(α(µ + α)) + c/α.
(ii) For all i = 0, 1, . . . and all δ = 0, 1, the value function V_α(i, δ) satisfies the optimality equation

    V_α(i, δ) = min_{a∈{0,1}} { C_α((i, δ), a) + [q(i − 1|i, a) / (α + q(i, a))] V_α(i − 1, a)
                                              + [q(i + 1|i, a) / (α + q(i, a))] V_α(i + 1, a) }.    (4)
(iii) There exists a stationary discount-optimal policy, and a stationary policy φ is discount-optimal if and only if for all i = 0, 1, . . . and all δ = 0, 1,

    V_α(i, δ) = C_α((i, δ), φ(i, δ)) + [q(i − 1|i, φ(i, δ)) / (α + q(i, φ(i, δ)))] V_α(i − 1, φ(i, δ))
                                     + [q(i + 1|i, φ(i, δ)) / (α + q(i, φ(i, δ)))] V_α(i + 1, φ(i, δ)).
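For intuition, the optimality equation (4) can be solved by successive approximations. The sketch below (ours, not from the paper) truncates the state space at an assumed bound imax; the truncation and parameter names are our illustration choices, and convergence can be slow when imax is large.

```python
def value_iteration(lam, mu, h, c, s0, s1, alpha, imax=200, iters=5000):
    """Successive approximations for the discounted optimality equation (4)
    on the truncated state space {0,...,imax} x {0,1}; imax should be well
    above the switching thresholds for the truncation to be harmless."""
    V = [[0.0, 0.0] for _ in range(imax + 1)]
    for _ in range(iters):
        W = [[0.0, 0.0] for _ in range(imax + 1)]
        for i in range(imax + 1):
            for d in (0, 1):
                best = float("inf")
                for a in (0, 1):
                    q = lam + i * mu if a == 1 else lam          # q(i, a)
                    cost = abs(a - d) * (s1 if a == 1 else s0) \
                        + (h * i + a * c) / (alpha + q)          # C_alpha((i,d),a)
                    up = V[i + 1][a] if i < imax else V[i][a]    # crude reflection at imax
                    down = V[i - 1][a] if i > 0 else 0.0
                    best = min(best, cost +
                               (lam * up + (i * mu if a == 1 else 0.0) * down)
                               / (alpha + q))
                W[i][d] = best
        V = W
    return V
```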
3.2 Full Service Policies

The class of policies that never turn the running server off is the class of all policies in the MDP with state space Z, action set A, and action sets A(i, 0) = A and A(i, 1) = {1}. This is a submodel of our original model. Let U_α(i, δ) be the infimum of the total discounted cost (2) over this class of policies.
Theorem 2. For any α > 0 the following statements hold:

(i) For all i = 0, 1, . . .,

    U_α(i, 1) = hi/(µ + α) + hλ/(α(µ + α)) + c/α.
(ii) For all i = 0, 1, . . ., U_α(i, 0) satisfies the optimality equation

    U_α(i, 0) = min { s1 + (hi + c)/(α + λ + iµ) + [λ/(α + λ + iµ)] U_α(i + 1, 1)
                         + [iµ/(α + λ + iµ)] U_α(i − 1, 1),
                      hi/(α + λ) + [λ/(α + λ)] U_α(i + 1, 0) }.    (5)
Definition 1. For an integer n ≥ 0, a policy is called n-full service if it never turns the running server off and turns the inactive server on if and only if there are n or more customers in the system. In particular, the 0-full service policy turns the server on at time 0, if it is off, and always keeps it on. A policy is called full service if it is n-full service for some n ≥ 0.
Theorem 3. A policy φ is discount-optimal within the class of policies that never turn off the server if and only if

    φ(i, 0) = { 1, if i > A(α),
                0, if i < A(α),

for all i = 0, 1, . . ., where A(α) = (µ + α)(c + αs1)/(hµ).
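In code, the threshold of Theorem 3 is a one-liner (the parameter names are ours):

```python
def A(alpha, mu, h, c, s1):
    """Threshold of Theorem 3: among policies that never turn the running
    server off, it is discount-optimal to turn the idle system on iff
    i > A(alpha); the theorem leaves phi(i, 0) unconstrained at i = A(alpha)."""
    return (mu + alpha) * (c + alpha * s1) / (h * mu)
```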
3.3 Reduction to Finite State Space and Existence of Average-Optimal Policies
Let Z1 = {i ≥ 0 : V_α^0(i, 1) ≤ V_α^1(i, 1)}, where V_α^a(z) denotes the total discounted cost of selecting action a at the initial state z and following a discount-optimal policy afterwards, and define

    M_α* = { max Z1,  if Z1 ≠ ∅,      (6)
             −1,      otherwise,

    N_α* = min{i > M_α* : V_α^1(i, 0) ≤ V_α^0(i, 0)}.    (7)
Theorem 4. For each α > 0, either the n_α-full service policy is discount-optimal for some n_α ≥ 0, in which case φ_α(i, 1) = 1 for all i and φ_α(i, 0) = 1 if and only if i ≥ n_α, or there exists a stationary discount-optimal policy φ_α with the following properties:

    φ_α(i, δ) = { 1, if i > M_α* and δ = 1;
                  1, if i = N_α* and δ = 0;
                  0, if i = M_α* and δ = 1;            (8)
                  0, if M_α* ≤ i < N_α* and δ = 0.
4. STRUCTURE OF AVERAGE-OPTIMAL POLICIES

Definition 2. For two nonnegative integers M and N with N > M, a stationary policy is called an (M, N)-policy if

    φ(i, δ) = { 1, if i > M and δ = 1;
                1, if i ≥ N and δ = 0;
                0, if i ≤ M and δ = 1;
                0, if i < N and δ = 0.
Theorem 5. There exists a stationary average-optimal policy and, depending on the model parameters, either the n-full service policy is average-optimal for every n = 0, 1, . . ., or an (M, N)-policy is average-optimal for some N > M ≥ 0 with N ≤ n*, where

    n* = ⌊c/h + 1⌋.    (9)

In addition, the optimal average-cost value v(i, δ) is the same for all initial states (i, δ); that is, v(i, δ) = v.
5. CONDITIONS FOR AVERAGE-OPTIMALITY OF n-FULL SERVICE POLICIES AND (M, N)-POLICIES

Theorem 6. The following statements hold for n* = ⌊c/h + 1⌋:

(i) when c < (λ/n*)(s0 + s1) + h(n* − 1)/2, any n-full service policy is average-optimal for n = 0, 1, . . .;

(ii) when c = (λ/n*)(s0 + s1) + h(n* − 1)/2, both the n-full service policies, n = 0, 1, . . ., and the (0, n*)-policy are average-optimal;

(iii) when c > (λ/n*)(s0 + s1) + h(n* − 1)/2, an (M, N)-policy is average-optimal for some N > M ≥ 0 with N ≤ n*.
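The trichotomy of Theorem 6 is directly computable. A small helper (ours; the parameter names are assumptions, and the exact-equality case (ii) is a knife-edge that rarely triggers in floating point):

```python
import math

def classify(lam, h, c, s0, s1):
    """Apply the trichotomy of Theorem 6 and return n* with the verdict."""
    n_star = math.floor(c / h + 1)
    threshold = (lam / n_star) * (s0 + s1) + h * (n_star - 1) / 2
    if c < threshold:
        return n_star, "every n-full service policy is average-optimal"
    if c == threshold:
        return n_star, "n-full service policies and the (0, n*)-policy are average-optimal"
    return n_star, "some (M, N)-policy with N <= n* is average-optimal"
```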
6. COMPUTATION OF AN AVERAGE-OPTIMAL POLICY

According to Theorem 6, there is an optimal policy φ with φ(i, δ) = 1 when i ≥ n* = ⌊c/h + 1⌋. Thus, the goal is to find the optimal values of φ(i, δ) for i = 0, 1, . . . , n* − 1 and δ = 0, 1. To do this, we truncate the state space Z to Z′ = {0, 1, . . . , n* − 1} × {0, 1}. If action 1 is selected at state (n* − 1, 1), the system moves to state (n* − 2, 1) if the next change of the number of customers in the system is a departure, and it returns to (n* − 1, 1) if an arrival takes place. In the latter case, the number of customers increases by one at the arrival time and then moves according to the random walk until it hits n* − 1 again. Thus the system can jump from state (n* − 1, 1) to itself, and therefore it cannot be described as a CTMDP. However, it can be described as a semi-Markov decision process (SMDP) [15, Chapter 5].
We describe our problem as an SMDP with state set Z′ and action sets A(z) = A = {0, 1}. If an action a is selected at state z ∈ Z′, the system spends an expected time τ′(z, a) in this state before moving to the next state z′ ∈ Z′ with probability p(z′|z, a), and during this time the expected cost C′(z, a) is incurred. In states z = (i, δ) with i = 0, 1, . . . , n* − 2 and δ = 0, 1, these characteristics are the same as for the original CTMDP:
    p(z′|z, a) = { 1,            if a = 0, z′ = (i + 1, 0),
                   λ/(λ + iµ),   if a = 1, z′ = (i + 1, 1),      (10)
                   iµ/(λ + iµ),  if a = 1, z′ = (i − 1, 1),
                   0,            otherwise,

    τ′((i, δ), a) = { 1/λ,         if a = 0,
                      1/(λ + iµ),  if a = 1,                      (11)
and C′((i, δ), a) = |a − δ| s_a + (hi + ac) τ′((i, δ), a). The transition probabilities in states (n* − 1, δ), δ = 0, 1, are defined by p((n* − 2, 1)|(n* − 1, δ), 1) = (n* − 1)µ/(λ + (n* − 1)µ), p((n* − 1, 1)|(n* − 1, δ), 1) = λ/(λ + (n* − 1)µ), and p((n* − 1, 1)|(n* − 1, δ), 0) = 1. In the latter case, the number of customers increases to n* at the next arrival, the system turns on, and the SMDP state becomes (n* − 1, 1) when the number of customers returns to n* − 1.
Let T_i be the expected time between an epoch when an arrival sees i customers in an M/M/∞ queue and the next epoch when a departure leaves i customers behind, i = 0, 1, . . .. By the memoryless property of the exponential distribution, T_i = B_{i+1} − B_i, where B_i is the expected busy period of the M/M/∞ system starting with i customers and B_0 = 0. By formula (34b) in [2],

    B_i = (1/λ) [ e^ρ − 1 + Σ_{k=1}^{i−1} (k!/ρ^k) ( e^ρ − Σ_{j=0}^{k} ρ^j/j! ) ],

where ρ = λ/µ. Thus

    T_{n*−1} = B_{n*} − B_{n*−1} = (1/λ) Σ_{k=0}^{∞} ρ^{k+1} / (n*(n* + 1) · · · (n* + k)).

The expected time τ′((n* − 1, δ), 1), δ = 0, 1, is the expected time until the next jump plus, if the next event is an arrival, the additional time T_{n*−1}. Thus,

    τ′((n* − 1, δ), 1) = 1/(λ + (n* − 1)µ) + [λ/(λ + (n* − 1)µ)] T_{n*−1}
                       = [λ/(λ + (n* − 1)µ)] (1/λ + T_{n*−1}),   δ = 0, 1.

In addition, τ′((n* − 1, δ), 0) = 1/λ + T_{n*−1}, δ = 0, 1.
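Both B_i and T_i are straightforward to evaluate numerically. Below is a sketch (ours), truncating the infinite series at a fixed number of terms; the identity T_i = B_{i+1} − B_i provides a consistency check between the two formulas.

```python
import math

def T(i, lam, mu, terms=300):
    """T_i via the series (1/lam) * sum_{k>=0} rho^{k+1} / ((i+1)(i+2)...(i+1+k))."""
    rho, total, prod = lam / mu, 0.0, 1.0
    for k in range(terms):
        prod *= i + 1 + k                    # running product (i+1)...(i+1+k)
        total += rho ** (k + 1) / prod
    return total / lam

def B(i, lam, mu):
    """Expected M/M/infinity busy period started by i customers,
    formula (34b) of Browne and Kella [2]; B_0 = 0."""
    if i == 0:
        return 0.0
    rho = lam / mu
    s = math.exp(rho) - 1.0
    for k in range(1, i):
        s += (math.factorial(k) / rho ** k) * (
            math.exp(rho) - sum(rho ** j / math.factorial(j) for j in range(k + 1)))
    return s / lam

# Consistency check: T_i should equal B_{i+1} - B_i up to truncation error.
assert abs(T(3, 5.0, 1.0) - (B(4, 5.0, 1.0) - B(3, 5.0, 1.0))) < 1e-9
```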
To compute the one-step cost C′((n* − 1, 1), 1), we define m_i as the expected number of visits to state (i, 1), starting from state (n* − 1, 1), before returning to (n* − 1, 1), i = n* − 1, n*, . . .. We also define m_{i,i+1} as the expected number of jumps from (i, 1) to (i + 1, 1), i = n* − 1, n*, . . ., and m_{i,i−1} as the expected number of jumps from (i, 1) to (i − 1, 1), i = n*, n* + 1, . . .. Then

    m_{i,i+1} = [λ/(λ + iµ)] m_i,   m_{i,i−1} = [iµ/(λ + iµ)] m_i,   and   m_{i,i+1} = m_{i+1,i}.

Since m_{n*−1} = 1,

    m_i = Π_{j=0}^{i−n*} [λ/(λ + (n* − 1 + j)µ)] · [(λ + (n* + j)µ)/((n* + j)µ)],   i = n*, n* + 1, . . ..

Thus

    C′((n* − 1, 1), 1) = Σ_{i=n*−1}^{∞} m_i C((i, 1), 1) = Σ_{i=n*−1}^{∞} m_i (hi + c)/(λ + iµ),

where C((i, 1), 1) = (hi + c)/(λ + iµ), i = n* − 1, n*, . . ., is the cost incurred in state (i, 1) under action 1 in the original state-space model; see Section 3.1. The one-step cost C′((n* − 1, 0), 1) = s1 + C′((n* − 1, 1), 1).

Let C_{n*} be the expected total cost incurred in the M/M/∞ queue until the number of customers becomes n* − 1, given that at time 0 there are n* customers in the system and the system is running. Then

    C′((n* − 1, 1), 1) = (h(n* − 1) + c)/(λ + (n* − 1)µ) + [λ/(λ + (n* − 1)µ)] C_{n*},

and this implies

    C_{n*} = (1 + (n* − 1)µ/λ) C′((n* − 1, 1), 1) − (h(n* − 1) + c)/λ.

We also have C′((n* − 1, 0), 0) = h(n* − 1)/λ + s1 + C_{n*} and C′((n* − 1, 1), 0) = s0 + C′((n* − 1, 0), 0).
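The series for C′((n* − 1, 1), 1) converges quickly because m_i decays roughly like ρ^i/i!. A truncated evaluation (our sketch, with assumed parameter names):

```python
def C_on(n_star, lam, mu, h, c, terms=500):
    """C'((n*-1,1),1) = sum_{i >= n*-1} m_i (h i + c)/(lam + i mu), truncated;
    m_i is the expected number of visits to (i,1) between returns to (n*-1,1)."""
    total, m, i = 0.0, 1.0, n_star - 1       # m_{n*-1} = 1
    for _ in range(terms):
        total += m * (h * i + c) / (lam + i * mu)
        # m_{i+1} = m_i * [lam/(lam + i mu)] * [(lam + (i+1) mu)/((i+1) mu)]
        m *= lam / (lam + i * mu) * (lam + (i + 1) * mu) / ((i + 1) * mu)
        i += 1
    return total
```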
We formulate the LP according to Section 5.5 in [15]:

    Minimize   Σ_{z∈Z′} Σ_{a∈A} C′(z, a) x_{z,a}

    subject to

    Σ_{a∈A(z)} x_{z,a} − Σ_{z′∈Z′} Σ_{a∈A(z′)} p(z|z′, a) x_{z′,a} = 0,   z ∈ Z′,
                                                                              (12)
    Σ_{z∈Z′} Σ_{a∈A(z)} τ′(z, a) x_{z,a} = 1,   x_{z,a} ≥ 0,   z ∈ Z′, a ∈ A.
Let x* be an optimal basic solution of (12). According to general results on SMDPs [3, Section III], for each z ∈ Z′ there exists at most one action a = 0, 1 such that x*_{z,a} > 0. If x*_{z,a} > 0, then the optimal policy φ satisfies φ(z) = a, a = 0, 1. If x*_{z,0} = x*_{z,1} = 0, then φ(z) can be either 0 or 1. Theorem 7 explains how to use x* to construct an average-optimal policy φ with the properties described in Theorem 6.
Theorem 7. For the optimal basic solution x* of (12) and z = (i, δ) ∈ Z′, i = 0, 1, . . . , n* − 1, δ = 0, 1:

(i) If x*_{(0,1),1} > 0, then any n-full service policy is average-optimal, n = 0, 1, . . ..

(ii) Let Z1′ = {i = 1, . . . , n* − 1 : x*_{(i,0),1} > 0}. If x*_{(0,1),0} > 0, then the (0, N)-policy is average-optimal with N defined as

    N = { n*,       if Z1′ = ∅;       (13)
          min Z1′,  if Z1′ ≠ ∅.

(iii) If x*_{(0,1),0} = x*_{(0,1),1} = 0, then the (M, N)-policy is average-optimal with M = min{i = 1, . . . , n* − 1 : x*_{(i,1),0} > 0} > 0 and N the same as in (13).
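Putting Section 6 together: the sketch below (ours; parameter values are illustrative and the series truncations follow the snippets above) assembles the SMDP data (10)-(11) with the boundary quantities, solves the LP (12) with scipy.optimize.linprog, and reads a stationary policy off the solution as in Theorem 7. A simplex-type solver returns a basic optimal solution, so at most one action per state carries positive mass.

```python
import numpy as np
from scipy.optimize import linprog

def build_smdp(lam, mu, h, c, s0, s1, terms=500):
    """SMDP of Section 6 on Z' = {0,...,n*-1} x {0,1}: transition
    probabilities (10), expected times (11), and one-step costs C',
    including the special boundary states (n*-1, .)."""
    n = int(c // h + 1)                          # n* = floor(c/h + 1)
    rho = lam / mu
    Tn, prod = 0.0, 1.0                          # T_{n*-1} via its series
    for k in range(terms):
        prod *= n + k
        Tn += rho ** (k + 1) / prod
    Tn /= lam
    Con, m, i = 0.0, 1.0, n - 1                  # C'((n*-1,1),1), truncated series
    for _ in range(terms):
        Con += m * (h * i + c) / (lam + i * mu)
        m *= lam / (lam + i * mu) * (lam + (i + 1) * mu) / ((i + 1) * mu)
        i += 1
    Cn = (1.0 + (n - 1) * mu / lam) * Con - (h * (n - 1) + c) / lam
    states = [(i, d) for i in range(n) for d in (0, 1)]
    p, tau, cost = {}, {}, {}
    for (i, d) in states:
        if i < n - 1:                            # interior states: (10), (11)
            p[(i, d), 0] = {(i + 1, 0): 1.0}
            tau[(i, d), 0] = 1.0 / lam
            cost[(i, d), 0] = d * s0 + h * i / lam
            r = lam + i * mu
            p[(i, d), 1] = {(i + 1, 1): lam / r}
            if i > 0:
                p[(i, d), 1][(i - 1, 1)] = i * mu / r
            tau[(i, d), 1] = 1.0 / r
            cost[(i, d), 1] = (1 - d) * s1 + (h * i + c) / r
        else:                                    # boundary state i = n* - 1
            p[(i, d), 0] = {(n - 1, 1): 1.0}
            tau[(i, d), 0] = 1.0 / lam + Tn
            cost[(i, d), 0] = d * s0 + h * (n - 1) / lam + s1 + Cn
            r = lam + (n - 1) * mu
            p[(i, d), 1] = {(n - 1, 1): lam / r}
            if n > 1:
                p[(i, d), 1][(n - 2, 1)] = (n - 1) * mu / r
            tau[(i, d), 1] = (lam / r) * (1.0 / lam + Tn)
            cost[(i, d), 1] = (1 - d) * s1 + Con
    return states, p, tau, cost

def solve_lp(states, p, tau, cost):
    """Solve the LP (12); return the optimal average cost and a policy."""
    keys = [(z, a) for z in states for a in (0, 1)]
    A = np.zeros((len(states) + 1, len(keys)))
    b = np.zeros(len(states) + 1)
    for row, z in enumerate(states):             # balance equations
        for col, (z2, a) in enumerate(keys):
            A[row, col] = (1.0 if z2 == z else 0.0) - p[z2, a].get(z, 0.0)
    A[-1] = [tau[k] for k in keys]               # normalization constraint
    b[-1] = 1.0
    res = linprog([cost[k] for k in keys], A_eq=A, b_eq=b, bounds=(0, None))
    phi = {z: int(res.x[keys.index((z, 1))] > 1e-9) for z in states}
    return res.fun, phi

v, phi = solve_lp(*build_smdp(lam=5.0, mu=1.0, h=1.0, c=4.0, s0=1.0, s1=1.0))
print(v, phi)
```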
7. REFERENCES
[1] Amazon Elastic Compute Cloud, User Guide.
http://aiweb.techfak.uni-bielefeld.de/content/
bworld-robot-control-software/, December 2012.
Online; accessed 03/01/2013.
[2] S. Browne and O. Kella. Parallel service with vacations.
Operations Research, 43(5):870–878, 1995.
[3] E. Denardo. On linear programming in a Markov Decision
Problem. Management Science, 16(5):281–288, 1970.
[4] E. Feinberg. Reduction of discounted continuous-time MDPs with unbounded jump and reward rates to discrete-time total-reward MDPs. In Optimization, Control, and Applications of Stochastic Systems, D. Hernández-Hernández and J. A. Minjárez-Sosa (Eds.), pages 77–97. Birkhäuser, New York, 2012.
[5] Google Compute Engine. https://cloud.google.com/products/compute-engine, 2013.
[6] A. Greenberg, J. Hamilton, D. Maltz, and P. Patel. The cost
of a cloud: research problems in data center networks. ACM
SIGCOMM Computer Communication Review, 39(1):68–73,
2008.
[7] X. Guo and W. Zhu. Denumerable-state continuous-time
Markov Decision Processes with unbounded transition and
reward rates under the discounted criterion. Journal of
Applied Probability, 39(2):233–250, 2002.
[8] D. Heyman. Optimal operating policies for M/G/1 queuing
systems. Operations Research, 16(2):362–382, 1968.
[9] HP Cloud Compute Overview.
https://docs.hpcloud.com/compute, 2013.
[10] IBM SmartCloud. www.ibm.com/cloud, 2012. Online; accessed 03/01/2013.
[11] IBM SmartCloud Provisioning, Version 2.1. http://pic.dhe.ibm.com/infocenter/tivihelp/v48r1/topic/com.ibm.scp.doc_2.1.0/Raintoc.pdf, 2009.
[12] H. Khazaei, J. Misic, and V. Misic. Performance analysis of cloud computing centers using M/G/m/m+r queuing systems. IEEE Transactions on Parallel and Distributed Systems, 23(5):936–943, 2012.
[13] V. Kulkarni, S. Kumar, V. Mookerjee, and S. Sethi. Optimal
allocation of effort to software maintenance: A queuing
theory approach. Production and Operations Management,
18(5):506–515, 2009.
[14] M. Mazzucco, D. Dyachuk, and R. Deters. Maximizing cloud providers' revenues via energy aware allocation policies. In 2010 IEEE 3rd International Conference on Cloud Computing (CLOUD), pages 131–138, 2010.
[15] H. Mine and S. Osaki. Markovian Decision Processes,
volume 25. Elsevier Publishing Company, New York, 1970.
[16] Introducing the Windows Azure Platform. David Chappell & Associates White Paper, 2010.