Rastegardoost 2018
however, they fail to reflect the disturbance and the resulting latency imposed on the WiFi activity due to LTE-U interruption in the ABS regime.

On the other hand, inefficient utilization of the spectrum has resulted in extensive vacant portions known as white space, containing nothing but white noise. White spaces in the unlicensed spectrum, caused by random activities of the incumbent technologies, provide considerable spectral opportunities for emerging technologies such as LTE-U, at no cost. Statistical characterization of WiFi white spaces is carried out in [9] using a Markov modulated batch Poisson process (MMBPP) framework, where the authors demonstrate the abundance and duration of idle opportunities in the WiFi channel. The idea of exploiting WiFi white spaces for opportunistic LTE-U access is introduced in [10] and further investigated in [11], where an MMBPP-based opportunistic coexistence mechanism is proposed for LTE-U that enables the base station (BS) to dynamically predict the duration of WiFi white spaces upon sensing one, and schedule transmissions accordingly. This approach is shown to achieve comparable throughput to ABS, while significantly reducing the latency imposed on WiFi activity.

However, the MMBPP-based approach for LTE-U coexistence is difficult to implement in practice, due to the parameter estimation and computational complexities related to MMBPP calculations that have to be carried out repeatedly upon every white space occurrence. Moreover, the LTE-U probabilistic operation is subject to error, and thus can alter the WiFi system statistics, which is not captured in the model. To address the above issues, in this paper we propose a practically simpler method for the problem of opportunistic coexistence of LTE-U and WiFi. The proposed approach is based on reinforcement learning, particularly Q-Learning, which provides a robust and model-free decision-making framework that enables online and distributed coexistence of LTE-U small cells with WiFi. Q-Learning (QL) is employed in [12] for a distributed channel selection strategy for LTE-U that dynamically identifies the utilization condition of each channel towards optimum selection. In [6], [13] Q-Learning is used for dynamic ABS duty cycle selection according to the coexisting WiFi network traffic load. However, these works ignore the interference and latency imposed on WiFi activity, which is addressed in our proposed framework by employing carrier sensing at the LTE-U BS.

The rest of the paper is organized as follows. Section II provides an overview of the system structure. The Q-Learning-based opportunistic coexistence scheme is proposed in Section III. Section IV provides simulation results. Finally, Section V concludes the paper.

II. SYSTEM OVERVIEW

A. Deployment Scenario

We focus on a co-channel coexistence scenario, where two networks operate in a single unlicensed frequency channel: a legacy WiFi network (with a single or multiple access points (APs) and N user stations (STAs)), and a co-located LTE-U small cell (with a single BS). Downlink (DL) and Uplink (UL) traffic is assumed for the WiFi network, and the utilization of the entire WiFi network, or its traffic intensity, is denoted by ρ ∈ (0, 1).

A distributed CSMA-CA protocol is implemented among WiFi nodes for channel access coordination based on a physical layer Clear Channel Assessment (CCA) and a random back-off mechanism. In particular, if a WiFi receiver detects any WiFi signal with a decodable preamble that exceeds the Carrier Sensing (CS) threshold, or any other signal exceeding the Energy Detection (ED) threshold, it will hold a busy CCA flag until the end of that transmission. As a result, WiFi nodes can detect LTE-U transmissions using the ED feature.

We assume that the WiFi APs and STAs, as well as the co-located LTE-U BS and the associated User Equipments (UEs), are distributed in the same geographical area such that they are all within the CCA range of each other, i.e., their simultaneous transmission results in severe interference, and no two nodes¹ can transmit at the same time. This scenario may apply to dense network situations, where no other idle channels are available and the two co-located operators have to share a single channel. This assumption also helps with abstracting the physical layer, which can later be improved using stochastic geometry methods to account for spatial distributions and randomness in the received SINR. Nevertheless, here we relax the spatial dimension to emphasize temporal design issues from a MAC layer perspective.

¹ The term "node" is used to refer to both service providers (AP/BS) and users (STA/UE).

Here, the LTE-U supplemental downlink (SDL) mode is considered, where the excess DL traffic is offloaded to the unlicensed band, and the UL and control traffic remain in the licensed bands; thus, the BS is assumed to have full buffer DL traffic. In accessing the shared unlicensed channel, the LTE-U BS also implements ED to sense WiFi transmissions and avoid interrupting them. Once access is granted, the BS centrally schedules and allocates available resources among LTE UEs in blocks of 1 ms (equal to one LTE subframe duration).

B. LTE-U Duty Cycle

As illustrated in Fig. 2a, real-life WiFi traffic is highly bursty, leaving many white spaces between IEEE 802.11 frames [14]. White space refers to the idle periods in the WiFi channel that correspond to no transmissions, excluding the short silent gaps related to the back-off times and WiFi protocol-related space intervals. We define a variable and adaptive duty cycle for LTE-U based on temporal characteristics of WiFi traffic. As illustrated in Fig. 2b, which is an enlarged view of Fig. 2a, the LTE-U duty cycle consists of an OFF period, corresponding to WiFi frame transmission or busy time, followed by an ON period, which relates to WiFi white space or idle time. Idle and busy times in WiFi traffic naturally alternate, even though the busy times may consist of consecutive frame transmissions (irrespective of the transmitter node).
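The OFF/ON decomposition described above can be illustrated with a minimal sketch. It assumes the WiFi trace is available as a list of (start, end) frame intervals in milliseconds, and that gaps shorter than a threshold `min_gap_ms` (a hypothetical parameter standing in for back-off and protocol-related spacing) are counted as part of the busy time. The function names are illustrative and are not taken from the paper.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_ms, end_ms) of one WiFi frame

def merge_busy_periods(frames: List[Interval], min_gap_ms: float) -> List[Interval]:
    """Merge frames separated by gaps shorter than min_gap_ms into busy periods."""
    busy: List[Interval] = []
    for start, end in sorted(frames):
        if busy and start - busy[-1][1] < min_gap_ms:
            # Short silent gap (assumed back-off or inter-frame space): same busy period.
            busy[-1] = (busy[-1][0], max(busy[-1][1], end))
        else:
            busy.append((start, end))
    return busy

def duty_cycles(frames: List[Interval], min_gap_ms: float) -> List[Tuple[float, float]]:
    """Return (off_ms, on_ms) pairs: a WiFi busy period followed by its white space."""
    busy = merge_busy_periods(frames, min_gap_ms)
    cycles = []
    for (b_start, b_end), (next_start, _) in zip(busy, busy[1:]):
        cycles.append((b_end - b_start, next_start - b_end))
    return cycles

if __name__ == "__main__":
    trace = [(0.0, 1.0), (1.05, 2.0), (5.0, 6.0), (9.0, 9.5)]
    print(duty_cycles(trace, min_gap_ms=0.1))
```

In this toy trace the first two frames are merged into one busy period, so each returned pair is one variable duty cycle: the LTE-U OFF duration followed by the ON duration available in the subsequent white space.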
2018 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN)
Fig. 2: (a) Traffic trace of a real-life WiFi network [14]. (b) Closer look: WiFi cycle (variable LTE-U duty cycle), WiFi frame transmission period (LTE-U OFF), and white space (LTE-U ON).

We assume the LTE-U BS constantly performs carrier sensing to avoid interrupting WiFi transmissions in the common channel while fully utilizing the idle resources. When the channel is idle, the BS knows that there is a white space, and starts the LTE-U ON period. Since our goal here is to minimize the disturbance introduced to WiFi activity, the BS has to stop transmission by the end of the white space, i.e., when a packet arrives at the WiFi network. However, due to the back-off mechanism of CSMA-CA, the WiFi nodes will not start transmitting as long as they sense the LTE signal. Consequently, LTE-U will continue transmitting, thus delaying WiFi traffic further and further.

To avoid this situation, in this paper we use Q-Learning to provide the LTE-U BS with the ability to estimate the duration of an ongoing white space at each duty cycle, so as to vacate the channel for the upcoming WiFi frame transmission, without requiring detailed knowledge of the WiFi system parameters and statistics.

III. Q-LEARNING-BASED OPPORTUNISTIC COEXISTENCE MECHANISM FOR LTE-U

Consider an LTE-U small cell sharing an unlicensed channel with a WiFi network in the time domain. As discussed previously, the WiFi network generates random and bursty traffic in uplink and downlink directions, resulting in abundant white spaces between WiFi frames. Here, we formulate a learning-based decision-making approach to the opportunistic coexistence problem of LTE-U with WiFi, where the LTE-U BSs are reinforcement learning agents that autonomously schedule transmissions while adapting their performance to the coexisting WiFi network activity.

In the co-channel coexistence scenario, the LTE-U BS performs carrier sensing to detect any WiFi activity in the common unlicensed channel. As long as the channel is occupied by WiFi transmissions, LTE-U remains in the OFF period and continuously listens to the channel until it becomes idle, or equivalently, a WiFi white space is initiated. Once a white space is sensed, the BS starts the ON period and transmits for $a \times t_{Sf}$ ms, where $a \in A = \{a_1, \ldots, a_M\}$ is the integer

A. Q-Learning Algorithm

Q-learning is an online model-free reinforcement learning technique, where the learning agent improves its behavior through interaction with the environment, and by estimating the values of state-action pairs based on experience. In each state, the agent is able to compare the expected utility of the available actions and choose the best among them. The Q-value, Q(s, a), is defined as the expected discounted sum of future payoffs achieved by taking action a in state s and following the optimal policy that minimizes the cumulative cost function, that is, Q : S × A → R.

We consider the Q-learning algorithm with an ε-greedy policy, where the action at each step is selected as follows. A uniform random number r ∼ U(0, 1) is generated first and compared against the parameter ε ∈ (0, 1). If r is smaller than ε, an action is selected randomly. Otherwise, the optimal action with the lowest Q-value is selected, i.e.,

$$a_t = \arg\min_{a \in A} Q(s, a). \tag{1}$$

The ε-greedy parameter allows for exploration of the consequences of all actions. In fact, it ensures that all the state-action pairs are explored with some non-zero probability, ε, so that all possibilities are discovered.

Once the action is taken, and upon observing a new state, $s_{t+1}$, and collecting the immediate payoff, $c_t$, that results from action $a_t$, the agent updates the Q-values as follows:

$$Q(s_t, a_t) \leftarrow (1 - \alpha)\, Q(s_t, a_t) + \alpha \left[ c_t + \gamma \min_{a} Q(s_{t+1}, a) \right]. \tag{2}$$

Here α ∈ [0, 1] is the learning rate, which determines how fast learning occurs. If α is too small, it will take a long time to complete the learning process, whereas if it is too big, the algorithm might not converge. Also, γ ∈ [0, 1] is the discount factor that controls the value placed on future rewards. If γ is too small, learning will not depend as much on future rewards, and immediate rewards are optimized instead. On the contrary, if it is too big, learning will heavily depend on future rewards. It is worth mentioning that the above procedure involves simple computations, which helps with efficient energy consumption.

B. QL-Based Opportunistic LTE-U Algorithm

Consider the LTE-U BS as the agent that applies carrier sensing to detect and opportunistically utilize white spaces in the WiFi
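As a minimal sketch of the ε-greedy selection in (1) and the cost-minimizing update in (2) from Section III-A, the following assumes that states and actions are encoded as integer indices and that the Q-table is a plain list of lists. The function names and the table layout are illustrative assumptions, not the authors' implementation.

```python
import random
from typing import List

QTable = List[List[float]]  # q[state][action], assumed integer-indexed

def select_action(q_row: List[float], epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy choice: explore with probability epsilon, else take the lowest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # random exploratory action
    return min(range(len(q_row)), key=q_row.__getitem__)  # arg min over actions, as in (1)

def update_q(q: QTable, s: int, a: int, cost: float, s_next: int,
             alpha: float, gamma: float) -> None:
    """Apply update (2): blend the old estimate with cost plus discounted best next value."""
    target = cost + gamma * min(q[s_next])
    q[s][a] = (1.0 - alpha) * q[s][a] + alpha * target

if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.0, 0.0], [1.0, 2.0]]
    a = select_action(q[0], epsilon=0.1, rng=rng)
    update_q(q, s=0, a=a, cost=1.0, s_next=1, alpha=0.5, gamma=0.9)
    print(a, q[0])
```

Because the formulation minimizes cost, both the greedy choice and the bootstrap term use `min` rather than the `max` found in reward-maximizing Q-learning.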
time instant and size of the latest WiFi packet frame arrival,

a white space initiation, the BS uses the following equation based on the MMBPP model to estimate the WiFi frame inter-arrival time, $\hat{T}^{a}_{n}$, for $a = K_{n-1}$,

$$\hat{T}^{a}_{n} = -\nu_{a} G_{aa}^{-1} \mathbf{1}. \tag{5}$$

This is the conditional expectation of a phase type distribution with r phases and parameters $(\nu_{a}, G_{aa})$, where r is the number of underlying states in the MMBPP arrival process that represents the number of WiFi network traffic load levels,

behavior are provided to or learned by the BS. Specifically, the BS may employ maximum likelihood parameter estimation methods, such as the EM algorithm, to estimate the statistics of the MMBPP process. Next, the BS predicts the time of the next WiFi frame arrival, $\hat{T}_{n}$, as follows,

$$\hat{T}_{n} = T_{n-1} + \hat{T}^{a}_{n}, \tag{6}$$

and estimates the white space duration by calculating the remainder of the time until $\hat{T}_{n}$. Consequently, the BS is able to schedule transmissions (LTE-U ON) for the estimated duration of the WiFi white space and stop transmissions (LTE-U OFF) by the end of this period, such that the channel is available for the upcoming WiFi frame arrival. For details of the MMBPP arrival process and the associated opportunistic algorithm, refer to [9], [11].

A. Simulations

We consider a WiFi network with a single AP and N = 5 STAs with DL traffic that is generated according to an MMBPP arrival process with r = 2 states, indicating low and high traffic load levels. Each WiFi frame arrival is assumed to consist of up to 64 IEEE 802.11ac packets according to a state-dependent normal probability distribution. The state-dependent arrival rates are varied to obtain different traffic intensities (WiFi utilization), ρ. The coexisting LTE-U BS employs the Q-Learning-based algorithm to opportunistically exploit the white spaces in the WiFi channel and schedule transmissions such that the channel will most likely be vacated for upcoming WiFi transmissions.

For the Q-Learning algorithm, we set α = 0.5, γ = 0.9, and ε = 0.1. We consider the set of possible actions, A = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, indicating the number of subframes in an LTE frame that the BS transmits (LTE-U ON period), and the set of states as follows,

$$s_t = \begin{cases} 1, & c_t < 0.5 \\ 2, & 0.5 \le c_t < 1 \\ 3, & 1 \le c_t < 2 \\ 4, & 2 \le c_t < 3 \\ 5, & c_t \ge 3, \end{cases} \tag{7}$$

where $c_t$ is the immediate cost function, given in equation (3).

Fig. 3: Maximum achievable LTE-U throughput.

Fig. 4: WiFi latency.

Fig. 3 demonstrates LTE-U throughput vs. the coexisting WiFi network traffic intensity in terms of the maximum percentage of total resources that LTE-U can exploit, for the four aforementioned mechanisms. As expected, LTE-U throughput generally decreases as ρ increases, mainly because with higher WiFi traffic the duration of white spaces drops. We can see that ABS-plus-sensing achieves higher throughput compared to ABS, mainly because of the plenitude of exploitable white spaces in WiFi traffic that it can identify using carrier sensing. Also, it achieves higher throughput compared to the MMBPP-based approach because it is less conservative and does not vacate the channel dynamically based on WiFi prediction. It is also observed that the LTE-U throughput with the MMBPP-based opportunistic mechanism is comparable to that with ABS, which is due to the efficient exploitation of white spaces. We can see that under the proposed QL-based algorithm, LTE-U is generally able to achieve a relatively high throughput, especially when the traffic load of the WiFi network is higher.

The latency imposed on the coexisting WiFi network is shown in Fig. 4. As ρ increases, LTE-U is off for longer periods in all four mechanisms, and thus is less likely to interrupt WiFi transmissions. It is evident that ABS severely disturbs WiFi traffic and results in significant latency. However, the latency with the other mechanisms is considerably
less, mainly because they employ spectrum sensing to avoid interrupting WiFi transmissions. Also, notice that the latency imposed by the MMBPP-based mechanism is always the minimum. WiFi latency is equal to 0.58 ms, which is roughly ten times lower than that of ABS (5.5 ms), with moderate WiFi traffic (ρ = 0.5). The reason is that by accurately estimating the duration of each white space, the LTE-U BS is very likely to vacate the spectrum for upcoming WiFi transmissions in a timely manner. It is observed that under the proposed QL-based algorithm, the latency imposed on WiFi is comparably low for a lightly loaded WiFi network, and it rises with increased WiFi traffic load, yet it still outperforms the ABS mechanism.

Fig. 5: Power measure; ratio of the LTE-U achievable throughput to the latency imposed on the coexisting WiFi network.

Fig. 5 shows the power measure, i.e., the ratio of the LTE-U achievable throughput to the latency imposed on the coexisting WiFi network. This measure expresses the overall coexistence performance under different schemes, which we aim to maximize. It is observed that the MMBPP-based scheme outperforms the three other mechanisms under both scenarios. This is achieved at the cost of extensively complex computations. However, we can see that in a lightly loaded WiFi network, the QL-based approach provides similar performance to that of the MMBPP-based approach, while no information about the WiFi network statistics is necessary and the computational complexities are avoided at the same time.

Note that at ρ = 0.5, higher throughput is achieved under the QL-based approach compared to the MMBPP-based mechanism at the price of increased latency. Yet, the power measure performance is considerably superior to that of the ABS scheme. It is worth mentioning that alternative definitions of the states and the cost function would result in different performances. They also provide a means to control the trade-off between LTE-U utilization and WiFi latency.

V. CONCLUSION

A Q-Learning-based approach was proposed in this paper for a model-free decision-making implementation of opportunistic coexistence of LTE-U with WiFi, which enabled the LTE-U BS to dynamically identify and further exploit white spaces in the WiFi channel, without requiring detailed knowledge of the WiFi system. By adaptively adjusting the LTE-U duty cycle to WiFi activity, the proposed algorithm enabled maximal utilization of idle resources for LTE-U transmissions,

lightly loaded scenarios, even though extensive computations and parameter estimation of the Markovian solution were avoided. The proposed approach also provided a means to control the trade-off between LTE-U utilization and WiFi latency in the coexisting networks.

REFERENCES

[1] H. Zhang, X. Chu, W. Guo, and S. Wang, "Coexistence of WiFi and heterogeneous small cell networks sharing unlicensed spectrum," IEEE Communications Magazine, vol. 53, no. 3, pp. 158–164, March 2015.
[2] J. Li, X. Wang, D. Feng, M. Sheng, and T. Q. S. Quek, "Share in the commons: Coexistence between LTE unlicensed and WiFi," IEEE Wireless Communications, vol. 23, no. 6, pp. 16–23, December 2016.
[3] A. Babaei, J. Andreoli-Fang, and B. Hamzeh, "On the impact of LTE-U on WiFi performance," in 2014 IEEE 25th Annual International Symposium on Personal, Indoor, and Mobile Radio Communication (PIMRC), Sept 2014, pp. 1621–1625.
[4] E. Almeida, A. M. Cavalcante, R. C. D. Paiva, F. S. Chaves, F. M. Abinader, R. D. Vieira, S. Choudhury, E. Tuomaala, and K. Doppler, "Enabling LTE/WiFi coexistence by LTE blank subframe allocation," in 2013 IEEE International Conference on Communications (ICC), June 2013, pp. 5083–5088.
[5] J. Xiao and J. Zheng, "An adaptive channel access mechanism for LTE-U and WiFi coexistence in an unlicensed spectrum," in 2016 IEEE International Conference on Communications (ICC), May 2016, pp. 1–6.
[6] N. Rupasinghe and I. Guvenc, "Reinforcement learning for licensed-assisted access of LTE in the unlicensed spectrum," in 2015 IEEE Wireless Communications and Networking Conference (WCNC), March 2015, pp. 1279–1284.
[7] S. Chatterjee, M. J. Abdel-Rahman, and A. B. MacKenzie, "Optimal distributed allocation of almost blank subframes for LTE/WiFi coexistence," in 2017 15th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), May 2017, pp. 1–6.
[8] M. G. S. Sriyananda, I. Parvez, I. Guvenc, M. Bennis, and A. I. Sarwat, "Multi-armed bandit for LTE-U and WiFi coexistence in unlicensed bands," in 2016 IEEE Wireless Communications and Networking Conference, April 2016, pp. 1–6.
[9] N. Rastegardoost and B. Jabbari, "Statistical characterization of WiFi white space," IEEE Communications Letters, vol. 21, no. 12, pp. 2674–2677, Dec 2017.
[10] N. Rastegardoost and B. Jabbari, "A stochastic-modeling approach to MAC coexistence of LTE-U and WiFi," in Proceedings of 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communication (PIMRC), Oct. 2017.
[11] N. Rastegardoost and B. Jabbari, "WiFi white spaces for opportunistic LTE-U," in GLOBECOM 2017 - 2017 IEEE Global Communications Conference, Dec 2017, pp. 1–6.
[12] O. Sallent, J. Pérez-Romero, R. Ferrús, and R. Agustí, "Learning-based coexistence for LTE operation in unlicensed bands," in 2015 IEEE International Conference on Communication Workshop (ICCW), June 2015, pp. 2307–2313.
[13] Y. Y. Liu and S. J. Yoo, "Dynamic resource allocation using reinforcement learning for LTE-U and WiFi in the unlicensed spectrum," in 2017 Ninth International Conference on Ubiquitous and Future Networks (ICUFN), July 2017, pp. 471–475.
[14] J. Huang, G. Xing, G. Zhou, and R. Zhou, "Beyond co-existence: Exploiting WiFi white space for ZigBee performance assurance," in 18th IEEE ICNP, Oct 2010, pp. 305–314.