
Hindawi

Mobile Information Systems


Volume 2021, Article ID 6050627, 15 pages
https://doi.org/10.1155/2021/6050627

Research Article
Cellular Traffic Prediction Based on an Intelligent Model

Fawaz Waselallah Alsaade (1) and Mosleh Hmoud Al-Adhaileh (2)

(1) College of Computer Science and Information Technology, King Faisal University, P.O. Box 400, Al-Hofuf, Al-Ahsa, Saudi Arabia
(2) Deanship of E-learning and Distance Education, King Faisal University, P.O. Box 400, Al-Hofuf, Al-Ahsa, Saudi Arabia

Correspondence should be addressed to Mosleh Hmoud Al-Adhaileh; madaileh@kfu.edu.sa

Received 23 June 2021; Revised 11 July 2021; Accepted 22 July 2021; Published 2 August 2021

Academic Editor: Omar Cheikhrouhou

Copyright © 2021 Fawaz Waselallah Alsaade and Mosleh Hmoud Al-Adhaileh. This is an open access article distributed under the
Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided
the original work is properly cited.
The evolution of cellular technology has led to explosive growth in cellular network traffic. Accurate time-series
models for predicting cellular mobile traffic have become very important for increasing the quality of service (QoS) of a network.
Modelling and forecasting cellular network load play an important role in achieving favourable resource
allocation through appropriate bandwidth provisioning while preserving high network utilization. The novelty of the
proposed research is a model that can help intelligently predict traffic load in a cellular network. In this paper, a model
that combines single-exponential smoothing with long short-term memory (SES-LSTM) is proposed to predict cellular traffic.
Min-max normalization was used to scale the network load. The single-exponential smoothing method was applied to
adjust the volumes of network traffic, because network traffic is very complex and takes different forms. The output of the
single-exponential smoothing model was processed by an LSTM model to predict the network load. The intelligent system was
evaluated using real cellular network traffic from a Kaggle dataset. The experimental results revealed
that the proposed method had superior accuracy, achieving R-square values of 88.21%, 92.20%, and 89.81% for three
one-month time intervals, respectively. The predicted values were very close to the observations. A comparison of
the prediction results between the existing LSTM model and the proposed system is presented. The proposed system achieved
superior performance in predicting cellular network traffic.

1. Introduction

With the high-paced development of smartphone technology, it is estimated that there will also be rapid cellular traffic growth. Also, the existence of services with completely different needs can result in perpetually dynamic traffic patterns and network capability requirements. Smartphone Internet not only augments people's lives with entertainment but also provides an increasing amount of necessary information and access to needed services for daily living. Ericsson has estimated a 54% increase in global mobile traffic over 2020, which is the biggest challenge to telecommunication companies in managing the large network flow while increasing the QoS [1]. An accurate estimate of the base-station traffic load in a cellular network in which the number of users varies can greatly help predict the incidence of network congestion, which permits network resources to be allocated efficiently. This is crucial for competitive network support, ordinary maintenance, and the scheduling of resources. There are several applicable reviews on the traffic forecasting of base stations, in particular in public locations where the number of users is always changing [2].

The speed of telecommunication technology and the number of users accessing the mobile internet have both been increasing, which presents many challenges to a cellular network. The presence of many, varied users at densely populated locations (high-speed rail stations, tourist attractions, business centres, playgrounds, sports competitions, concert venues, and many others) can create rapidly increasing cellular traffic that puts massive stress on the network structure [3–6]. Modelling and predicting mobile network traffic can help companies find ways to enhance the QoS of the network.

Traffic-exchange prediction is primarily based on an hourly granularity, which is used to help control the on-demand allocation of network resources in order to decrease network operation costs. For fairs or other large-scale events, the number of users and size of the mobile traffic at public locations ought to be predicted rapidly and appropriately based on the change in users and the tidal effect of the traffic. The modelling and prediction can help operators anticipate upcoming congestion and make network enlargements, adjustments, and optimizations earlier; also, confined Wi-Fi services can be used to fulfil network peaks. Network planning has to evolve in order to allow entry to clients without degrading service in the case of unexpected increases in traffic. Due to the impact of congestion and blocking on a large-scale network, traffic and routing must be scheduled in a timely manner to ensure that the network maintains a proper entry rate, network connections are free in crucial regions, and user access is maintained. Therefore, predicting the cellular network traffic of a base station that serves many users, so that connectivity is maintained in densely populated public locations, is of great importance to network safety [7].

Cellular network traffic prediction plays an important role in the design, management, and optimization modelling of a telecommunication network. The prediction of cellular traffic can support the capacity planning of a network and the improvement of a network's QoS. At the present time, the study of predicting 4G Long-Term Evolution (LTE) and 5G traffic is of significant interest in order to enhance QoS in telecommunications. The prediction of cellular network traffic can be divided into two categories: long-range and short-range prediction. Long-range prediction provides a projection for a long period and is used for validating a detailed predicting network and providing network traffic patterns that can help to more easily design networks. Short-range prediction provides projections for a short period and can help improve networks. Artificial intelligence models have been widely used in many industrial applications, such as developing a prediction model to handle cellular network traffic for the current year. For example, [7] used linear regression and [8] applied support vector machine regression (SVMR) to predict cellular network traffic. A number of studies have applied advanced prediction models based on deep learning (such as LSTM) [9] to cellular network traffic. Shu et al. [10] proposed a convolutional neural network (STDenseNet) to predict cellular traffic.

In the literature, early work covers traffic predictions for circuit-switching networks by developing statistical time-series models based on observation data, like autoregressive integrated moving averages (ARIMA) [11, 12]. Additionally, a number of modern models are used to handle packet data traffic prediction with advanced time-series models based on artificial intelligence in the use of a mobile network [13, 14]. A number of time-series models have been introduced for predicting short-term traffic (in minutes and seconds) by employing deep learning [15, 16]. Some designed a model to predict radio frequency planning [17]. Time-series models were applied to predict loading traffic in telecommunication networks; in previous research works, circuit-switched traffic forecasting was addressed by developing different statistical time-series models based on experimental data. Traditional time-series models like ARIMA were used to estimate short-term network traffic demand, and seasonal ARIMA (SARIMA) was used to predict seasonal traffic [18]; some studies used exponential smoothing models (such as the Holt–Winters method) [19, 20] to find trends and seasonality in traffic demand. Researchers have extended the linear time-series model ARMA to the generalized autoregressive conditionally heteroskedastic (GARCH) technique [12] to predict long-range dependencies. Dietterich [21] proposed a hybrid wavelet-based deep learning framework to predict the number of users connected to a mobile network. Linear regression has also been used (ARIMA) [22, 23].

Currently, advanced time-series models have been used to predict cellular network traffic, along with applied Bayesian linear regression (BLR) [24], advanced learning machines [25], support vector regression (SVR) [26], and artificial neural networks (ANNs) [27–30]. Qiang et al. [31] employed support vector machine regression to forecast daily tourist traffic. In addition, SVR was implemented for toxicity assessment [32], battery life forecasting [33, 34], chemical prediction [35, 36], and financial support [37, 38] and to increase agricultural production through the use of a prediction model [39, 40]. However, research has found it much more challenging to predict loading packets [41]. Machine-learning models have been used to classify abnormalities in circuit-switched traffic. In [42], the short-term traffic volume in a cellular 3G network was predicted by using traditional time-series models like Kalman filtering. In [43], an ARIMA model was applied to predict the use rate in the volume of mobile traffic. Artificial intelligence has been used for deep learning based on LSTM units [44–46]. In [47], a convolutional neural network was used for prediction and modelling traffic spatial dependencies, the same as the approach in [48]. As indicated in [49], deep learning schemes, such as LSTM [50], convolutional neural networks [51], and recurrent neural networks [46], have also been applied to coarser time resolutions (e.g., an hour) to extend the forecasting horizon to several days. Artificial neural network (ANN) models have been introduced to predict network traffic in the short term (minutes and seconds) [52, 53]. The models were used for dynamic radio resource management [54].

In this study, a proposed hybrid model was used to predict cellular network traffic, specifically three occurrences of monthly rush-hour data traffic per cell. The mobile network traffic data had been collected from a real live 4G LTE network. The main contributions of this research are as follows:

(1) Network traffic data is very complex, with many sources of noise and data formats; this makes it a big challenge for researchers to find an accurate model. We have developed a system that can help predict cellular network traffic more intelligently.
(2) We have developed an intelligent system to predict LTE network traffic with superior prediction performance.
2. Materials and Methods

Figure 1 shows the framework of the proposed system to predict 4G mobile network traffic.

2.1. Dataset. The LTE 4G network traffic dataset was identified and downloaded from Kaggle; the data had been collected from 4G cell traffic (i.e., the radio transmitter serving as the device was a 4G cell). All the LTE network traffic was generated from individuals using the mobile cells (although they are not uniquely identified in the data). In the current research, we have utilised three months of the data to examine the proposed system. Table 1 shows the data samples. Figure 2 shows the cellular traffic for the three months being examined. The public dataset is available at https://www.kaggle.com/naebolo/predict-traffic-of-lte-network.

2.2. Normalization. LTE network traffic data is very complex and is composed of underlying signals with very different characteristics. However, finding the transformation behaviour in cellular networks should help improve network traffic prediction models. In order to prevent loading packets with greater numeric values in the network from dominating those with smaller numeric values, the data are scaled; this also increases the processing speed of the model while maintaining good accuracy. A min-max method was used to transform the data to values between zero and one; scaling the data can help in improving the system for predicting network traffic. The two main advantages of scaling are to avoid instances of greater numeric ranges dominating those with smaller numeric ranges and to prevent numerical difficulties during the prediction. The transformation is accomplished as follows:

\[ z_n = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \left( \mathrm{Newmax}_x - \mathrm{Newmin}_x \right) + \mathrm{Newmin}_x, \tag{1} \]

where $x_{\min}$ is the minimum of the data and $x_{\max}$ is the maximum of the data. $\mathrm{Newmin}_x$ is the new minimum, zero, and $\mathrm{Newmax}_x$ is the new maximum, one.

2.3. Single-Exponential Smoothing (SES) Model. The single-exponential smoothing (SES) model is one of the common statistical algorithms used to predict data without a trend or seasonality. The model uses one significant parameter (alpha) to adjust the weight of the observation data for the obtained prediction data. Selecting a value of this parameter depends on the evaluation metrics. The model is defined as follows:

\[ \ell_0 = \bar{X} = \frac{\sum_{t=1}^{n} y_t}{n}, \tag{2} \]

\[ P_{t+1} = \alpha y_t + (1 - \alpha) P_t, \tag{3} \]

where $\ell_0$ is the level of the trend, $X$ is the input sample, $n$ is the number of samples in the dataset, and $y_t$ is the output. The alpha values are $0 \le \alpha \le 1$ for smoothing the training data.

2.4. Long Short-Term Memory (LSTM). The LSTM layer contains a series of many LSTM units that together are called the LSTM model [54, 55]. LSTM models contain three multiplicative units. First, the input gate is used to memorise the information of the present. Second, the output gate is used to display the results. Third, the forget gate is used to select some forgotten information from the past. Multiplicative units consist of a sigmoid function and dot product operation. The sigmoid function has a range between zero and one, while the dot product operation determines the amount of information to transfer. If the value of a dot product operation is zero, information is not transferred, while information is transmitted when the value of a dot product operation is one. The model is described as follows:

\[ f_t = \sigma\left( w_f [h_{t-1}, x_t] + b_f \right), \tag{4} \]

\[ i_t = \sigma\left( w_i [h_{t-1}, x_t] + b_i \right), \tag{5} \]

\[ \tilde{C}_t = \tanh\left( w_C [h_{t-1}, x_t] + b_C \right), \tag{6} \]

\[ C_t = f_t * C_{t-1} + i_t * \tilde{C}_t, \tag{7} \]

\[ o_t = \sigma\left( w_o [h_{t-1}, x_t] + b_o \right), \tag{8} \]

\[ h_t = o_t * \tanh\left( C_t \right), \tag{9} \]

where $i_t$, $f_t$, and $o_t$ are the input, forget, and output gates, respectively, and $h_t$ is the hidden state (output) of the cell. The weights of the neural network are denoted by $w_f$, $w_i$, $w_o$, and $w_C$, and $C_t$ is the internal memory cell state. The biases of the neural network are denoted by $b_f$, $b_i$, $b_o$, and $b_C$; $x_t$ is the network traffic data.

Equation (4) represents the forget gate, which takes the input at time $t$ as the input to the activation function in order to provide its output. Equation (5) represents the input gate, and the parameters are the same as in equation (4). Equation (6) calculates the candidate value in memory, where "tanh" is the activation function. Equation (7) combines memories of the past and the present. Equation (8) represents the output gate, and the parameters are the same as in equation (4). Equation (9) represents the cell output, and "tanh" is the activation function. In all equations, $w$ represents a weight matrix and $b$ represents a bias vector. The parameters of the LSTM model and their values are shown in Table 2.

2.5. Model Evaluation Criteria. The mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), correlation coefficient (R), and squared correlation ($R^2$) metrics are employed as evaluation criteria. The evaluation equations are used to find the difference between the observed and predicted data and are described in the following:
\[
\begin{aligned}
\mathrm{MSE} &= \frac{1}{N} \sum_{t=1}^{N} \left( x_t - \hat{x}_t \right)^2, \\
\mathrm{RMSE} &= \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( x_t - \hat{x}_t \right)^2}, \\
\mathrm{NRMSE} &= \frac{\sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( x_t - \hat{x}_t \right)^2}}{\bar{x}}, \\
R^2 &= \left( 1 - \frac{\sum_{t} \left( x_t - \hat{x}_t \right)^2}{\sum_{t} \left( x_t - \bar{x} \right)^2} \right) \times 100\%,
\end{aligned}
\tag{10}
\]

where $x_t$ are the observed responses, $\hat{x}_t$ are the estimated responses, $\bar{x}$ is the mean of the observed responses, and $N$ is the total number of observations.

Figure 1: Proposed system. (Flowchart blocks: data preprocessing with data exploration, normalization, and splitting of the data into training and testing; single-exponential smoothing (SES); output from SES; SES-LSTM model with bidirectional LSTM; prediction; evaluation metrics MSE, RMSE, correlation, R2.)

Table 1: Input samples.
Time period      Size of sample
January 2018     41,992
February 2018    37,958
March 2018       42,402

3. Experiment Results

In this section, the results of the SES-LSTM model in predicting network traffic are presented.

3.1. Environment Setup. The proposed framework was evaluated using different hardware and software environments. Table 3 shows the equipment used to develop the proposed system.

3.2. Analysis of Results. The cellular network traffic was gathered from a real 4G LTE network over a time interval of three months (01/01/2018 to 30/03/2018) and was used for testing the proposed system. The LSTM model was applied to predict the loading of the cellular traffic derived from the network. Min-max normalization was used to scale the data into an appropriate format. Because network traffic is bursty and highly complex, a single-exponential smoothing method was used to adjust the weighting of the observation values to obtain the new output. Single-exponential smoothing was used to handle overlapping values in order to improve the LSTM results. The SES model depends on the smoothing constant alpha. Values of alpha from 0.1 to 0.5 were examined. According to the MSE metric, we found that 0.5 was an appropriate value to obtain a good prediction. The data sets were divided into 80% training and 20% testing. The hybrid model obtained superior results; the prediction values were very close to the observed values according to the evaluation metrics. Table 4 shows the numbers of samples in the training and testing stages.

3.2.1. Training of the Hybrid Model. Eighty percent of the cellular network traffic dataset was used for the training process. The empirical results of the hybrid system in the training phase were superior in predicting the loading traffic in the cellular network.

Table 5 presents the prediction results of the SES-LSTM model during the training process. The prediction results were close to the observation data, according to the evaluation criteria. The MSE values were 8.93 × 10^-5, 0.000104, and 3.1547 × 10^-5 for the months of January, February, and March 2018, respectively.

Figure 3 shows the time-series plots of the hybrid model for predicting loading traffic. The x-axis represents the sample index, and the y-axis represents the prediction error for each sample. The prediction errors were small according to the evaluation metrics, namely, the MSE, RMSE, and NRMSE. The prediction errors of the January, February, and March 2018 input data were MSE = 8.93 × 10^-5, MSE = 0.000104, and MSE = 3.1547 × 10^-5, respectively.
Figure 2: Cellular traffic for the three months being examined: (a) January 2018 (mean = 0.0674, STD = 0.031657), (b) February 2018 (mean = 0.07976, STD = 0.037246), and (c) March 2018 (mean = 0.044326, STD = 0.017813).

Table 2: Significant values of the LSTM parameters.
Parameter                   Value
No. of hidden layers        4
Max. epochs                 20
Min. batch size             32
Max. iterations             100
Shallow hidden layer size   [29, 49]
Delays                      [1, 2, 4, 8]
Optimizer                   Adam

Table 3: System requirements.
Hardware/software    Environment
Operating system     Windows 10
CPU                  Intel Core i5
Memory               4
MATLAB               R2020a Academic

Table 4: Splitting loading traffic data.
Input data       Training    Testing
January 2018     33,577      8,394
February 2018    30,050      7,587
March 2018       33,905      8,476

Table 5: Performance of the SES-LSTM model in the training phase.
Time period      MSE               RMSE       NRMSE
January 2018     8.93 × 10^-5      0.00944    0.1437
February 2018    0.000104          0.01020    0.123
March 2018       3.1547 × 10^-5    0.00561    0.1259

Figure 3: Time-series plots for predicting cellular network traffic at the training phase (error per sample): (a) January 2018 (MSE = 8.93e-05, RMSE = 0.0094499, NRMSE = 0.14392), (b) February 2018 (MSE = 0.00010412, RMSE = 0.010204, NRMSE = 0.12318), and (c) March 2018 (MSE = 3.1547e-05, RMSE = 0.0056167, NRMSE = 0.12599).

Figure 4 illustrates the error histograms obtained from the SES-LSTM model at the training phase for predicting the loading traffic. Error histograms show the distribution of the differences between the observation and prediction data. In the training phase, the mean error in the histogram is 0.00192 for the training data of January 2018, as shown in Figure 4(a); in February 2018, the mean error is -0.0025, as shown in Figure 4(b), and the mean error of March 2018 is 5.44 × 10^-5, as shown in Figure 4(c).

Figure 4: Histogram error plots for the proposed system in predicting cellular loading traffic at the training phase: (a) January 2018 (error mean = 0.0019216, error StD = 0.0092526), (b) February 2018 (error mean = -0.0025521, error StD = 0.0098796), and (c) March 2018 (error mean = 5.44e-05, error StD = 0.0056165).

3.2.2. Testing of the Hybrid Model. The testing phase was used to validate and evaluate the SES-LSTM model in predicting the loading of cellular network traffic. The testing stage uses unseen data to forecast future traffic. Table 6 presents the testing results of the proposed system for the three-month time period of the data. According to the evaluation metrics, the proposed system achieved the best prediction results, with MSE values of 0.000175, 8.6238 × 10^-5, and 2.9927 × 10^-5 for the three months (January, February, and March 2018, respectively) in the testing stage.

Table 6: Performance of the SES-LSTM model in the testing phase.
Time period      MSE               RMSE       NRMSE
January 2018     0.000175          0.0132     0.1783
February 2018    8.6238 × 10^-5    0.0092     0.137
March 2018       2.9927 × 10^-5    0.00547    0.1264

The time-series plots of the SES-LSTM model in predicting loading traffic are presented in Figure 5. The prediction values were very close to the observation values according to the evaluation metrics.

In addition, Figure 6 displays the error histograms obtained from the hybrid SES-LSTM model. The histogram for the testing process shows the difference between the observations and the predictions obtained for unseen data as future loading traffic. The means and standard deviations of the histogram errors are shown at the tops of the graphic representations. It was noted that the histogram error of the SES-LSTM model was very low for forecasting future load. The maximum mean error (0.00380) of the histogram is shown in Figure 6(a). The histogram errors in the testing phase demonstrated the effectiveness and efficiency of the proposed system.

4. Results and Discussion

The self-sufficient prediction of cellular network traffic demand will be a key function in future telecommunication companies. Considering the fact that e-business, banking, and industrial business enterprises are notably associated with special and valued information that is communicated inside a network, it is far from meaningless to mention the significance of network traffic analysis in achieving suitable information security. Cellular network traffic analysis and prediction is a proactive strategy for maintaining a healthy system; the network is also monitored to make sure that security breaches do not arise inside it. Cellular network traffic prediction is an important phase for developing a growing successful system, protecting it, preventing congestion through control schemes, and discovering abnormal packets in the network traffic. The significance of this integral subject matter and our urge to contribute to solving the research problem of intelligent cellular traffic prediction are the essential purpose of this study.

Modelling and predicting network traffic can help in updating the polling on a cellular network. In previous studies, researchers used statistical approaches to predict the loading network traffic. In this study, we have developed a hybrid SES-LSTM model to predict loading traffic for a 4G LTE network. Single-exponential smoothing was applied to adjust the observation values in the computations. Prediction values obtained from the SES method were processed by using a deep learning model.

Table 7 shows the empirical results of the SES-LSTM model and the existing LSTM model; it is noted that the proposed SES-LSTM model was superior compared with the existing deep learning LSTM model. According to the individual correlation metrics, the prediction accuracy of the January 2018 data was $R^2$ = 88.21%; the prediction accuracy of the February 2018 data was $R^2$ = 92.09%; and the prediction accuracy of the March 2018 data was $R^2$ = 89.81% in the training phase. Figure 7 shows the correlation plots in the training phase for the prediction of cellular loading traffic by using our proposed SES-LSTM model. In addition, Figure 8 shows the regression plots for the predicted cellular loading traffic by using the existing LSTM model in the training phase. This plot is used to find the relationship between the predicted and the actual values by using Pearson's correlation coefficient. It was observed that the SES-LSTM model outperformed the existing system.

The hybrid model was appropriate for predicting unseen load traffic in a cellular network. The experimental
8 Mobile Information Systems

MSE = 0.00017541, RMSE = 0.013244, NRMSE = 0.17836 MSE = 8.6238e – 05, RMSE = 0.0092864, NRMSE = 0.13725
0.12 0.1

0.1 0.08

0.08
0.06
0.06
0.04
0.04
0.02
0.02
0
0

–0.02 –0.02

–0.04 –0.04
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 0 1000 2000 3000 4000 5000 6000 7000 8000
Error per sample Error per sample

(a) (b)
MSE = 2.9927e – 05, RMSE = 0.0054705, NRMSE = 0.12647
0.05

0.04

0.03

0.02

0.01

–0.01

–0.02
0 1000 2000 3000 4000 5000 6000 7000 8000 9000
Error per sample

(c)

Figure 5: Time-series plots of predicting cellular network traffic at the testing phase: (a) January 2018, (b) February 2018, and (c) March
2018.

Error mean = 0.0038082, error StD = 0.012686 Error mean = –0.0032252, error StD = 0.008709
1400 1500

1200

1000
1000

800

600

500
400

200

0 0
–0.02 0 0.02 0.04 0.06 0.08 0.1 0.12 –0.02 0 0.02 0.04 0.06 0.08
Error histogram Error histogram

(a) (b)
Figure 6: Continued.
Mobile Information Systems 9

Error mean = 7.737e – 05, error StD = 0.0054667


1200

1000

800

600

400

200

0
–0.01 0 0.01 0.02 0.03 0.04
Error histogram
(c)

Figure 6: Histogram error plots of the proposed system of predicting cellular loading traffic at the testing phase: (a) January 2018,
(b) February 2018, and (c) March 2018.

Table 7: Performance of the SES-LSTM and existing LSTM model systems in the training phase.
Time period Models R2 (%)
Proposed SES-LSTM 88.20
January 2018
Existing LSTM 6.01
Proposed SES-LSTM 92.09
February 2018
Existing LSTM 5.22
Proposed SES-LSTM 89.81
March 2018
Existing LSTM 16.07

Train data, R2 = 0.8821 Train data, R2 = 0.92095


0.25 0.3

0.25
0.2

0.2
0.15
Target

Target

0.15
0.1
0.1

0.05
0.05

0 0
0 0.05 0.1 0.15 0.2 0.25 0 0.05 0.1 0.15 0.2 0.25 0.3
Prediction Prediction

Data Data
Fit Fit
Y=T Y=T

(a) (b)
Figure 7: Continued.
10 Mobile Information Systems

Train data, R2 = 0.89816


0.18

0.16

0.14

0.12

0.1

Target
0.08

0.06

0.04

0.02

0
0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18
Prediction

Data
Fit
Y=T
(c)

Figure 7: Regression plots of the SES-LSTM model at the training phase: (a) January 2018, (b) February 2018, and (c) March 2018.

Train data, R2 = –6.0176 Train data, R2 = –5.2251


0.45 3000

0.4
2500
0.35

0.3 2000

0.25
Target

Target

1500
0.2

0.15 1000

0.1
500
0.05

0 0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 1000 2000 3000 4000 5000 6000
Prediction Prediction

Data Data
Fit Fit
Y=T Y=T

(a) (b)
Figure 8: Continued.
Mobile Information Systems 11

Train data, R2 = –16.0759


4000

3500

3000

2500

Target
2000

1500

1000

500

0
0 1000 2000 3000 4000 5000 6000 7000 8000 9000
Prediction

Data
Fit
Y=T
(c)

Figure 8: Regression plot of the existing LSTM model at the training phase: (a) January 2018, (b) February 2018, and (c) March 2018.

Test data, R2 = 0.88209 Test data, R2 = 0.86164


0.3 0.2

0.18
0.25
0.16

0.14
0.2
0.12
Target

Target

0.15 0.1

0.08
0.1
0.06

0.04
0.05
0.02
0 0
0 0.05 0.1 0.15 0.2 0.25 0.3 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18
Prediction Prediction

Data Data
Fit Fit
Y=T Y=T

(a) (b)
Figure 9: Continued.
12 Mobile Information Systems

Test data, R2 = 0.87247


0.14

0.12

0.1

0.08

Target
0.06

0.04

0.02

0
0 0.02 0.04 0.06 0.08 0.1 0.12 0.14
Prediction

Data
Fit
Y=T
(c)

Figure 9: Regression plots of the SES-LSTM model at the testing phase: (a) January 2018, (b) February 2018, and (c) March 2018.

Test data, R2 = –5.7886 Test data, R2 = –6.6873


0.4 2500
0.35
2000
0.3
0.25 1500
Target
Target

0.2
0.15 1000

0.1
500
0.05
0 0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0 1000 2000 3000 4000 5000 6000
Prediction Prediction

Data Data
Fit Fit
Y=T Y=T

(a) (b)
2
Test data, R = –18.311
2000
1800
1600
1400
1200
Target

1000
800
600
400
200
0
0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000
Prediction

Data
Fit
Y=T

(c)

Figure 10: Regression plot of the existing LSTM model at the testing phase: (a) January 2018, (b) February 2018, and (c) March 2018.
Mobile Information Systems 13

results of the proposed model in the testing phase were optimal. The prediction accuracy in the testing phase was R² = 88.20% for the January 2018 data, R² = 86.16% for the February 2018 data, and R² = 87.24% for the March 2018 data. Figure 9 shows the regression plots of the SES-LSTM model for the prediction of cellular loading traffic. The graphical representations of the prediction results of the existing LSTM system are displayed in Figure 10. Overall, the SES-LSTM model achieved better results on the unseen data than the existing LSTM model. We believe the efficiency and effectiveness of our proposed system will help improve network traffic by preventing congestion and providing good planning for any network.

5. Conclusion

Network traffic modelling and forecasting play an important role in determining network performance. These models can also help to obtain accurate data for interpreting the important characteristics of traffic, which requires very efficient analytical study. Thus, modelling network traffic has become an essential part of assisting the design of networks and controlling bandwidth waste. A good network traffic prediction model should be able to capture prominent traffic characteristics, such as long-range dependence (LRD), short-range dependence (SRD), and self-similarity. In this study, a hybrid SES-LSTM model was proposed to predict network traffic from real cellular 4G LTE network data. We draw the following conclusions:

(i) Measuring 4G LTE network behaviours can be attained only if an accurate model is designed. Our system can intelligently enhance the quality of service (QoS) of a cellular network for best future performance.
(ii) Real 4G LTE network data were used to evaluate and examine the proposed system.
(iii) The proposed system was novel in that it combined a statistical SES model with an advanced artificial intelligence LSTM model to improve the accuracy of the prediction values.
(iv) The hybrid SES-LSTM model has shown optimal results with fewer prediction errors.
(v) The results of the proposed system were compared with those of an existing LSTM model; the proposed hybrid model achieved superior prediction results.
(vi) We believe that the proposed system can be used in any real-time application for predicting future demand.

Data Availability

The public dataset is available at https://www.kaggle.com/naebolo/predict-traffic-of-lte-network.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors extend their appreciation to the Deanship of Scientific Research at King Faisal University for funding this research work through project no. 216016.
14 Mobile Information Systems

[14] C. W. Huang, C. T. Chiang, and Q. Li, “A study of deep [32] F.-K. Wang and T. Mamo, “A hybrid model based on support
learning networks on mobile traffic forecasting,” in Pro- vector regression and differential evolution for remaining
ceedings of the IEEE 28th Annual International Symposium on useful lifetime prediction of lithium-ion batteries,” Journal of
Personal, Indoor, and Mobile Radio Communications Power Sources, vol. 401, pp. 49–54, 2018.
(PIMRC), pp. 1–6, Montreal, Canada, October 2017. [33] J. Wei, G. Dong, and Z. Chen, “Remaining useful life pre-
[15] L. Fang, X. Cheng, H. Wang, and L. Yang, “Mobile demand diction and state of health diagnosis for lithium-ion batteries
forecasting via deep graph-sequence spatiotemporal modeling using particle filter and support vector regression,” IEEE
in cellular networks,” IEEE Internet of Things Journal, vol. 5, Transactions on Industrial Electronics, vol. 65, no. 7,
no. 4, pp. 3091–3101, 2018. pp. 5634–5643, 2018.
[16] A. R. Mishra, Fundamentals of Network Planning and Opti- [34] G. Golkarnarenji, M. Naebe, K. Badii, A. S. Milani, R. N. Jazar,
misation 2G/3G/4G: Evolution to 5G, John Wiley & Sons, and H. Khayyam, “Support vector regression modelling and
Hoboken, NJ, USA, 2nd edition, 2018. optimization of energy consumption in carbon fiber pro-
[17] Y. Yu, J. Wang, M. Song, and J. Song, “Network traffic duction line,” Computers & Chemical Engineering, vol. 109,
prediction and result analysis based on seasonal ARIMA and pp. 276–288, 2018.
correlation coefficient,” in Proceedings of the 2010 Interna- [35] K. Sivaramakrishnan, J. Nie, A. de Klerk, and V. Prasad, “Least
tional Conference on Intelligent System Design and Engineering squares-support vector regression for determining product
Application, pp. 980–983, Changsha, China, October 2010. concentrations in acid-catalyzed propylene oligomerization,”
[18] D. Tikunov and T. Nishimura, “Traffic prediction for mobile Industrial and Engineering Chemistry Research, vol. 57,
network using Holt-Winter’s exponential smoothing,” in pp. 13156–13176, 2018.
Proceedings of the 2007 15th International Conference on [36] H. Jiang and W. W. He, “Grey relational grade in local support
Software, Telecommunications and Computer Networks, vector regression for financial time series prediction,” Expert
pp. 1–5, Split, Croatia, September 2007. Systems with Applications, vol. 39, pp. 2256–2262, 2012.
[19] J. A. Bastos, “Forecasting the capacity of mobile networks,” [37] Y. Peng, P. H. M. Albuquerque, J. M. Camboim de Sá,
Telecommunication Systems, vol. 72, no. 2, pp. 231–242, 2019. A. J. A. Padula, and M. R. Montenegro, “The best of two
[20] C. Qiu, Y. Zhang, Z. Feng, P. Zhang, and S. Cui, “Spatio- worlds: forecasting high frequency volatility for crypto-
temporal wireless traffic prediction with recurrent neural currencies and traditional currencies with support vector
network,” IEEE Wireless Communications Letters, vol. 7, no. 4, regression,” Expert Systems with Applications, vol. 97,
pp. 554–557, 2018. pp. 177–192, 2017.
[21] T. G. Dietterich, Machine Learning for Sequential Data: A [38] M. A. Ghorbani, S. Shamshirband, D. Z. Haghi, A. Azani,
Review, Springer, Berlin, Germany, 2002. H. Bonakdari, and I. Ebtehaj, “Application of firefly algo-
[22] R. W. Kinney Jr., “ARIMA and regression in analytical review: rithm-based support vector machines for prediction of field
an empirical test,” The Accounting Review, vol. 53, no. 1, capacity and permanent wilting point,” Soil and Tillage Re-
pp. 48–60, 1978. search, vol. 172, pp. 32–38, 2017.
[23] T. J. Mitchell and J. J. Beauchamp, “Bayesian variable selection [39] A. Felipe, M. Marco, O. Miguel, Z. Alex, and F. Claudio, “A
in linear regression,” Journal of the American Statistical As- method to construct fruit maturity color scales based on
sociation, vol. 83, no. 404, pp. 1023–1032, 1988. support vector machines for regression: application to olives
[24] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, “Extreme learning and grape seeds,” Journal of Food Engineering, vol. 162,
machine: theory and applications,” Neurocomputing, vol. 70, pp. 9–17, 2015.
no. 1–3, pp. 489–501, 2006. [40] J. Bastos, “Forecasting the capacity of mobile networks,”
[25] B. Schölkopf, A. J. Smola, R. C. Williamson, P. L. Bartlett, and Telecommunication Systems, vol. 72, no. 2, pp. 1–12, 2019.
L. B. Peter, “New support vector algorithms,” Neural Com- [41] M. D. Jnr, J. D. Gadze, and D. K. Anipa, “Short-term traffic
putation, vol. 12, no. 5, pp. 1207–1245, 2000. volume prediction in UMTS networks using the Kalman filter
[26] M. Paliwal and U. A. Kumar, “Neural networks and statistical algorithm,” International Journal of Mobile Network Com-
techniques: a review of applications,” Expert Systems with munications and Telematics, vol. 3, pp. 31–40, 2013.
Applications, vol. 36, no. 1, pp. 2–17, 2009. [42] Y. Hua, Z. Zhao, Z. Liu, X. Chen, R. Li, and H. Zhang, “Traffic
[27] D. P. Mandic and J. Chambers, Recurrent Neural Networks for prediction based on random connectivity in deep learning
Prediction: Learning Algorithms, Architectures and Stability, with long short-term memory,” in Proceedings of the 2018
John Wiley & Sons, Inc., Hoboken, NJ, USA, 2001. IEEE 88th Vehicular Technology Conference (VTC-Fall),
[28] K. Greff, R. K. Srivastava, J. Koutnı́k, B. R. Steunebrink, and pp. 1–6, Chicago, IL, USA, August 2018.
J. Schmidhuber, “LSTM: a search space odyssey,” IEEE [43] H. D. Trinh, L. Giupponi, and P. Dini, “Mobile traffic pre-
Transactions on Neural Networks and Learning Systems, diction from raw data using LSTM networks,” in Proceedings
vol. 28, no. 10, pp. 2222–2232, 2017. of the IEEE 29th Annual International Symposium on Per-
[29] C. Harpham, C. W. Dawson, and M. R. Brown, “A review of sonal, Indoor and Mobile Radio Communications (PIMRC),
genetic algorithms applied to training radial basis function pp. 1827–1832, Bologna, Italy, September 2018.
networks,” Neural Computing and Applications, vol. 13, no. 3, [44] J. Feng, X. Chen, R. Gao, M. Zeng, and Y. Li, “DeepTP: an
pp. 193–201, 2004. end-to-end neural network for mobile cellular traffic pre-
[30] R. Chen, C.-Y. Liang, W.-C. Hong, and D.-X. Gu, “Fore- diction,” IEEE Network, vol. 32, no. 6, pp. 108–115, 2018.
casting holiday daily tourist flow based on seasonal support [45] L. Nie, D. Jiang, S. Yu, and H. Song, “Network traffic pre-
vector regression with adaptive genetic algorithm,” Applied diction based on deep belief network in wireless mesh
Soft Computing, vol. 26, pp. 435–443, 2015. backbone networks,” in Proceedings of the 2017 IEEE Wireless
[31] S. Qiang, W. Lu, D. S. Du, F. X. Chen, B. Niu, and K. C. Chou, Communications and Networking Conference (WCNC),
“Prediction of the aquatic toxicity of aromatic compounds to pp. 1–5, San Francisco, CA, USA, March 2017.
tetrahymena pyriformis through support vector regression,” [46] C. W. Huang, C. T. Chiang, and Q. Li, “A study of deep
Oncotarget, vol. 8, pp. 49359–49369, 2017. learning networks on mobile traffic forecasting,” in
Mobile Information Systems 15

Proceedings of the IEEE 28th Annual International Symposium


on Personal, Indoor, and Mobile Radio Communications
(PIMRC), pp. 1–6, Montreal, Canada, October 2017.
[47] H. Assem, B. Caglayan, T. S. Buda, and D. O’Sullivan, “ST-
DenNetFus: a new deep learning approach for network de-
mand prediction,” in Proceedings of the Joint European
Conference on Machine Learning and Knowledge Discovery in
Databases, pp. 222–237, Dublin, Ireland, September 2018.
[48] C. Zhang and P. Patras, “Long-term mobile traffic forecasting
using deep spatio-temporal neural networks,” in Proceedings
of the 18th ACM International Symposium on Mobile Ad Hoc
Networking and Computing, pp. 231–240, Los Angeles, CA,
USA, June 2018.
[49] J. Wan, J. Tang, Z. Xu et al., “Spatiotemporal modeling and
prediction in cellular networks: a big data enabled deep
learning approach,” in Proceedings of the IEEE Conference on
Computer Communications (INFOCOM), pp. 1–9, Atlanta,
GA, USA, May 2017.
[50] C. Zhang, H. Zhang, D. Yuan, and M. Zhang, “Citywide
cellular traffic prediction based on densely connected con-
volutional neural networks,” IEEE Communications Letters,
vol. 22, no. 8, pp. 1656–1659, 2018.
[51] C. Qiu, Y. Zhang, Z. Feng, P. Zhang, and S. Cui, “Spatio-
temporal wireless traffic prediction with recurrent neural
network,” IEEE Wireless Communications Letters, vol. 7, no. 4,
pp. 554–557, 2018.
[52] L. Fang, X. Cheng, H. Wang, and L. Yang, “Mobile demand
forecasting via deep graph-sequence spatiotemporal modeling
in cellular networks,” IEEE Internet of Things Journal, vol. 5,
no. 4, pp. 3091–3101, 2018.
[53] N. Bui, M. Cesana, S. A. Hosseini, Q. Liao, I. Malanchini, and
J. Widmer, “A survey of anticipatory mobile networking:
context-based classification, prediction methodologies, and
optimization techniques,” IEEE Communications Survey and
Tutorials, vol. 19, no. 3, pp. 1790–1821, 2017.
[54] M. I. A. Ibrahim, T. H. H. Aldhyani, M. H. Al-Adhaileh et al.,
“Human-animal affective robot touch classification using
deep neural network,” Computer Systems Science and Engi-
neering, vol. 38, no. 1, pp. 25–37, 2021.
[55] H. Alkahtani, H. Theyazn, and H. Aldhyani, “Intrusion de-
tection system to advance internet of things infrastructure-
based deep learning algorithms,” Complexity, vol. 2021, Ar-
ticle ID 5579851, 18 pages, 2021.
