Article history: Received 27 January 2012; Received in revised form 3 July 2012; Accepted 5 July 2012; Available online 6 September 2012.

Keywords: Passive network monitoring; Massive network data; Time series; Packet loss

Abstract

With the expansion of computer networks, there is a strong need for monitoring their properties in order to diagnose any problems and manage them in the best possible way. This monitoring is particularly useful if performed in real time; however, such an approach is rather difficult (if not impossible) to implement in networks with increased traffic using a passive monitoring scheme. One way to overcome this problem is to selectively sample network data, which in turn opens new issues, such as how frequently this sampling should be performed so as to obtain useful and exploitable data. In this work it is shown that it is possible to accurately represent high-speed network traffic using suitable time series models and then determine the size of the sampling window, so as to detect packet loss. The resulting scheme is scalable, protocol-independent and able to raise alerts in real time.

© 2012 Elsevier B.V. All rights reserved.
1. Introduction
During the past few years there has been a significant rise in computer technology and network engineering, which has rendered the monitoring and diagnosing of computer networks a vital task. Essentially, this task can be split into a data-gathering phase (from data streams) and a processing phase. The former faces the challenge of where to store the massive amounts of data so as to enable their processing. What is more, for high-speed networks this may be an almost impossible task, as the rate at which the data is gathered can be so high that it could easily accumulate to several terabytes per week. By examining the captured data, conclusions can be drawn on certain properties of the given network, one of which is packet loss. Alternative ways of accurately measuring packet loss involve pre-installing appropriate sensors/processors at strategic points in a network. However, such an approach is not possible in all cases.
Data from different sources needs to be merged and collated before any conclusions on the network’s behaviour can be
drawn. To aid the analysis of the gathered data, some initial processing needs to take place, which usually consists of either
clustering or filtering. There are two main ways in which clustering can be performed: hierarchical or partitional. The former attempts to build a hierarchy of clusters and is further divided into agglomerative (bottom-up) and divisive (top-down) approaches. The latter attempts to directly decompose the data set into a set of disjoint clusters. In addition, these two methods can be combined and may yield better results [1].
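As a toy illustration of the agglomerative (bottom-up) strategy, the sketch below merges one-dimensional points by single linkage until a target number of clusters remains. The function name and data are hypothetical; a real system would cluster multi-feature network records with a library implementation.

```python
# Minimal single-linkage agglomerative clustering on 1-D points
# (illustrative sketch only, not the clustering used in [1]).

def agglomerate(points, k):
    """Bottom-up: start with one cluster per point, repeatedly merge
    the two adjacent clusters with the smallest gap, until k remain."""
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > k:
        # single-linkage distance between adjacent (sorted) clusters
        gaps = [clusters[i + 1][0] - clusters[i][-1] for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters

# Two obvious groups of inter-arrival times (ms)
print(agglomerate([1.0, 1.2, 1.1, 9.8, 10.0, 10.3], 2))
# -> [[1.0, 1.1, 1.2], [9.8, 10.0, 10.3]]
```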
Since data gathering is a temporal process, it is well suited to modelling with time series. Besides, such models have successfully been applied to network traffic analysis for anomaly detection [2], which is closely related to our aims and objectives. Initially, measures of time series analysis can be used for investigating normality, possible correlation and linear trend
⇑ Corresponding author.
E-mail addresses: thanosb@math.auth.gr (T. Vafeiadis), alpapanik@teilar.gr (A. Papanikolaou), iliou@it.teithe.gr (C. Ilioudis), v13@it.teithe.gr (S. Charchalakis).
1569-190X/$ - see front matter © 2012 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.simpat.2012.07.002
T. Vafeiadis et al. / Simulation Modelling Practice and Theory 29 (2012) 173–180
in the data. A secondary goal is to identify patterns in the time series (such as seasonality) and to fit an ARMA/ARIMA model. The analysis of the gathered IP records poses some difficulty due to the increased bias of network traffic. The other problem we deal with is finding large changes (gaps) in the normal behaviour of the gathered IP data sets. We are interested in the number of packets that are lost while a network connection is active, and we aim to find how this number affects the overall behaviour of the network traffic. In order to distinguish between significant and non-significant changes in the behaviour of the network traffic, we use the packet loss ratio along with a 'sliding time window' of fixed length [3]. With the aid of Matlab, we generate Monte-Carlo realisations for this combination of parameters in order to determine the threshold values that allow us to decide whether there are significant changes or not.
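The paper runs these Monte-Carlo experiments in Matlab; the snippet below is a simplified Python sketch of the same idea (the names, the background loss level and the injected burst are all illustrative): generate a synthetic loss-ratio series, slide a fixed-length window over it, and count the distinct intervals whose windowed mean exceeds a candidate threshold.

```python
import random

def count_flagged_windows(loss_ratio, k, w):
    """Count distinct time intervals whose mean loss ratio reaches the
    threshold k, using a sliding window of w consecutive samples."""
    flagged = 0
    in_event = False
    for t in range(len(loss_ratio) - w + 1):
        window = loss_ratio[t:t + w]
        if sum(window) / w >= k:
            if not in_event:      # count contiguous exceedances once
                flagged += 1
                in_event = True
        else:
            in_event = False
    return flagged

rng = random.Random(0)
# One Monte-Carlo realisation: background loss below 1%, one burst of 10%
series = [0.01 * rng.random() for _ in range(1000)]
series[400:460] = [0.10] * 60
print(count_flagged_windows(series, k=0.05, w=60))   # -> 1
```

Repeating this over many realisations for each (k, w) pair is what allows thresholds to be chosen so that genuine loss events are flagged while background fluctuations are not.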
Our results show that it is possible to accurately represent high-speed network traffic using suitable time series models
and then determine the size of the required parameters (packet loss ratio, sliding window), so as to find gaps in the network
traffic. Once the system parameters have been determined and monitoring of the network has started, an alert can be raised
in real time when increased packet loss is detected.
The paper is organised as follows: Section 2 presents related work on the various techniques for capturing IP network traffic and for modelling massive IP data. Section 3 presents the process for choosing an appropriate time series model that is able to represent the gathered data set. Section 4 exploits the time series models established in Section 3: by simulating such traffic, the optimal parameters for the sampling window are established, thus evaluating the usefulness of the proposed scheme. Finally, the various findings are summarised in Section 5, together with some directions for future research.
2. Related work
The various techniques for estimating packet loss in networks fall mainly into two categories: active techniques, which inject additional packets of some kind to aid their measurements, and passive techniques, which record transmitted traffic and process it in order to extract information related to packet loss. The obvious difference is that active techniques alter the traffic characteristics through the addition of the extra packets.
When it comes to high-speed networks and passive traffic capturing, a distributed or distributed-like approach is usually implemented in the vast majority of the related work. Due to the great amount of data that needs to be processed, several sensors/processors are employed, each of them dealing with a fraction of the total traffic, as in the case of the DiCAP architecture [4]. In such cases, some equipment usually needs to be pre-installed at specific locations.
An early approach was the tool 'Sting' [5], which overcame the limitation of requiring two cooperative hosts by measuring the link loss rate from a client to any TCP-based server on the Internet, based on the loss recovery algorithms of the TCP protocol. An alternative scheme was proposed in [6], where the sequence numbers in TCP were exploited for deducing the packet loss. The obvious drawback of these two schemes is that they are limited to the TCP protocol.
Hash-based packet identification was the core of the technique proposed in [7] for passively monitoring packet loss. Nevertheless, this technique needs to stop monitoring in order to compute the packet loss. The scheme proposed in [8] can be regarded as an improvement on the previous one, since it shares similar characteristics, can be performed on-line and is more lightweight. In addition, it features an accurate measurement of the packet loss rate and is able to pinpoint the packet loss rate of individual traffic flows. The main difference between the two schemes is that the former hashes the packets' payloads and correlates them, whereas the latter matches packets to flows and then compares flows with each other for computing the packet loss.
Attempting to deduce packet loss information by examining stream data can be regarded as performing database queries on such data. For processing high-speed data stream queries, one approach is the use of a 'sliding window', which tends to be preferred for many applications [3]. An advantage of this approach is that it emphasises the most recent data, which also applies to the case of packet loss detection.
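A time-based sliding window over a packet stream can be sketched with a double-ended queue; this is an illustrative implementation (class and field names are made up), not the authors' code.

```python
from collections import deque

class SlidingLossWindow:
    """Keep only the samples from the last w seconds and expose the
    loss ratio over that window, emphasising the most recent data."""
    def __init__(self, w_seconds):
        self.w = w_seconds
        self.samples = deque()          # (timestamp, packets, lost)

    def add(self, t, packets, lost):
        self.samples.append((t, packets, lost))
        while self.samples and self.samples[0][0] <= t - self.w:
            self.samples.popleft()      # expire samples older than w seconds

    def loss_ratio(self):
        total = sum(p for _, p, _ in self.samples)
        lost = sum(l for _, _, l in self.samples)
        return lost / total if total else 0.0

win = SlidingLossWindow(w_seconds=600)
for t in range(0, 1200, 10):            # one reading every 10 s, as in the paper
    win.add(t, packets=100, lost=8 if t >= 900 else 0)
print(round(win.loss_ratio(), 2))       # -> 0.04
```

Because old samples are discarded as new ones arrive, memory usage stays bounded regardless of how long the stream runs.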
The aforementioned schemes have various limitations. They are either applicable only to specific protocols or require additional equipment to be pre-installed at certain network nodes (usually active schemes) to aid the measurement of packet loss. The proposed scheme uses a statistical approach and is therefore applicable to any kind of network traffic and/or protocols. In particular, it is a passive, single-point (namely, it does not require any additional sensors/processors) network monitoring scheme, suitable for high-speed networks. The problem of large network data is overcome by periodically capturing traffic samples and examining them to establish whether they signal a situation of significant packet loss. The sampling interval is established by fitting suitable ARMA/ARIMA time series models to previously captured traffic. Finally, the scheme is scalable, meaning that data from additional sources can be sampled in the same manner, nevertheless limited by the physical hardware and/or software constraints.
The modelling of network traffic can be divided into the traditional Markov-based approach, the time series analysis method, and the neural network method [9]. Markov chains are short-memory processes, hence failing to capture certain statistical properties of a packet stream, such as its burstiness, and are thus unsuitable for modelling long-range dependent processes such as network traffic [10,11]. Modelling such processes is only feasible if either the active or both the active and the idle periods of transmission are heavy-tailed distributed [12], which was not the case in our data set. Neural networks are non-linear statistical data modelling tools that can be used to model complex relationships between inputs and outputs or to find patterns in data [13]. Several neural-network-related techniques have been proposed that focus on the analysis of large
data [14,15]. However, in order to be effective in anomaly detection applications, they require adequate amounts of both normal and abnormal data for their training before they can produce accurate results [16]. Nevertheless, since it was unknown whether enough such data was available for our experiments, the use of neural networks did not qualify as a potentially good choice.
Time series methods have also been proposed for modelling network traffic, e.g. AR, MA, ARMA, ARIMA and FARIMA models [17,18]. The usage of ARMA/ARIMA models for the prediction of network traffic is presented in [19,20], where they are applied for modelling the end-to-end delay in network traffic.
The work presented in this paper uses ARMA and ARIMA modelling techniques, which accurately reflect the effect of packet auto-correlation. The minimum size of the sampling interval with respect to the overall packet traffic is estimated, so as to enable the detection of packet loss instances.
3. Choosing a time series model
The data used for the experiments that follow was obtained from the daily network traffic of an academic institute. Daily IP records on three distinct, non-successive working days were obtained by taking readings every 10 s. The application used for acquiring the data (written in Python) can be run on a firewall and monitor the network traffic in real time. Its objective is to extract information by processing the data flows travelling through the firewall. For the sake of simplicity, only TCP protocol data was recorded, due to its connection-oriented nature. In particular, the application monitored the formation and the termination of each data transfer between two ends, and upon termination of a data transfer it produced a summary of statistical data for the flow. For the purposes of our experiments it was sufficient to examine only the meta-data and not the actual data that was transferred.
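The capture application itself is not published; the following sketch, with hypothetical event and field names, illustrates the kind of per-flow processing described: track each TCP transfer from formation to termination and emit a statistical summary of its meta-data when it ends.

```python
# Illustrative per-flow meta-data aggregation (not the authors' tool).
# events: (timestamp, src, dst, kind) with kind in {'SYN', 'DATA', 'FIN'}.

def summarise_flows(events):
    active, summaries = {}, []
    for ts, src, dst, kind in events:
        key = (src, dst)
        if kind == 'SYN':                       # flow formation
            active[key] = {'start': ts, 'packets': 0}
        elif key in active and kind == 'DATA':  # packet within the flow
            active[key]['packets'] += 1
        elif key in active and kind == 'FIN':   # flow termination: emit summary
            flow = active.pop(key)
            summaries.append({'src': src, 'dst': dst,
                              'duration': ts - flow['start'],
                              'packets': flow['packets']})
    return summaries

events = [(0, 'a', 'b', 'SYN'), (1, 'a', 'b', 'DATA'),
          (2, 'a', 'b', 'DATA'), (9, 'a', 'b', 'FIN')]
print(summarise_flows(events))
# -> [{'src': 'a', 'dst': 'b', 'duration': 9, 'packets': 2}]
```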
For each IP address, the overall connection time and the number of packets sent and received during a connection were
recorded. Due to the volume of the IP records and their characteristics, the IP data was split into two groups: IP source and IP destination, denoted hereafter as IP-s and IP-d, respectively. During the 3-day-long period, 9079 IP-s and 4938 IP-d were gathered in total. However, only the IPs with the highest number of records were used for the purposes of this research, to minimise statistical errors. For example, the data that was collected for a given IP-s is presented in Table 1.
ARIMA models of the series were selected on the basis of an examination of autocorrelation and partial autocorrelation
functions (ACF and PACF, respectively) using standard techniques [21].
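For reference, the sample ACF can be computed directly from its definition (the PACF is then derived from these values); this is a generic sketch, not the authors' tooling. The trending series below produces the slowly decaying ACF that is the classic sign of non-stationarity.

```python
# Sample autocorrelation function (ACF): the correlation of a series
# with itself at increasing lags, a standard model-identification tool.

def acf(x, max_lag):
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n      # lag-0 autocovariance
    return [sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / (n * c0)
            for h in range(max_lag + 1)]

# A strongly trending series: the ACF decays very slowly -> non-stationary
trend = list(range(100))
r = acf(trend, 3)
print(all(v > 0.9 for v in r[1:]))   # -> True
```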
The effect of the inclusion of IP data in the time series models was tested using the significance levels of each variable and the effect on the Bayesian information criterion (BIC) and Akaike information criterion (AIC) values for the model, using backward stepwise techniques.
An ARIMA model consists of a forecasting equation which may include previous lags in the series, or ‘autoregressive’
terms, and lags of the forecast errors, or ‘moving average’ terms. A time series which needs to be differenced to be made
stationary is said to be an ‘integrated’ version of a stationary series [22]. The notation used to describe an ARIMA model
is of the form ðp; d; qÞ ðP; D; Q Þs . The first set of parentheses represents the non-seasonal part of the model, where p the
order of an autoregressive process, d the order of differencing and q the order of a moving average process. The second
set of parentheses represents the seasonal component, where P is the order of a seasonal autoregressive process, D the order
of seasonal differencing, Q the order of a seasonal moving average process and s the length of the seasonal period [23,21].
The ACF measures the correlation between values at each point in a series and values at lags prior to that point [22,23,21].
This information is further used to calculate the PACF, the correlation remaining between each point and lag in the series
after the influences of all closer lags have been removed. The BIC is calculated as $\mathrm{BIC} = -2\ln(L) + \ln(n)\,k$, where $L$ is the likelihood function based on the residuals from the model, $n$ is the number of residuals and $k$ is the number of free parameters [24,25]. This value therefore takes into account both the fit of the model and its parsimony, and should be as low as possible.
Table 1
An example based on the data gathered during the sampling period (for a given IP address).
The AIC is a measure of the relative goodness of fit of a statistical model and in the general case is calculated as $\mathrm{AIC} = 2k - 2\ln(L)$ (the parameters $k$ and $L$ are defined above).
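A sketch of how both criteria can be computed under a Gaussian likelihood, the usual assumption for ARMA residuals; the residual values below are made up purely for illustration.

```python
import math

def gaussian_aic_bic(residuals, k):
    """AIC = 2k - 2 ln L and BIC = -2 ln L + k ln n, with L the Gaussian
    likelihood of the model residuals (sigma^2 estimated by ML)."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    log_l = -0.5 * n * (math.log(2 * math.pi) + math.log(sigma2) + 1)
    return 2 * k - 2 * log_l, k * math.log(n) - 2 * log_l

res = [0.5, -1.2, 0.3, 0.8, -0.7, 1.1, -0.4, 0.2, -0.9, 0.6]
aic, bic = gaussian_aic_bic(res, k=2)
print(bic > aic)   # -> True: for n >= 8, ln(n) > 2 penalises BIC more
```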
Assuming the time series $Y_t$, $t = 1, \ldots, n$, we use a parametric approach in order to fit a model. The auto-regressive moving-average (ARMA) models are expressed as follows:

$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$

or equivalently:

$$\Phi(B)\, y_t = \Theta(B)\, a_t$$

where $\Phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$, $\Theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, $p$ and $q$ are the orders of the auto-regressive and moving-average processes respectively, and $a_t$ is the stationary random component (the residuals). These models are also denoted ARMA$(p, q)$.
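An ARMA process of this form can be simulated directly from its recursion; the sketch below uses illustrative parameters and assumes Gaussian innovations.

```python
import random

def simulate_arma11(phi, theta, n, seed=0):
    """Simulate y_t = phi*y_{t-1} + a_t - theta*a_{t-1}, a_t ~ N(0, 1)."""
    rng = random.Random(seed)
    y, y_prev, a_prev = [], 0.0, 0.0
    for _ in range(n):
        a = rng.gauss(0, 1)
        y_t = phi * y_prev + a - theta * a_prev
        y.append(y_t)
        y_prev, a_prev = y_t, a
    return y

series = simulate_arma11(phi=0.5, theta=0.3, n=1000)
print(len(series))   # -> 1000
```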
The basic assumption in estimating the ARMA coefficients is that the data are stationary, that is, that neither trend nor seasonality affects the variance. This also means that the data should have a constant mean, variance and autocorrelation through time, which, in general, does not hold. Thus, the auto-regressive integrated moving-average (ARIMA) models are expressed as:

$$\Phi(B)\,(1 - B)^d\, y_t = \Theta(B)\, a_t$$

where $d$ is the order of differencing.
For each IP address, the series of packets that were sent and received (denoted hereafter as Ps and Pr respectively) were fitted to an ARMA/ARIMA model. The nature of those series allows us to assume that there is no correlation among the data. Fig. 1 presents the periodogram of a Ps series, wherein no seasonality exists. The periodogram of the Pr series exhibits similar behaviour.

The same results were also obtained from the other Ps and Pr series, for the other IP addresses. These series turned out to be uncorrelated (white noise time series), a fact that was least expected considering that these records were gathered in non-continuous time intervals. Thus, we study the Ps and Pr series in smaller parts of n = 1000 elements, denoted as Xi, i = 1, ..., n. As far as time series are concerned, a group of 1000 elements is widely accepted as a fairly large data set.
Fitting an ARMA/ARIMA model to a time series initially requires examining the autocorrelation and partial autocorrelation plots, a commonly used tool for model identification in ARMA/ARIMA models, in order to detect whether the data is stationary or not. Partial autocorrelation plots are also useful in identifying the order p of an autoregressive model, and autocorrelation plots in identifying the order q of a moving average model. The autocorrelation plots of Xi showed that neither seasonality nor trend is present in the series; however, there were general signs of non-stationarity in the data. Using first-order differences, the most common and best-fitting models for the Ps and Pr series are the ARIMA(1, 1, 1) model

$$y_t - (1 + \phi)\,y_{t-1} + \phi\,y_{t-2} = a_t - \theta\,a_{t-1}, \quad \phi = 0.676,\ \theta = 0.751$$

and the ARIMA(0, 1, 1) model

$$y_t - y_{t-1} = a_t - \theta\,a_{t-1}, \quad \theta = 0.564.$$

It is worth pointing out that several models fitted the same data set well, and in order to choose the ones that gave the best fit, Akaike's information criterion (AIC) [26] and the Bayesian information criterion (BIC), or Schwarz criterion (SBIC) [24], were employed.
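Traffic sharing the statistical behaviour of such a fitted model can be generated directly from its difference equation, which is the basis of the simulations discussed later; this sketch assumes Gaussian innovations and is not the authors' Matlab code.

```python
import random

def simulate_arima111(phi, theta, n, seed=1):
    """Simulate an ARIMA(1,1,1) process via its difference equation:
    y_t - (1 + phi)*y_{t-1} + phi*y_{t-2} = a_t - theta*a_{t-1}."""
    rng = random.Random(seed)
    y = [0.0, 0.0]       # two start-up values for the second-order recursion
    a_prev = 0.0
    for _ in range(n):
        a = rng.gauss(0, 1)
        y.append((1 + phi) * y[-1] - phi * y[-2] + a - theta * a_prev)
        a_prev = a
    return y[2:]

# Parameters fitted to the Ps series in the text
traffic = simulate_arima111(phi=0.676, theta=0.751, n=1000)
print(len(traffic))   # -> 1000
```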
Table 2
Fitted ARIMA models for the first four one-thousand-element parts of the Ps series.
Table 3
Fitted ARIMA models for the first four one-thousand-element parts of the Pr series.
As can be seen from Table 2, it is quite possible that more than one model may be fitted to a given data subset. In particular, for the first 1000 elements, both the ARIMA(0, 1, 1) and ARIMA(1, 1, 1) models fitted quite well.

Similar ARIMA models were also fitted to the Pr series of the given IP address. Table 3 gives the fitted models and their parameter estimates for consecutive one-thousand-element parts of the Pr series.
The same procedure was followed for the Ps and Pr series of the other IP addresses. According to the results presented above, the general conclusion is that fitting an ARMA(p, q) or an ARIMA(p, d, q) model, with the orders of both the auto-regressive and moving-average processes around 1, is the best option for this kind of series. It should also be noted that this procedure is more suitable when the series is split into smaller parts and each of the individual parts is processed, rather than processing the whole series at once.
4. Determining the sampling parameters
Packet loss is measured as the ratio of the packets lost to the overall number of packets within established connections, and is denoted as:

$$k = \frac{\text{Packets Lost}}{\text{Overall Packets}}$$
As has been mentioned in Section 3.1, sampling was performed on three distinct, non-successive working days, by taking readings every 10 s. Fig. 2 presents the packet loss ratio (k) on these specific days. It is worth emphasising that packet loss is evident in Fig. 2a for t ≈ 5000 and at two other instances for t ≈ 6000 and t ≈ 7500. By examining Fig. 2b, one can see that there was only one instance of minor packet loss, for t ≈ 8000. In Fig. 2c, severe packet loss was recorded for t ≈ 5500 and t ≈ 6000, which also lasted for a rather long time interval.

Considering the actual performance of the network on these dates, noticeable packet loss was observed at given instances for k ≥ 0.05. The goal was to establish the shortest possible sampling interval (w), together with the smallest possible value of k, that could signal with confidence the occurrence of packet loss in the network.
Various scenarios were examined, in which the ratio k was correlated to the use of a sliding data window w. The factors involved were varied as follows: the ratio k takes the values k = 0.01, 0.02, ..., 0.10 and the sliding data window w = 200, 300, ..., 900, counted in seconds. The combinations of all the values of k and w give a total of 10 × 8 = 80 cases.
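The exhaustive grid over (k, w) can be sketched as follows; the loss series here is synthetic, and the counting rule (contiguous exceedances count as one interval) is an assumption about how the intervals were tallied.

```python
def grid_counts(loss_ratio, ks, ws, dt=10):
    """For every (k, w) combination, count the distinct intervals whose
    windowed mean loss ratio reaches k (w in seconds, dt seconds/sample)."""
    table = {}
    for k in ks:
        for w in ws:
            span = w // dt                       # window length in samples
            flagged, in_event = 0, False
            for t in range(len(loss_ratio) - span + 1):
                if sum(loss_ratio[t:t + span]) / span >= k:
                    if not in_event:             # count contiguous runs once
                        flagged, in_event = flagged + 1, True
                else:
                    in_event = False
            table[(k, w)] = flagged
    return table

# One clean loss episode: every (k, w) combination should flag exactly one
series = [0.0] * 500 + [0.08] * 90 + [0.0] * 500
tab = grid_counts(series, ks=[0.01, 0.05], ws=[200, 600])
print(tab)
```

Repeating this over Monte-Carlo realisations of the fitted traffic models yields the counts reported in Table 4.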
By examining the simulation results for the first two dates, in Table 4a and b respectively, it is clear that for k ≥ 0.04 or k ≥ 0.05 and w ≥ 600 s, only one time period was indicated as exhibiting packet loss, a result that is in agreement with the actual network performance on these two dates. In addition, from Table 4c, where the simulation results of the third day are given, it is clear that for k ≤ 0.04 and w ≤ 600 it is also possible for a network loss to be identified. However, on this specific day extremely high packet loss was recorded, as emerges from the obtained data (see also Fig. 2c), and hence this result can be considered reasonable.
Fig. 2. Ratio k at time t, for the three given dates that data collection was performed.
Overall, the results indicate that in order to identify a network loss using the ratio of lost packets to the overall number of packets across the connections of a network, the sampling interval must be at least 600 s long and the ratio k must be greater than 0.05.
Table 4
Number of time intervals with a possible network loss as a function of the ratio k and the sliding data window w. The values in bold indicate the simulation results that were in agreement with the actual network performance on the given dates.

w (seconds)
200 300 400 500 600 700 800 900
(a)
k 0.01 104 63 38 27 22 17 14 13
0.02 71 41 25 17 14 12 10 9
0.03 51 24 15 9 6 5 4 3
0.04 34 17 9 6 3 3 3 3
0.05 24 12 5 2 1 1 1 1
0.06 19 8 4 2 0 0 0 0
0.07 14 6 4 2 0 0 0 0
0.08 11 5 4 2 0 0 0 0
0.09 9 3 2 1 0 0 0 0
0.10 2 2 0 0 0 0 0 0
(b)
k 0.01 43 24 15 12 9 9 6 5
0.02 33 19 12 9 6 4 4 3
0.03 21 7 5 4 2 2 2 1
0.04 13 5 3 3 1 1 1 1
0.05 11 5 3 2 1 1 1 1
0.06 6 4 2 2 1 1 1 1
0.07 5 2 1 0 0 0 0 0
0.08 4 1 1 0 0 0 0 0
0.09 2 0 0 0 0 0 0 0
0.10 2 0 0 0 0 0 0 0
(c)
k 0.01 10 3 2 2 1 1 1 1
0.02 7 3 2 2 1 1 1 1
0.03 6 3 2 2 1 1 1 1
0.04 6 3 2 2 1 1 1 1
0.05 6 3 2 2 1 1 1 1
0.06 5 3 2 2 1 1 1 1
0.07 5 3 2 2 1 1 1 1
0.08 5 3 2 2 1 1 1 1
0.09 5 3 2 2 1 1 1 1
0.10 5 3 2 2 1 1 1 1

5. Conclusions

In this paper it has been shown that it is possible to analyse massive network data and signal the occurrence of packet loss in real time, without requiring any kind of equipment to be installed in advance at the network node where monitoring takes place. However, some 'training' of the proposed system is required, so that it adjusts itself to the characteristics of the traffic to be monitored. This 'training' effectively involves representing the high-speed traffic using suitable time series models. Then, by using these models, traffic sharing similar properties can be simulated and useful conclusions can be derived, such as how the sampling should be performed in order to obtain exploitable data.
Future research could involve determining whether this scheme can be fully automated, namely, whether the system would be able to adjust to changes in the characteristics of the monitored traffic flows. This is both a matter of being able to automatically select appropriate time series models and a matter of determining the least amount of network data that needs to be analysed or sampled. Moreover, since time series are used, it is worth investigating whether it may be possible to achieve some predictability for possible network errors in the future. In addition, the scheme could be tried in different contexts, such as P2P-originated network traffic. Finally, it could also be determined whether such an approach can be used for real-time intrusion detection in high-speed networks and whether this scheme could be used for detecting specific attacks or classes of attacks.
References
[1] C.-R. Lin, M.-S. Chen, Combining partitional and hierarchical algorithms for robust and efficient data clustering with cohesion self-merging, IEEE
Transactions on Knowledge and Data Engineering 17 (2) (2005) 145–159.
[2] A. Lakhina, M. Crovella, C. Diot, Diagnosing network-wide traffic anomalies, ACM SIGCOMM – Computer Communication Review 34 (4) (2004) 219–
230.
[3] B. Babcock, S. Babu, M. Datar, R. Motwani, J. Widom, Models and issues in data stream systems, in: 21st ACM SIGMOD International Conference on
Management of Data, ACM, Madison, Wisconsin, 2002, pp. 1–16.
[4] C. Morariu, B. Stiller, DiCAP: distributed packet capturing architecture for high-speed network links, in: 33rd IEEE Conference on Local Computer
Networks (LCN) 2008, Montreal, Que, 2008, pp. 168–175.
[5] S. Savage, Sting: a TCP-based network measurement tool, in: USENIX Symposium on Internet Technologies and Systems, USITS, vol. 2, Boulder, Colorado, 1999, p. 7.
[6] P. Benko, A. Veres, A passive method for estimating end-to-end TCP packet loss, in: IEEE Globecom, 2002, pp. 2609–2613.
[7] S. Ohta, T. Miyazaki, Passive packet loss monitoring that employs the hash-based identification technique, in: 9th IFIP/IEEE International Symposium
on Integrated Network Management (IM), Nice, France, poster Program (Session 2), 15–19 May 2005.
[8] A. Friedl, S. Ubik, A. Kapravelos, M. Polychronakis, E.P. Markatos, Realistic passive packet loss measurement for high-speed networks, in: M.
Papadopouli, P. Owezarski, A. Pras (Eds.), Traffic Monitoring and Analysis, Lecture Notes in Computer Science, Aachen, Germany, 2009, pp. 1–7.
[9] J. Li, L. Shen, Y. Tong, Prediction of network flow based on wavelet analysis and ARIMA model, in: International Conference on Wireless Networks and Information Systems – WNIS '09, Shanghai, China, 2009, pp. 217–220.
[10] V. Paxson, S. Floyd, Wide area traffic: the failure of poisson modeling, IEEE/ACM Transactions on Networking 3 (3) (1995) 226–244.
[11] W. Willinger, V. Paxson, Where mathematics meets the internet, Notices of the American Mathematical Society 45 (8) (1998) 961–970.
[12] W. Willinger, M.S. Taqqu, R. Sherman, D.V. Wilson, Self-similarity through high variability: statistical analysis of ethernet LAN traffic at the source
level, IEEE/ACM Transactions on Networking 5 (1) (1997) 75–86.
[13] S. Haykin, Neural Networks, Prentice Hall International Inc., 1999.
[14] K. Kaikhah, S. Doddameti, Discovering trends in large datasets using neural networks, Applied Intelligence 24 (1) (2006) 51–60.
[15] B. Liang, J. Austin, A neural network for mining large volumes of time series data, in: IEEE International Conference on Industrial Technology (ICIT
2005), City University of Hong Kong, Hong Kong, 2005, pp. 688–693.
[16] Y. Wang, I. Kim, Bootstrap-based simple probability model for classifying network traffic and detecting network intrusion, Security Journal 21 (4)
(2008) 278–290.
[17] K. Papagiannaki, N. Taft, Z.-L. Zhang, C. Diot, Long-term forecasting of internet backbone traffic: observations and initial models, in: Twenty-Second
Annual Joint Conference of the IEEE Computer and Communications, vol. 2, 2003, pp. 1178–1188.
[18] D.L. Jagerman, B. Melamed, W. Willinger, Stochastic modeling of traffic processes, in: Frontiers in Queueing: Models, Methods and Problems, CRC Press,
1996, pp. 271–370.
[19] M. Stemm, S. Seshan, R.H. Katz, A network measurement architecture for adaptive applications, INFOCOM 2000. Nineteenth Annual Joint Conference of
the IEEE Computer and Communications Societies, vol. 1, IEEE, Tel Aviv, Israel, 2000, pp. 285–294.
[20] D.A. Vivanco, A.P. Jayasumana, A measurement-based modeling approach for network-induced packet delay, in: 32nd IEEE Conference on Local
Computer Networks – LCN 2007, Dublin, 2007, pp. 175–182.
[21] C. Chatfield, The Analysis of Time Series, 5th ed., Chapman & Hall, New York, NY, 1997.
[22] G.E.P. Box, G.M. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco, CA, 1976.
[23] P.J. Diggle, Time Series: A Biostatistical Introduction, Oxford University Press, Oxford, 1990.
[24] G.E. Schwarz, Estimating the Dimension of a Model, Annals of Statistics 6 (2) (1978) 461–464.
[25] W.S. Wei, Time Series Analysis: Univariate and Multivariate Methods, Addison-Wesley, Harlow, 1990.
[26] H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control 19 (6) (1974) 716–723.