ERAN Capacity Monitoring Guide
Issue: Draft A
Date: 2014-1-20
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer.
All or part of the products, services and features described in this document may not be within the purchase scope or
the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this
document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or
implied.
The information in this document is subject to change without notice. Every effort has been made in the preparation
of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this
document do not constitute a warranty of any kind, express or implied.
Website: http://www.huawei.com
Email: support@huawei.com
For definitions of the man-machine language (MML) commands, parameters, alarms, and
performance counters mentioned in this document, see the "Operation and Maintenance" part in
3900 Series LTE eNodeB Product Documentation for eNodeB base station, BTS3202E Product
Documentation for BTS3202E base station, and BTS3203E LTE Product Documentation for
BTS3203E base station.
For the BTS3202E and the BTS3203E LTE, the main control unit, transmission unit, and baseband
unit share the CPU because they are integrated into the same board, called BTS3202E board or
BTS3203E LTE board. The main control board and the baseband board mentioned in this document
correspond to the BTS3202E board or BTS3203E LTE board, and the CPU usage of the main
control board corresponds to that of the BTS3202E board or BTS3203E LTE board.
This document is not applicable to scenarios with large capacity and heavy traffic. For guidelines in
such scenarios, contact Huawei technical support.
Product Versions
The following table lists the product versions related to this document.
Product Name
Product Version
DBS3900
V100R009C00
BTS3900
BTS3900A
eNodeB: V100R007C00
BTS3900L
BTS3900AL
BTS3202E
BTS3203E
Intended Audience
This document is intended for:
Field engineers
Change History
This section describes changes in each issue of this document.
Draft A (2014-1-20)
This is the first draft.
Contents
About This Document
1 Overview
  1.1 Network Resources
  1.2 Capacity Monitoring Methods
2 Capacity Monitoring
  2.1 Introduction
  2.2 Downlink User Perception
    2.2.1 Monitoring Principles
    2.2.2 Monitoring Methods
    2.2.3 Suggested Measures
  2.3 PRACH Resource Usage
    2.3.1 Monitoring Principles
    2.3.2 Monitoring Methods
    2.3.3 Suggested Measures
  2.4 PDCCH Resource Usage
    2.4.1 Monitoring Principles
    2.4.2 Monitoring Methods
    2.4.3 Suggested Measures
  2.5 Connected User License Usage
    2.5.1 Monitoring Principles
    2.5.2 Monitoring Methods
    2.5.3 Suggested Measures
  2.6 Paging Resource Usage
    2.6.1 Monitoring Principles
    2.6.2 Monitoring Methods
    2.6.3 Suggested Measures
  2.7 Main-Control-Board CPU Usage
    2.7.1 Monitoring Principles
    2.7.2 Monitoring Methods
    2.7.3 Suggested Measures
  2.8 LBBP CPU Usage
    2.8.1 Monitoring Principles
1 Overview
This chapter describes the types of network resources to be monitored and the method of
performing capacity monitoring.
Table 1-1 describes the types of network resources to be monitored and impacts of resource
insufficiency on the system.
Resource Type | Resource | Meaning | Impact of Resource Insufficiency on the System | Monitoring Item
Cell resources | Physical resource blocks (PRBs) | Bandwidth consumed on the air interface | | Downlink User Perception
Cell resources | Physical random access channel (PRACH) resources | Random access preambles carried on the PRACH | | PRACH Resource Usage
Cell resources | Physical downlink control channel (PDCCH) resources | Downlink control channel resources | Uplink and downlink scheduling delays are prolonged, and user experience is affected. | PDCCH Resource Usage
eNodeB resources | Connected user license | Maximum permissible number of users in RRC_CONNECTED mode | New services cannot be admitted, and experience of admitted users is affected. | Connected User License Usage
eNodeB resources | Paging resources | eNodeB paging capacity | Paging messages may be lost, affecting user experience. | Paging Resource Usage
eNodeB resources | Main-control-board CPU | Processing capability of the main control board of the eNodeB | KPIs deteriorate. | Main-Control-Board CPU Usage
eNodeB resources | LTE baseband process unit (LBBP) CPU | Processing capability of the LBBP board | KPIs deteriorate. | LBBP CPU Usage
eNodeB resources | Transport resource groups | eNodeB logical transport resources | Packets may be lost, affecting user experience. | Transport Resource Group Usage
eNodeB resources | Ethernet ports | eNodeB physical transport resources | Packets may be lost, affecting user experience. |
Daily monitoring for prediction: Counters are used to indicate the load or usage of
various types of resources on the LTE network. Thresholds for resource consumption are
specified so that preventive measures such as reconfiguration and expansion can be taken
to prevent network congestion when the consumption of a type of resource continually
exceeds the threshold. For details, see chapter 2 "Capacity Monitoring."
Thresholds defined for capacity monitoring in this document are generally lower than those for
alarm triggering so that risks of resource insufficiency can be detected as early as possible.
Thresholds given in this document apply to networks experiencing steady growth. Thresholds are
determined based on experience. For example, the connected user license usage threshold of 60% is
specified based on a peak-to-average ratio of about 1.5:1. When the average usage reaches 60%,
the peak usage is about 90% and approaches 100%. Both average and peak values are considered
when thresholds are determined.
Telecom operators can define thresholds based on the actual situation.
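The peak-to-average reasoning above can be checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and the 1.5:1 ratio is the approximate value quoted above rather than a measured one.

```python
def peak_usage(average_usage: float, peak_to_average_ratio: float = 1.5) -> float:
    """Estimate peak usage (fraction of capacity) from the average usage."""
    return average_usage * peak_to_average_ratio


def highest_safe_average(peak_to_average_ratio: float = 1.5) -> float:
    """Average usage at which the estimated peak would reach 100%."""
    return 1.0 / peak_to_average_ratio


if __name__ == "__main__":
    print(f"estimated peak at 60% average: {peak_usage(0.60):.0%}")
    print(f"average giving a 100% peak: {highest_safe_average():.1%}")
```

With a ratio of 1.5, a 60% average corresponds to an estimated peak of about 90%, which leaves some margin below the roughly 67% average at which the estimated peak would reach 100%. Operators with a different measured ratio can substitute their own value.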
Telecom operators are encouraged to formulate an optimization solution for resource capacity
based on prediction and analysis for networks that are experiencing fast development, scheduled to
deploy new services, or about to employ new charging plans. If you require services related to
resource capacity optimization, such as prediction, evaluation, optimization, reconfiguration, and
capacity expansion, contact Huawei technical support.
2 Capacity Monitoring
This chapter describes monitoring principles and methods, as well as related counters, of all
types of service resources. Information about how to locate resource bottlenecks and the
related handling suggestions are also provided.
Note that resource insufficiency may be determined by usage of more than one type of service
resource. For example, a resource bottleneck can be claimed only when both connected user
license usage and main-control-board CPU usage exceed the predefined thresholds.
2.1 Introduction
You need to determine busy hours of the system for accurate monitoring of counters. You are advised to
define busy hours as a period when the system or a cell is undergoing the maximum resource
consumption of a day.
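As a rough illustration of this definition, the following Python sketch picks the busy hour from one day of hourly usage samples. The input layout (hour of day mapped to a usage value) and the function name are assumptions made for this example; how the samples are exported from the counter system is not covered here.

```python
from typing import Mapping


def busy_hour(hourly_usage: Mapping[int, float]) -> int:
    """Return the hour (0-23) with the highest resource usage in one day.

    hourly_usage maps an hour of the day to the usage measured in that hour.
    If several hours share the maximum, the first one encountered is returned.
    """
    if not hourly_usage:
        raise ValueError("no hourly samples supplied")
    return max(hourly_usage, key=lambda hour: hourly_usage[hour])


if __name__ == "__main__":
    samples = {9: 0.42, 12: 0.55, 20: 0.81, 21: 0.77}
    print("busy hour:", busy_hour(samples))
```

Counters evaluated against the thresholds in this chapter would then be taken from the hour returned here.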
Table 2-1 describes types of resources to be monitored, thresholds, and handling suggestions.
Table 2-1 Types of resources to be monitored, thresholds, and handling suggestions
Resource Type | Monitoring Item | Conditions | Handling Suggestions
Cell resources | Downlink User Perception | | Add carriers or eNodeBs.
Cell resources | PRACH Resource Usage | Usage of preambles for contention-based access ≥ 75% |
Cell resources | PRACH Resource Usage | Usage of preambles for non-contention-based access ≥ 75% |
Cell resources | PDCCH Resource Usage | CCE usage ≥ 80% and uplink or downlink PRB usage < 90% |
Cell resources | PDCCH Resource Usage | CCE usage ≥ 80% and uplink or downlink PRB usage ≥ 90% | No handling is required.
eNodeB resources | Connected User License Usage | Connected user license usage ≥ 60% and main-control-board CPU usage < 60% | Add licenses.
eNodeB resources | Connected User License Usage | Connected user license usage ≥ 60% and main-control-board CPU usage ≥ 60% | Add eNodeBs.
eNodeB resources | Main-Control-Board CPU Usage | |
eNodeB resources | Transport Resource Group Usage | |
where
L.Thrp.bits.DL indicates the total throughput of downlink data transmitted at the PDCP
layer in a cell.
L.Thrp.Time.DL indicates the duration for transmitting downlink data at the PDCP layer
in a cell.
where
L.RA.GrpA.Att indicates the number of times that random preambles in group A are
received.
L.RA.GrpB.Att indicates the number of times that random preambles in group B are
received.
L.RA.Dedicate.Att indicates the number of times that dedicated preambles are received.
If the system bandwidth is 5 MHz or 10 MHz and the PRACH resource adjustment
algorithm is disabled, N is 50.
If the system bandwidth is 5 MHz or 10 MHz and the PRACH resource adjustment
algorithm is enabled, N is 100.
To check whether the PRACH resource adjustment algorithm is enabled, run the LST
CELLALGOSWITCH command to query the value of RachAlgoSwitch.
If the random preamble usage reaches or exceeds 75% for X days (three days by default)
in a week, enable the adaptive backoff function by running the following command to
help reduce the peak RACH load and average access delay:
MOD CELLALGOSWITCH: LocalCellId=x, RachAlgoSwitch=BackOffSwitch-1;
If the system bandwidth is 5 MHz or 10 MHz, it is good practice to enable the PRACH
resource adjustment algorithm by running the following command:
MOD CELLALGOSWITCH: LocalCellId=x,RachAlgoSwitch=RachAdjSwitch-1;
If the dedicated preamble usage reaches or exceeds 75% for X days (three days by
default) in a week, enable the PRACH resource adjustment algorithm and reuse of
dedicated preambles between UEs by running the following command:
MOD CELLALGOSWITCH: LocalCellId=x,RachAlgoSwitch=RachAdjSwitch-1,RachAlgoSwitch=MaksIdxSwitch-1;
This helps reduce the probability of UEs initiating contention-based random access in the
case of dedicated preamble insufficiency and therefore helps reduce the access delay.
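The "X days in a week" rule used for both preamble types can be sketched as follows. This is a minimal Python example under stated assumptions: the function names are hypothetical, the daily usage values are supplied by the caller as fractions (one busy-hour value per day), and the defaults mirror the 75% threshold and three-day default quoted above.

```python
from typing import Sequence


def days_at_or_above(daily_usage: Sequence[float], threshold: float) -> int:
    """Count the days whose usage reaches or exceeds the threshold."""
    return sum(1 for usage in daily_usage if usage >= threshold)


def preamble_rule_triggered(
    daily_usage: Sequence[float],
    threshold: float = 0.75,
    min_days: int = 3,
) -> bool:
    """Return True if usage reached the threshold on at least min_days days."""
    return days_at_or_above(daily_usage, threshold) >= min_days


if __name__ == "__main__":
    week = [0.50, 0.80, 0.75, 0.90, 0.10, 0.20, 0.30]
    print("measure needed:", preamble_rule_triggered(week))
```

The same check applies to random preamble usage and dedicated preamble usage; only the suggested measure differs.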
If PDCCH symbols are excessive, which indicates that the usage of PDCCH CCEs is low,
the resources that can be used by the PDSCH decrease. This also results in low
spectral efficiency.
If the value of PDCCH Symbol Number Adjust Switch is On, you do not need to monitor PDCCH
resource usage. The reason is that the eNodeB automatically adjusts the number of PDCCH symbols
based on the CCE load to meet the CCE requirement while preventing excessive PDSCH resource
consumption. You can run the LST CELLPDCCHALGO command to query the setting of PDCCH
Symbol Number Adjust Switch.
If the value of PDCCH Symbol Number Adjust Switch is Off, turn on the switch by
running the following command:
MOD CELLPDCCHALGO: LocalCellId=x, PdcchSymNumSwitch=ON;
If the uplink or downlink PRB usage reaches or exceeds 90%, no handling is required.
For details about uplink or downlink PRB usage, see section 2.2 "Downlink User Perception".
The licensed number of connected users can be queried by running the following
command:
DSP LICENSE: FUNCTIONTYPE=eNodeB;
In the command output, the value of LLT1ACTU01 in the Allocated column is the
licensed number of connected users.
If the main-control-board CPU usage is less than 60%, increase the licensed limit.
For details about main-control-board CPU usage, see section 2.7 "Main-Control-Board CPU
Usage."
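The relationship between connected user license usage and main-control-board CPU usage can be written as a small decision function. The sketch below is illustrative only: the function name and return strings are hypothetical, both usages are passed in as fractions computed elsewhere, and returning "no action" below the license threshold is an assumption of this example.

```python
def license_measure(
    license_usage: float,
    cpu_usage: float,
    license_threshold: float = 0.60,
    cpu_threshold: float = 0.60,
) -> str:
    """Suggest a measure from license usage and main-control-board CPU usage."""
    if license_usage < license_threshold:
        # Assumption: below the monitoring threshold nothing needs to be done.
        return "no action"
    if cpu_usage < cpu_threshold:
        # The main control board still has headroom, so more users can be licensed.
        return "add licenses"
    # Both the license and the CPU are at or above their thresholds.
    return "add eNodeBs"


if __name__ == "__main__":
    print(license_measure(license_usage=0.72, cpu_usage=0.45))
```

Checking the CPU before raising the licensed limit avoids admitting more users onto a main control board that is already heavily loaded.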
L.Paging.Dis.Num
where
L.Paging.S1.Rx indicates the number of paging messages received over the S1 interface.
The percentage of paging messages received by the eNodeB over the S1 interface
reaches or exceeds 60%.
1500 or more paging messages from the mobility management entity (MME) to UEs are
discarded in a day.
If the MCS measurement and initial-transmission failure measurement indicate that the
channel quality is poor, KPI deterioration may not be caused by main-control-board CPU
overload but by deterioration in channel quality.
If the KPIs deteriorate and the main-control-board CPU usage exceeds a preconfigured
threshold, you are advised to perform capacity expansion according to section 2.7.3
"Suggested Measures."
VS.Board.CPUload.Mean
where
The percentage of times that the main-control-board CPU usage reaches or exceeds 85%
is greater than or equal to 5%.
When the main-control-board CPU is overloaded, you are advised to add an eNodeB and
connect it to the evolved packet core (EPC) through a new S1 interface.
VS.Board.CPUload.Mean
Percentage of times that the LBBP CPU usage reaches or exceeds a preconfigured
threshold (85%) = VS.Board.CPULoad.CumulativeHighloadCount/3600 x 100%
where
The percentage of times that the LBBP CPU usage reaches or exceeds 85% is greater
than or equal to 5%.
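The high-load percentage formula and the 5% condition above can be sketched in Python as follows. The function names are hypothetical, and the divisor of 3600 is taken directly from the formula; treating it as the number of samples in one measurement period is an assumption of this example.

```python
def high_load_percentage(cumulative_highload_count: int, samples_per_period: int = 3600) -> float:
    """Percentage of samples in which CPU usage reached or exceeded the threshold.

    cumulative_highload_count is the value of
    VS.Board.CPULoad.CumulativeHighloadCount for the measurement period.
    """
    if samples_per_period <= 0:
        raise ValueError("samples_per_period must be positive")
    return cumulative_highload_count * 100.0 / samples_per_period


def high_load_condition_met(cumulative_highload_count: int, limit_percent: float = 5.0) -> bool:
    """Return True if the high-load percentage is greater than or equal to the limit."""
    return high_load_percentage(cumulative_highload_count) >= limit_percent


if __name__ == "__main__":
    count = 180  # example counter value, not a measurement
    print(f"{high_load_percentage(count):.1f}% high load, condition met: {high_load_condition_met(count)}")
```

A count of 180 out of 3600 corresponds to exactly 5%, which is the boundary at which the condition is met.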
When the LBBP CPU is overloaded, you are advised to perform capacity expansion on the
eNodeB user plane as follows:
Add an LBBP to share the network load, and then determine whether to move existing
cells or add new cells based on the number of UEs. The capacity expansion methods are
as follows:
If the radio resources are sufficient (that is, the usage of each type of radio resources
is lower than the threshold), move cells from the existing LBBP to the new LBBP.
If the radio resources are insufficient, set up new cells on the new LBBP.
If the eNodeB has multiple LBBPs and one of them is overloaded, move cells from the
overloaded LBBP to an LBBP with a lighter load.
LBBP load can be indicated by the following:
Percentage of times that the CPU usage reaches or exceeds a preconfigured threshold
If the eNodeB already has a maximum of six LBBPs and more LBBPs are required, add
an eNodeB.
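The expansion options above can be summarized as a decision function. This Python sketch is one possible reading of the text, not a definitive procedure: the function name, parameter names, and the order in which the checks are applied are assumptions of this example, and the inputs are judgments made by the engineer beforehand.

```python
def lbbp_expansion_measure(
    lbbp_count: int,
    lighter_lbbp_available: bool,
    radio_resources_sufficient: bool,
    max_lbbps: int = 6,
) -> str:
    """Suggest a user-plane expansion measure for an overloaded LBBP."""
    if lbbp_count > 1 and lighter_lbbp_available:
        # Rebalance first when another LBBP in the eNodeB has a lighter load.
        return "move cells from the overloaded LBBP to a lighter-loaded LBBP"
    if lbbp_count >= max_lbbps:
        # No slot left for another LBBP in this eNodeB.
        return "add an eNodeB"
    if radio_resources_sufficient:
        return "add an LBBP and move existing cells to it"
    return "add an LBBP and set up new cells on it"


if __name__ == "__main__":
    print(lbbp_expansion_measure(lbbp_count=2, lighter_lbbp_available=False,
                                 radio_resources_sufficient=True))
```

Rebalancing is tried before adding hardware in this sketch because it needs no new board; a site with different constraints may order the checks differently.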
where
The bandwidth configured for a transport resource group can be queried by running the
following command:
DSP RSCGRP: CN=x, SRN=x, SN=x, BEAR=xx, SBT=xxxx, PT=xxx;
In the command output, the value of Tx Bandwidth is the bandwidth configured for the
transport resource group.
The packet loss rate reaches or exceeds 0.05% for five days in a week.
The proportion of the average transmission rate to the configured bandwidth reaches or
exceeds 80% for five days in a week.
The proportion of the maximum transmission rate to the configured bandwidth reaches
or exceeds 90% for two days in a week.
When a transport resource group is congested, you are advised to expand the bandwidth of the
transport resource group. The following is an example command:
MOD RSCGRP: CN=x, SRN=x, SN=x, BEAR=IP, SBT=BASE_BOARD, PT=ETH, PN=x, RSCGRPID=x, RU=x,
TXBW=xxxx, RXBW=xxxx;
If the problem persists after the bandwidth adjustment, you are advised to expand the eNodeB
bandwidth.
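The three congestion conditions for a transport resource group can be evaluated per week as sketched below. This Python example is illustrative only: the function names and input layout are hypothetical, daily values are passed in as fractions (0.0005 for 0.05%), and because the text does not state how the conditions combine, each one is reported separately rather than merged into a single verdict.

```python
from typing import Dict, Sequence


def count_days(values: Sequence[float], threshold: float) -> int:
    """Count the days whose value reaches or exceeds the threshold."""
    return sum(1 for value in values if value >= threshold)


def transport_group_flags(
    daily_loss_rate: Sequence[float],
    daily_mean_utilization: Sequence[float],
    daily_max_utilization: Sequence[float],
) -> Dict[str, bool]:
    """Evaluate each weekly congestion condition for one transport resource group.

    Utilization means transmission rate divided by the configured bandwidth.
    """
    return {
        "packet_loss": count_days(daily_loss_rate, 0.0005) >= 5,
        "mean_rate": count_days(daily_mean_utilization, 0.80) >= 5,
        "max_rate": count_days(daily_max_utilization, 0.90) >= 2,
    }


if __name__ == "__main__":
    flags = transport_group_flags(
        daily_loss_rate=[0.001] * 5 + [0.0] * 2,
        daily_mean_utilization=[0.5] * 7,
        daily_max_utilization=[0.95, 0.95] + [0.5] * 5,
    )
    print(flags)
```

Reporting the flags individually lets the engineer see which condition triggered before deciding whether to adjust TXBW and RXBW.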
(Item 1) Proportion of the average uplink transmission rate to the allocated bandwidth =
VS.FEGE.TxMeanSpeed/Allocated bandwidth x 100%
(Item 2) Proportion of the maximum uplink transmission rate to the allocated bandwidth
= VS.FEGE.TxMaxSpeed/Allocated bandwidth x 100%
(Item 3) Proportion of the average downlink reception rate to the allocated bandwidth =
VS.FEGE.RxMeanSpeed/Allocated bandwidth x 100%
(Item 4) Proportion of the maximum downlink reception rate to the allocated bandwidth
= VS.FEGE.RxMaxSpeed/Allocated bandwidth x 100%
where
Allocated Bandwidth
Disable
UMPT
1 Gbit/s
LMPT
UMPT
LMPT
BTS3202E
board
BTS3203E LTE board
Enable
or
You can run the LST LR command to query the values of LR Switch, UL Committed
Information Rate (Kbit/s), and DL Committed Information Rate (Kbit/s).
The types of main control boards can be queried by running the following command:
DSP BRD: CN=x, SRN=x, SN=x;
In the command output, the value of Config Type is the type of the main control board.
The proportion of the average uplink transmission rate (or downlink reception rate) to the
allocated bandwidth reaches or exceeds 70% for at least five days in a week.
The allocated bandwidth is 750 Mbit/s by default. The actual allocated bandwidth can
be obtained from the operator.
The proportion of the maximum uplink transmission rate (or downlink reception rate) to
the allocated bandwidth reaches or exceeds 85% for at least two days in a week.
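The Ethernet port proportions and their weekly conditions can be sketched as follows. This Python example rests on stated assumptions: the function names are hypothetical, the rates are assumed to be expressed in the same unit as the allocated bandwidth (Mbit/s here), and the same function is applied separately to the uplink (Tx) and downlink (Rx) counters.

```python
from typing import Dict, Sequence


def proportion(rate: float, allocated_bandwidth: float) -> float:
    """Rate as a percentage of the allocated bandwidth (Items 1 to 4 above)."""
    if allocated_bandwidth <= 0:
        raise ValueError("allocated_bandwidth must be positive")
    return rate / allocated_bandwidth * 100.0


def port_flags(
    daily_mean_rate: Sequence[float],
    daily_max_rate: Sequence[float],
    allocated_bandwidth: float = 750.0,
) -> Dict[str, bool]:
    """Evaluate the weekly conditions for one direction of an Ethernet port."""
    mean_days = sum(1 for r in daily_mean_rate if proportion(r, allocated_bandwidth) >= 70.0)
    max_days = sum(1 for r in daily_max_rate if proportion(r, allocated_bandwidth) >= 85.0)
    return {"mean_rate": mean_days >= 5, "max_rate": max_days >= 2}


if __name__ == "__main__":
    print(port_flags(
        daily_mean_rate=[600.0] * 5 + [100.0] * 2,
        daily_max_rate=[700.0, 700.0] + [100.0] * 5,
    ))
```

The default of 750 Mbit/s matches the default allocated bandwidth stated above; pass the operator-provided value when it differs.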
This chapter describes how to identify resource allocation problems. Network abnormalities
can be found through KPI monitoring. If a KPI deteriorates, you can analyze the access
counters (RRC resource congestion rate and E-RAB resource congestion rate) to check
whether the deterioration is caused by resource congestion.
Description
L.RRC.ConnReq.Att
L.RRC.ConnReq.Succ
L.E-RAB.AttEst
L.E-RAB.SuccEst
L.E-RAB.AbnormRel
L.E-RAB.NormRel
If the RRC resource congestion rate is higher than 0.2%, KPI deterioration is caused by
resource congestion.
If the E-RAB resource congestion rate is higher than 0.2%, KPI deterioration is caused by
resource congestion.
The fault location procedure begins with the identification of abnormal KPIs, followed by
selecting the top N cells and performing a KPI analysis on them.
Cell congestion mainly results from insufficient system resources. Bottlenecks can be
detected by analyzing the access counters (RRC resource congestion rate and E-RAB resource
congestion rate).
4 Related Counters
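Resource Type | Counter Name | Description
PRBs | L.ChMeas.PRB.DL.Used.Avg |
PRBs | L.ChMeas.PRB.DL.Avail |
PRBs | L.Thrp.bits.DL | Total throughput of downlink data transmitted at the PDCP layer in a cell
PRBs | L.Thrp.Time.DL | Duration for transmitting downlink data at the PDCP layer in a cell
PRACH resources | L.RA.GrpA.Att | Number of times that random preambles in group A are received
PRACH resources | L.RA.GrpB.Att | Number of times that random preambles in group B are received
PRACH resources | L.RA.Dedicate.Att | Number of times that dedicated preambles are received
PDCCH resources | L.ChMeas.CCE.CommUsed |
PDCCH resources | L.ChMeas.CCE.ULUsed |
PDCCH resources | L.ChMeas.CCE.DLUsed |
PDCCH resources | L.ChMeas.CCE.Avail |
Connected user license | L.Traffic.User.Avg |
Paging resources | L.Paging.S1.Rx | Number of paging messages received over the S1 interface
Paging resources | L.Paging.Dis.Num |
Board CPU resources | VS.Board.CPUload.Mean |
Board CPU resources | VS.Board.CPULoad.CumulativeHighloadCount |
Transport resource groups | VS.RscGroup.TxPkts |
Transport resource groups | VS.RscGroup.TxDropPkts |
Transport resource groups | VS.RscGroup.TxMaxSpeed |
Transport resource groups | VS.RscGroup.TxMeanSpeed |
Ethernet ports | VS.FEGE.TxMaxSpeed |
Ethernet ports | VS.FEGE.TxMeanSpeed |
Ethernet ports | VS.FEGE.RxMaxSpeed |
Ethernet ports | VS.FEGE.RxMeanSpeed |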