Nemo Analyze Training
Custom KPI Workbench &
Custom SQL Queries
Custom queries and KPIs
SQL queries and KPI Workbench
[Diagram: the Nemo Analyze database exposes an ODBC interface (SQL queries); queries feed a 3rd-party query tool, Nemo Analyze visualisation (map, graph, etc.) directly, or the KPI Workbench, whose output is again visualised in Nemo Analyze (map, graph, etc.)]
Two ways of creating a custom KPI:
SQL query
KPI workbench
SQL queries can be used as custom KPIs as such in Nemo Analyze, or from any 3rd-party tool that has an ODBC interface
KPI workbench can be used to further process the output of SQL queries
SQL Queries vs. KPI Workbench
SQL pros
Efficient for power users
Standard language
Scalability of the queries is good
Excellent tool for grouping and filtering of data
SQL cons
Requires SQL and database schema knowledge
Temporal relations between tables are difficult or impossible to build
Difficult or impossible to track sequence of events, and do other more advanced logic
KPI workbench pros
Graphical UI, no need for SQL knowledge
Possibility to create temporal correlations between tables
Possibility to create state machines, and other advanced sequences
KPI workbench cons
Currently data can be correlated only per file, e.g. MMS end-to-end delivery time cannot be calculated from two separate files
Scalability is not as good as with SQL; may run into a Java heap space error with large datasets
Use SQL if you need to add a complex filter or formatting to one of the existing queries in the parameter tree
Use KPI workbench if you cannot do what you want with SQL
KPI Workbench
General
KPI workbench is a graphical scripting environment for creating user-defined custom
parameters and key performance indicators
KPI workbench is an additional data processing layer on top of the database SQL interface
Predefined SQL queries (parameter tree), or custom SQL queries are used as an input for KPI
workbench
Results of a custom KPI can be visualised in maps, charts and grids in Nemo Analyze user
interface
Why to use KPI workbench instead of custom SQL queries?
Easier to use: no need for SQL knowledge or knowledge of the Nemo Analyze database schema
Many advanced queries are not possible to do with SQL: tracking certain signaling message sequences, call drop root cause analysis, etc.
Input and Output of the Custom KPI
A custom KPI can have one or multiple input parameters
Input parameters can be selected from the parameter tree:
Predefined parameters
Custom SQL queries
Existing custom KPIs
An input parameter is a dataset in tabular format that has a column containing the parameter values, and typically other related columns
E.g. Ec/No best active set contains columns for Ec/No, scrambling code, channel number, time, lat, long
Output is always one dataset
Correlations Merging Input Datasets Together
Input can have multiple datasets, but the output always has only one; therefore the input datasets need to be merged, i.e. correlated, into one table
Mathematical and logical operation nodes take only one input dataset; therefore, if input data is needed from multiple datasets (parameters), they need to be combined into one dataset with Correlations
Time based correlations:
Previous value
Current value
Next value
All values within time range
SQL type correlations
Inner Join
Left outer join
Union
Cartesian product
Correlation Using Time Intervals
The underlying data recording principle of all Nemo measurement tools is that the data is
written to the log file only when values change, i.e. values are not written periodically.
More precisely, if any of the information elements of a given event change, the event is
written to the log file.
Correlations
Previous, Current, and Next Value
Each row of the leftmost input dataset is taken directly to the output
From the other input datasets, the previous/current/next row relative to the timestamp of each leftmost dataset row is taken into the output
Useful when taking snapshots of the
network
E.g. value of each radio parameter in the
event of dropped call
Correlations
All Values Within Time Range
Useful when combining non-event type data (RSCP, RxLevel, Tx power, etc.) into one table
Scatter plots (Ec/N0 vs. RSCP, Throughput vs. RxLevel, etc.)
Custom events with multiple conditions: filter rows where Ec/N0 < x AND RSCP > y AND so on
Correlations
All Values Within Time Range
Correlates two or more input datasets based on
time
Validity time intervals of the samples are taken
into account in the correlation so that no data is
lost
Output row is written for each timestamp where
any of the input datasets has a row
Output is also written if the validity time interval
of a non-first input dataset ends
Leftmost input dataset works as a filter: output
rows are only written if the leftmost input
dataset has a valid row for the given timestamp
Output contains all columns from all input
tables
[Diagram: example correlation of Table A (leftmost input) and Table B (input) into output Table A+B over time. Legend: each row has a timestamp and a validity time interval. An output row (A1+B1, A1+B2, A2+B2, A2+B3, A2+B4, ...) is written at every timestamp where either input has a row. One output row is written without any new input row, because the validity of B4 ends; the Table B columns of that output row are empty. No output is written from B7, because Table A (the leftmost input) does not have a valid row at the time of B7.]
Correlations
Union
Each row of each input
dataset is put to output
Union does not merge
rows together from input
tables
Rows are not in time
order
Union + order by time is
used as state machine
input
[Diagram: example Union of Table A (rows A1-A6) and Table B (rows B1-B7) into Table A+B. Every row of both input tables goes to the output as-is; rows are not merged and are not in time order.]
Aggregate Functions
Average, Minimum, Maximum, Sum, and Count
Input: dataset of N rows
Output: One row with the aggregate of selected input column
Definitions needed for the node:
Column: selects the column of the input dataset over which the aggregate will be calculated
Weight by: each sample can be weighted (multiplied) by another column of the input data. Typical usage is to weight an average by time or by distance, as in the SQL sketch below
Group by: data can be grouped, e.g. Avg RSCP per scrambling code
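As an analogy in SQL, the time-weighted average can be expressed directly against the database. A minimal sketch: the table and column names follow the example query in the SQL part of this document, and TI_INTERVAL is described in the Timestamp section; the weighting formula itself is the illustration.
-- Time-weighted average RSCP of the best active set cell.
-- Each sample is weighted by its validity time in milliseconds.
SELECT SUM("rscp" * TI_INTERVAL("time")) / SUM(TI_INTERVAL("time")) AS "Avg RSCP (time-weighted)"
FROM "Nemo.UMTS.ECNO"."Cell+"
WHERE "order" = 1 AND "cell_type" = 0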
State Machine
Powerful tool for various applications:
Calculating the delay of any signaling procedure, e.g. call setup delay, handover delay
Tracking a certain signaling message sequence and creating an event on its occurrence
Tracking call or packet data transfer state, and calculating statistics binned by call/data transfer, e.g. Avg throughput per data transfer
Input: one dataset with the needed input data in time order (UNION + Order by time)
Output: a row per each transition arrow in the state machine, if output is set
start_time: timestamp of the transition
end_time: timestamp of the next transition
time_interval: time difference in milliseconds between start_time and end_time, that is, the time spent in the target state of the transition
The output also has columns for the time spent in the old state, a running index number, and user-definable text
[Diagram: example call state machine with states Idle, Call attempt, Connecting call, Alerting, Call connected, and transitions to Call failure, Call dropped, Call disconnected]
Configuring State Machine
Required states are added from the properties of the state machine
Transitions are defined in the properties of the source state
For each transition, the conditions must be defined based on the values of the input data row
If text is defined in the State transition Output field, output will be written from every occurrence of the transition
GROUP BY Node
The GROUP BY node can be used to group a table of data based on any column or columns of the table, and to calculate aggregates (min, max, avg, etc.) per group for the other columns
E.g. a statistics-per-cell table, with average RSCP, average Ec/N0, number of dropped calls, number of failed calls (see the SQL analogy below)
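For readers who know SQL, the node corresponds to a GROUP BY aggregate. A minimal sketch of the per-cell idea, assuming the Cell table has a scrambling_code column (as suggested by the Ec/No best active set example earlier); other names follow the example query later in this document:
-- Sample count and average RSCP per scrambling code (active set cells only).
SELECT "scrambling_code", COUNT(*) AS "Samples", AVG("rscp") AS "Avg RSCP"
FROM "Nemo.UMTS.ECNO"."Cell+"
WHERE "cell_type" = 0
GROUP BY "scrambling_code"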
Creating Per Session Aggregation
It is often necessary to aggregate data per session (call, attach, PDP context, data transfer, etc.)
For this purpose, session parameters are available for every session type in the parameter tree
The session parameters return one row per session, with session-specific information such as status (fail/success), cause code, call setup time, data connection setup time, data transfer protocol, etc.
The session parameters have a timestamp at the beginning of the session, with the time range/time interval extending to the end of the session
This makes it possible to correlate any metrics with the session by using the session as master input in All Values Within Time Range (see next slide)
By adding a Group By after that, the metrics can be aggregated per session
All Values Within Time Range
with Session as master input
[Diagram: BLER samples B1-B7 correlated with two call sessions, with the call session as master input. Legend: each row has a timestamp and a validity time interval. Output rows Call1+B1 ... Call1+B4 and Call2+B5, Call2+B6 are written. One output row is written without any new input row, because the validity of B4 ends; the BLER columns of that row are empty. No output is written from BLER sample B7, because the call session does not have a valid row at the time of B7 (that is, the call was ended before).]
KPI Workbench improvements (Professional Edition)
Possibility to execute a KPI simultaneously on all
measurement files
Needed when correlating data across files, e.g.
Comparing scanner and mobile data
Time-triggered state transition in state machine
Enables event creation when a given message was not received within a defined time period
State transition triggered by a changed value in state machine
Enables event creation when e.g. the serving cell changes
New Session parameters in the Parameter Tree
Aggregation of data per call, data transfer, PDP context, or Attach, without having to create a state machine to track the session
Time shift node
The Time shift node can be used to modify the timestamp and the time range of any input data.
One of the most relevant use cases of this node is to automatically capture details before and/or after a particular event for custom root cause analysis purposes
The picture below illustrates capturing RF details 4 s before and after each dropped call using the Time shift node
Resample node
The Resample node can be used to resample any input data containing the time column
The sampling period is user-definable in milliseconds
Nemo logging tools write data to the log file when the value changes, not periodically
Data is handled properly in Nemo Analyze, but if the data is to be exported to a 3rd-party tool, it is usually better to export the data with a constant sample period
Together with the Time shift and Group by nodes, Resample can also be used to bin data over a longer period to reduce the data amount.
Running KPIs per File/Measurement/All
A KPI can be run in three modes:
Per file: input queries are run per file
Per measurement: input queries are run over all files of a measurement at a time
All: input queries are run over the whole data set in one shot
Per file should always be used when possible
Input data is processed one logfile at a time, even when running the KPI over multiple files
Lower memory consumption
Can be used whenever data is not correlated across multiple logfiles
Execute per measurement when comparing data across two or more logfiles of the same measurement
Execute per all when comparing data across arbitrary sets of data, e.g. two measurements collected with different drive test kits at the same time, or when comparing imported OSS data to drive test data
Inputs must be sorted by time
Execute per file, Example
In this example, Ec/No, RSCP, and TX power per logfile are correlated to the same table
Execute per Measurement, Example
For example, comparing scanner RSRP and mobile RSRP of the same drive
Note that all the input queries are run over all the files of a measurement; when comparing e.g. the RSCPs of two terminals, the RSCP queries must be filtered by device number
SQL Queries
To get started:
Read this document
Get an SQL editor. Queries can be written with the Nemo Analyze database browser, but better SQL freeware editors are available, such as http://gpoulose.home.att.net/Tools/QTODBC61.msi
Login/password for the database: administrator/password
Get the following reference documents from the Nemo User Club (User manuals and user documents downloads):
Nemo Analyze database schema (describes the table structure of the database)
Nemo file format specification (describes the events in the Nemo measurement fileformat)
Open access SQL reference (describes the supported SQL syntax)
Nemo Analyze user manual (describes the special, Nemo-specific SQL scalar functions and stored procedures supported by the Nemo Analyze SQL driver)
Nemo Analyze Database Solution
Database solution from ObjectStore
ODBC interface from OpenAccess
+ Nemo-added scalar functions and stored procedures
The solution is optimized for fast and convenient use as a standalone tool:
No high system requirements; standalone Analyze runs on a standard PC
Queries over a single measurement file are always fast, regardless of the amount of files in the database
Maintenance free, no DB administrator needed
Tables
Data in a relational DB is stored in relations, which are perceived by the user as tables
A table represents an object or an event (e.g. employee, sales order, etc.)
Each row of the table represents a unique instance of the object or event
Each column represents a different piece of information regarding the object or event (e.g. first name, last name, salary, customer name, order charge, etc.)
When referring to a table in a query, the syntax is {schema name}.{table name}
E.g. "Nemo.UMTS"."BLER"
Schema: Nemo.UMTS
Table: BLER
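For example, a minimal query against that table using this syntax (all columns, no filtering):
SELECT * FROM "Nemo.UMTS"."BLER"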
Relations Between Tables
If rows in a given table can be associated with rows in another table, the tables are said to have a relationship between them
One-to-one, one-to-many, and many-to-many relations are possible (the first two exist in the Nemo Analyze DB)
E.g. one-to-one: in a given table, each row has a relation with a single row of another table.
Every table has a primary key (red arrow)
The primary key is a column that uniquely identifies the row within a table. Using the scalar OID_SHORT(oid) will convert oid columns into a 9-digit integer.
The foreign key (yellow arrow) is the oid of the related row in another table
E.g. BLER and Event: the_event of each row in the BLER table equals the oid of one row in the Event table
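As a sketch, that relation can be followed with an explicit join on the foreign key. The schema name of the Event table is an assumption here; in practice the + views described later do this join for you:
-- Join each BLER row to its Event row to pick up time and position.
-- "Nemo"."Event" is an assumed name for the Event table.
SELECT e."sql_time", e."gps_latitude", e."gps_longitude", OID_SHORT(b."oid") AS "Row id"
FROM "Nemo.UMTS"."BLER" b, "Nemo"."Event" e
WHERE b."the_event" = e."oid"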
Overview of the Database Schema
The database schema is based on the Nemo logfile format
The Nemo fileformat is event based
A new event is recorded when the recorded value changes, not periodically
Data is grouped into different events based on relations between data, e.g. the SHO event records all data that is related to soft handover
Every event of the Nemo fileformat has a corresponding table in the database
See the Nemo file format specification and database schema for more information about the different events and the mapping of events to tables
The oid (Object ID) column is present in every table of the database
oid is the unique identifier of a row in the table
The Measurement table has one row per measurement. Measurement in Nemo language means a logfile, or a set of logfiles collected as a multi-device measurement.
The Device table has one row per logfile
The Event table contains the time and GPS information of all data
[Diagram: schema overview. Measurement (oid, title, ...) has a one-to-many relation to Device (oid, the_measurement, device_extension, ...); Device has a one-to-many relation to Event (oid, time, gps_latitude, gps_longitude, the_device, ...); all event tables (ECNO, SHO, CAA, ...) reference Event via their the_event column.]
Nemo Analyze Database Schema
Mapping of Fixed Size Events to DB Tables
If the event always has a fixed number of parameters, it is mapped straight to a corresponding table
For example, mapping of the SHO event to the DB:
For each SHO event, there is one row in the SHO table
For each measurement there is one row in the Measurement table
The SHO and Event tables have a one-to-one relation
Device and Event have a one-to-many relation
Measurement and Device have a one-to-many relation
Nemo Analyze Database Schema
Mapping of Variable Size Events to DB Tables
If the event has a variable number of parameters, it is mapped to multiple tables in the DB
The mapping of each event is described in the database schema documentation
Example: the ECNO event. The number of parameters in an ECNO event depends on the amount of cells recorded
For each ECNO event, there is one row in the ECNO table
For each carrier measured in one ECNO event, there is a row in the Channel table
For each cell (active, monitored, etc.) measured in the ECNO event, there is a row in the Cell table
The Event, Device, and Measurement relations are as with fixed size events
+ Views
There is a + view for every table that has a relation with the Event table
Views are displayed as tables, where the table name has + at the end
In the view, the corresponding table has been joined with the Event, Device, and Measurement tables
Time, GPS data, and the measurement file name are associated with the actual parameter
For example: the BLER+ table also contains the time and coordinates, joined from the Event table
+ views are faster to query and easier to use, compared to a self-made JOIN
+ views should always be used in queries if time or lat & long is needed!
[Screenshot: BLER table vs. BLER+ view in the parameter tree — BLER+ is the right way to do it]
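A sketch of the right way in SQL: query the + view, so time and coordinates come along without a hand-written join. The BLER value column name below is illustrative; check the database schema document for the real one:
-- Time, position, and BLER value from the + view; no manual JOIN needed.
SELECT "sql_time", "gps_latitude", "gps_longitude", "dl_bler"
FROM "Nemo.UMTS"."BLER+"
ORDER BY "sql_time"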
Timestamp in the Database
Time is recorded to the database in
Timestamp format (column: sql_time)
Native binary format (column: time)
Data is recorded event based, not periodically, in the Nemo fileformat
In order to get correct statistics, and to plot correct graphs, two things are needed:
Timestamp: the point of time when the event occurred
Interval: the elapsed time for which the recorded event has been/will be valid
Both the timestamp and the interval are embedded in the binary timestamp:
The timestamp can be fetched with the scalar function T_(native binary timestamp)
The interval in milliseconds can be fetched with the scalar function TI_INTERVAL(native binary timestamp)
When custom queries are made to the Analyze UI, the binary time should be used
The Analyze UI fetches both the timestamp and the time interval from the native binary timestamp automatically
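A sketch combining the two scalars on the Tx power table used elsewhere in this document:
-- Readable timestamp and validity interval decoded from the binary time column.
SELECT T_("time") AS "Timestamp", TI_INTERVAL("time") AS "Validity (ms)", "tx_power_umts"
FROM "Nemo.UMTS"."TXPC+"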
Filtering Results of Certain Measurement File(s)
The fastest-to-query and easiest way to do it is to use a HINT
The HINT is added at the end of the query, in the following format:
/* MEAS({measurement_file_name_1}:{file_extension}|{measurement_file_name_2}:{file_extension}|....) */
When using the HINT, queries over certain file(s) are always fast, regardless of the amount of other files in the DB
The JOIN can also be done with regular SQL syntax
This is needed only when data from the Device or Measurement table needs to be returned
Even in that case it is recommended to use the HINT to perform the actual filtering, because it makes the query faster
When custom queries are made to the Analyze UI, measurement filtering MUST NOT BE DONE, unless the query is to be statically limited to certain measurement file(s)
The HINT is added automatically to the query at runtime
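For example, limiting the Tx power query to a single file. The file name and extension here are illustrative:
-- The HINT keeps the query fast no matter how many other files are in the DB.
SELECT T_("time") AS "Time", "tx_power_umts"
FROM "Nemo.UMTS"."TXPC+"
/* MEAS(drive_test_1:nmf) */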
Value Enumeration
Lots of data is recorded in number format, where the meaning of each value is enumerated in the fileformat
The numbers are written to the DB
In the DB there is a ValueEnum table, which has the enumeration of all the number parameters
The real meaning of a number value can be fetched with the following scalar function:
VAL_TO_STRING({parameter_name_in_ValueEnum}, {value})
When custom queries are made to the Analyze UI, VAL_TO_STRING is not needed!
Number values are displayed automatically in decoded format
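A sketch of the scalar in use. The ValueEnum parameter name below is an assumption; check the ValueEnum table for the real parameter names:
-- Decode the numeric cell_type column into its textual meaning.
SELECT T_("time") AS "Time", VAL_TO_STRING('cell_type', "cell_type") AS "Cell type"
FROM "Nemo.UMTS.ECNO"."Cell+"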
Connections 1/2
Certain tables, like DAS, have a relation to the Connection table
Every DAS event belongs to some connection
DAS can belong to: Attach, PdpContextActivation, Data, and DataTransfer connection(s)
Attach is the parent connection of PdpContextActivation, PdpContextActivation is the parent connection of Data, etc.
Two scalars are available related to connections:
CONN_IS_SHARED(conn1.oid, conn2.oid): checks if the connections are the same, or if one is the parent of the other
CONN_IS_TYPE(conn.oid, numeric_exp): checks if the connection is of the given type
Connections 2/2
Examples:
1. Get only those throughput samples that were recorded during a data transfer (a sketch in SQL follows below)
2. Get all throughput samples recorded when a PDP context has been activated and the access point has been Internet
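A sketch of example 1. The throughput table and value column names are illustrative; any table with a the_connection column works the same way:
-- Keep only samples whose connection is of type 7 = Data transfer.
-- "Nemo.GPRS"."DRATE+" and "app_rate_downlink" are assumed names.
SELECT T_(a."time") AS "Time", a."app_rate_downlink"
FROM "Nemo.GPRS"."DRATE+" a
WHERE CONN_IS_TYPE(a."the_connection", 7)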
If the table does not have a relation to a connection (the column the_connection does not exist), these things have to be done with time correlation (described later)
Connection types:
0 = Unknown, 1 = Voice, 2 = Handover, 3 = Attach, 4 = PDP context activation, 5 = Data, 6 = RRC, 7 = Data transfer, 8 = MMS, 9 = SMS, 10 = POC, 11 = LAU, 12 = RAU, 13 = Ping
Correlating Tables Without Relation 1/3
Most tables do not have any relation between each other, but you may still want to join data between such tables
Two types of JOINs can be made based on time:
1. Sample based correlation
Each row of table x is joined with a row from table y WHERE the timestamp of the x row is within the time range of the y row
E.g. Tx power vs. RSCP scatter
2. Time range condition
Each row of table x WHERE the timestamp is in a certain range, defined by some other tables.
E.g. Tx power samples between Attach request and Attach accept L3 messages.
Correlating Tables Without Relation 2/3
Sample Based Correlation
1. List the tables to be joined in the FROM clause: FROM table x, table y
2. Make an x.time = y.time condition in the WHERE clause
When the x table is the first one in the FROM clause: each y sample is taken and checked whether there is an x sample with the same timestamp as the y sample, OR whether the x timestamp falls within the validity time interval of the y sample
When the y table is the first one in the FROM clause, the comparison is vice versa
The comparison is not bi-directional! If there is a big difference in the sample periods of x and y, the one with the smaller sample period should be selected as y in order to get the highest possible resolution
3. Make an x.the_device = y.the_device condition in the WHERE clause (this limits the time correlation within each measurement file and improves the performance)
4. Add the /* OPTIONS(USE_TIME_SCOPE) */ hint at the end of the SQL query. This enables the time scope join.
For example: get Tx power vs. RSCP, when RSCP < -95 dBm
Tx power is in the Nemo.UMTS.TXPC table, RSCP is in the Nemo.UMTS.ECNO.Cell table; there is no relation between the tables
All Tx power samples are taken and checked if they fall within the validity time interval of an RSCP sample that is < -95
SELECT a.sql_time AS "Time", "tx_power_umts", "rscp"
FROM "Nemo.UMTS.ECNO"."Cell+" b, "Nemo.UMTS"."TXPC+" a
WHERE b.order = 1 AND b.cell_type = 0 AND a.time = b.time AND a.the_device = b.the_device AND
rscp < -95
/* OPTIONS(USE_TIME_SCOPE) */
Correlating Tables Without Relation 3/3
Time Range Correlation
The sql_time column is the timestamp in datetime format
The sql_time column always has to be used in conditions that test whether a timestamp is in the range set by two other timestamps
Time indexing does not work if T_(time) is used, or if sql_time is used inside a scalar function; this will degrade the performance of this kind of queries
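A sketch of a range condition that keeps time indexing effective: compare the plain sql_time column, not a scalar-wrapped value. The literal timestamps and the ODBC {ts} escape syntax are illustrative:
-- Tx power samples within a fixed ten-second window.
SELECT "sql_time", "tx_power_umts"
FROM "Nemo.UMTS"."TXPC+"
WHERE "sql_time" BETWEEN {ts '2012-05-01 12:00:00'} AND {ts '2012-05-01 12:00:10'}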
Guidelines for Custom Queries in Analyze UI
The binary timestamp (time) can (and should) be returned in the SELECT row; the UI converts it automatically to a readable timestamp
The binary timestamp MUST be used if the query results are to be displayed correctly in line and bar graphs
This is because the validity time interval of the samples is also needed to plot correct graphs
The measurement file filter HINT must not be used in the query, unless it is to be statically limited to certain measurement file(s); the UI filters the query results at runtime
The VALUE_ENUM scalar is not needed; the UI applies it automatically
Results have to be ordered by time (ORDER BY sql_time) in order to see the results correctly in graphs and on the map
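Putting the guidelines together, a compliant UI query could look like this (value column from the earlier Tx power example): binary time in the SELECT list, no MEAS hint, no VAL_TO_STRING, results ordered by time:
-- The UI decodes "time" and applies file filtering and enumerations at runtime.
SELECT "time", "tx_power_umts"
FROM "Nemo.UMTS"."TXPC+"
ORDER BY "sql_time"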
Adding Custom Queries to Analyze UI
Manual Query
Own queries can be added from the Query manager:
1. Query manager -> Add manual query
2. Give a name to the query
3. Select the Edit SQL manually checkbox
4. Type the query in
5. Set aliases for the different graphs, i.e. which column will be on the x-axis, which on the y-axis, etc.
6. When finished, the query is available in the User branch of the parameter tree in the Analyze workspace.
Adding Custom Queries to Analyze UI
Query Wizard
Simple queries can be made with the query wizard
Joins are made automatically between tables that have relationships
Time-based correlation is not done
If two selected tables don't have a relation, the result set will be the Cartesian product of the tables!
Step-by-step procedure:
Select the tables from which data is retrieved
Select the columns to be displayed from the table(s)
Select a column for ordering the results, if time based ordering is not sufficient
Set filtering conditions for the results
Define aliases between controls and result columns (which columns will be x and which will be y in a graph, etc.)