2016-Wim DQM Brazil (Paper19 - Icwim8)


DEVELOPMENT OF A WIM DATA QUALITY MANAGEMENT SYSTEM FOR THE BRAZILIAN FEDERAL ROAD NETWORK

Leonardo GUERSON, LabTrans/UFSC, Brazil
Hans VAN LOO, Corner Stone Intl., Switzerland
Lucas FRANCESCHI, LabTrans/UFSC, Brazil
Valter ZANELA TANI, LabTrans/UFSC, Brazil
Amir MATTAR VALENTE, LabTrans/UFSC, Brazil

Abstract
Since the start of the present decade, the Brazilian federal government has expanded its use of
WIM systems for highway planning and operations, and over 350 new high-speed WIM sites
are to be implemented by the National Department of Transportation Infrastructure (DNIT)
by 2017. Motivated by this situation, DNIT and the Transportation and Logistics Laboratory
at the Federal University of Santa Catarina (LabTrans/UFSC) developed a prototype of an
automated data quality management system, capable of performing effective and
efficient quality management of the federal WIM network in Brazil. This paper describes the
development of quality criteria adapted to the Brazilian road traffic conditions, their
implementation in standard quality checks and the construction of a prototype computer tool
for automated data quality management.

Keywords: Weigh-in-Motion, WIM, data quality, quality management.

Resumo
Com o início da presente década, o governo federal Brasileiro expandiu o uso de sistemas
WIM para planejamento e operações de rodovias, e mais de 350 novos sistemas WIM de alta
velocidade devem ser implementados pelo Departamento Nacional de Infraestrutura de
Transportes (DNIT) até 2017. Motivados por esta situação, o DNIT e o Laboratório de
Transportes e Logística da Universidade Federal de Santa Catarina (LabTrans/UFSC)
desenvolveram um protótipo de sistema automatizado de gestão de qualidade de dados, capaz
de realizar uma gerência efetiva e eficiente da qualidade da rede federal de sistemas WIM.
Este artigo descreve o desenvolvimento de critérios de qualidade adaptados às condições do
tráfego rodoviário no Brasil, a sua aplicação em verificações de qualidade e a construção de
uma ferramenta para gestão automatizada da qualidade dos dados.

Palavras-chave: Pesagem em movimento, WIM, qualidade de dados, gestão da qualidade.


1. Introduction

In the present decade, motivated by a new wave of investments in road infrastructure, the
Brazilian Federal Department of Transportation Infrastructure (DNIT) focused its attention on
modernizing its methods for traffic data collection and truck weight enforcement with the
objective of providing more effective tools for roadway planning and operations. In this
context two national programs were launched:
• PNCT, the National Plan on Traffic Count, with the objective of providing permanent
traffic data collection in its road network and thereby improving decision-making for
investments in transportation infrastructure in Brazil. The PNCT consists of 320 traffic
data collection sites equipped with WIM systems throughout the federal road network;
and
• PIAF, the Integrated and Automated Enforcement Stations, which consist of fixed weigh
stations with mainline WIM for screening of heavy vehicles. In its first phase, 35 PIAFs
have been contracted for weight enforcement and operations are expected to start in 2017.

Experience with the use of WIM systems shows that system performance generally decays
over time, and these changes in performance affect the accuracy and reliability of
the output of WIM systems. As WIM data will be used as input for different kinds of studies
and political decision-making processes, the quality of these processes depends directly
on the quality of the WIM data collected. Consequently, effective methods and tools for
monitoring the quality of the WIM data collected are needed in order
to ensure the quality of the results and conclusions drawn from the application of this data.

DNIT endorsed a project in collaboration with LabTrans and the consultancy of Corner Stone
International for the development of a prototype automated data quality management system
capable of performing effective and efficient quality management of its WIM network. The
project was divided into three main parts:
• Development of potential quality criteria based on international experience and validation
under Brazilian conditions.
• Implementation of statistical control charts based on quality criteria.
• Implementation of a prototype software tool.

2. Development of Quality Criteria

For the development of quality criteria suitable for the Brazilian federal WIM network, a
study was conducted on existing WIM Quality Checks developed in different countries.
Research and applications from South Africa (De Wet, 2012), (Slavik, De Wet, 2012), the
United States (Nichols, Bullock, 2004), (FHWA, 2010) and the European Union (Telman,
Hordijk, 2013), (Lees, Van Loo, 2015) were analyzed with the objective of identifying
potential standard checks that could be adapted for Brazilian conditions. As a result of this
process, the following criteria were selected:

2.1 Traffic Count


The number of registrations per day will vary from day to day because of variations in the
traffic flow. However, a full day or a number of hours without any registrations is an
indication that the system may have been offline. Monitoring the number of heavy-vehicle
records per hour serves as a criterion to detect if the system has been offline for longer
periods. If a system has been offline for too long, it reduces the reliability of the collected
data.
The PNCT and PIAF are programs developed under the design-build-operate-maintain
(DBOM) contracting model, which means that a private third-party is responsible for the
design and construction as well as operation and maintenance-related services. While the
PIAF has not yet been implemented, the PNCT establishes that a contractor will be penalized
if there is an absence of records in over 10% of the total hours in a month.

Initial assessments of monthly sets of data from PNCT have shown that some sites
presented ‘gaps’ in the data collection, indicating that the system had been offline for several
days. The initial monthly check on traffic count will result in a warning if more than 72 hours
without any vehicle records are registered in a given month. Later implementations may
incorporate further quality checks based on vehicle counts per hour and per day, which have
been used for similar purposes in different countries (Walker, Cebon, 2012).
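
The monthly traffic count check described above can be sketched as follows. This is a minimal illustration in Python, not the prototype's actual code: the function names, the input format (a list of record timestamps) and the returned dictionary are assumptions made for the example. Only the 72-hour warning threshold and the 10% contractual threshold come from the text.

```python
import calendar
from datetime import datetime


def empty_hours(timestamps, year, month):
    """Count the hours of the given month that contain no vehicle record."""
    days_in_month = calendar.monthrange(year, month)[1]
    # Each (day, hour) pair with at least one record counts as an active hour.
    active = {(t.day, t.hour) for t in timestamps
              if t.year == year and t.month == month}
    return days_in_month * 24 - len(active)


def traffic_count_check(timestamps, year, month, max_empty_hours=72):
    """Return the number of empty hours and the two threshold flags."""
    total_hours = calendar.monthrange(year, month)[1] * 24
    missing = empty_hours(timestamps, year, month)
    return {
        "empty_hours": missing,
        # Quality warning: more than 72 hours without records in the month.
        "warning": missing > max_empty_hours,
        # Contractual PNCT threshold: no records in over 10% of the hours.
        "over_10_percent": missing > 0.10 * total_hours,
    }


# Hypothetical example: one record per hour in September 2014,
# except for a four-day outage from the 10th to the 13th.
records = [datetime(2014, 9, d, h) for d in range(1, 31) for h in range(24)
           if not 10 <= d <= 13]
result = traffic_count_check(records, 2014, 9)
```

In a 30-day month the two thresholds coincide (72 hours is 10% of 720 hours), so the quality warning and the contractual flag are raised together in this example.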

2.2 Vehicle Classification


The distribution of vehicles over the various vehicle classes will vary from system to system
because of variations in the traffic flow. The class ‘other’ is generally intended for special
transports with extended axle configurations that do not match any other vehicle class.
However, it is also used by WIM systems for unclassified vehicles, which provides an
indication of the incorrect operation of the WIM system itself (Lees, Van Loo, 2015).
According to the Brazilian National Register of Commercial Road Vehicles (ANTT, 2015), the fleet of
special and operational vehicles that do not fall under the established vehicle classes amounts
to approximately 0.06% of the total registered fleet of heavy vehicles. Thus, it is unlikely that
over 10% of the traffic flow will consist of special vehicles under normal operating
conditions. Assessments of data from a number of PNCT sites have shown sites with up to
50% of heavy vehicles classified as ‘other’ (Class L1), while sites in good operating
condition had percentages as low as 0.8%.

Further evaluation of PNCT sites with more than 10% of heavy vehicles classified as
‘other’ has indicated issues in the measurement of axle distances, which
constitute one of the most important inputs for the PNCT class division. The investigation
showed that in all of these cases, at least 90% of the vehicles classified as ‘other’ had at least
one axle distance measured below 1.00 m, while Brazilian regulations establish a
minimum of 1.20 m for heavy vehicles (DNIT, 2012). Therefore, the initial checks
implemented in the system will generate a warning if over 10% of the detected vehicles are
classified as ‘other’ on two or more consecutive days of a given month.
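
As an illustration of the classification check, the sketch below flags a month when the share of vehicles in class ‘other’ exceeds 10% on two or more consecutive days. The function name and the input format (one pair of daily counts per day, in calendar order) are hypothetical; only the 10% threshold and the two-day rule are taken from the text.

```python
def classification_warning(daily_counts, threshold=0.10, min_run=2):
    """Return True if the 'other' share exceeds the threshold on
    min_run or more consecutive days.

    daily_counts: list of (total_vehicles, other_vehicles) per day,
    in calendar order (assumed input format).
    """
    run = 0
    for total, other in daily_counts:
        if total > 0 and other / total > threshold:
            run += 1
            if run >= min_run:
                return True
        else:
            # A day within the threshold (or without traffic) breaks the run.
            run = 0
    return False


# Hypothetical daily counts: 0.8%, 15% and 12% of vehicles classified as 'other'.
month = [(1000, 8), (1000, 150), (1000, 120)]
flagged = classification_warning(month)
```

Days with no traffic reset the run in this sketch; whether such days should instead be skipped is a design choice the text does not settle.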

2.3 Axle Distance


Accurate timing is crucial in any WIM system since it forms the basis for many of its
measurements and calculations, like speed, axle distances, length, classification and axle
loads. Monitoring of standard axle distances (generally the distance between the 2nd and 3rd
axles of a six-axle tractor semi-trailer combination) may be used as a possible indicator of
failure in the internal timing of WIM systems (Slavik, De Wet, 2012). This distance is determined by
the standardized design and construction of this type of truck that has been optimized for
maximum load carrying capacity. As a result, this axle distance will typically show very little
variation.

In order to be effective, this check needs a statistically significant number of vehicles, hence a
very common vehicle class. The first analysis of a month of data from a number of PNCT
sites has confirmed that the six-axle tractor semi-trailer combination (Class E1 in PNCT and
Class 3S3 in PIAF class division) is suitable for this type of criterion, besides being one of the
most common vehicle classes at all sites. The analysis verified that most sites had an
average axle distance between 125 and 126 cm with a standard deviation of less than 2%.

2.4 1st Axle Load


International research shows that measurements of the first (steering) axle loads of 5 and 6
axle articulated vehicles may serve as criteria for quality checks on WIM performance. The
design and load distribution of this type of vehicle makes the load on the first axle relatively
stable and can serve as a reference value. Tests and evaluations performed over WIM data in
Brazil indicate that Class E1 in PNCT (Class 3S3 in PIAF class division) is the most suitable
for this type of criterion. The graph in Figure 1 presents the distribution of first axle loads of
E1/3S3 trucks over a period of two weeks at a fixed low-speed enforcement axle scale:

Figure 1 – First axle load distribution - Class E1/3S3 – Weigh Station 1608

The average load of the first axle of the 1,282 vehicles of Class E1/3S3 that entered the
station was 5,725 kg with a standard deviation of 650 kg. The graph shows a downward shift in
the measurements at the same time that the enforcement officers reported operational issues
with the equipment.

2.5 Validation of Quality Criteria


At this stage, the first set of quality criteria was validated using up to five months of data
from PNCT sites. The graph in Figure 2 shows an example of application of the preliminary
methodology, where daily averages were calculated over a period of two months. In this
example, the WIM site is composed of an array with two lines of piezoelectric sensors.

Figure 2 – Daily averages of first axle loads - Class E1 - Site 40046 (PNCT) - Lane 1

The graph in Figure 2 shows the daily averages of first axle loads of six-axle
articulated heavy vehicles over the period from August to September 2014. This site
started operations in May and the reference line was drawn based on the averages of the first
month of data collection. The chart for the site presents a relatively steady trend around the
reference line until the end of August. At the end of August and especially throughout
September, the daily averages started running further above the reference line, showing a
possible deterioration in the performance of the WIM system and the potential of the quality
check as a tool for detecting shifts in WIM data under Brazilian local conditions.

Besides the evaluations for the development of quality checks based on first axle loads,
similar evaluations were performed for checks on axle distance, classification and traffic
count processes. Hence, this stage of the analyses presented specific results that confirmed
initial assumptions regarding the suitability of certain quality checks, which supported the
execution of further analyses and developments.

3. Implementation of quality control charts

The control charts for WIM data quality management were structured through the adaptation
of Statistical Process Control (SPC) techniques. In this context, Statistical Control Charts are
SPC tools that aim to detect abnormal variation due to circumstances that are not usual or
inherent in regular processes. In this methodology, the control charts take into consideration
the 1st Axle Load and Axle Distance WIM quality criteria, which are both applied to six-axle
tractor-semitrailer vehicle combinations (Class E1 or 3S3). These control charts are
based on the theory of Shewhart's control charts for variables (Montgomery, 2004), and the
quality checks are applied in two distinct phases:
• Phase I – Qualification of reference period.
• Phase II – Evaluation of subsequent months of data collection.

3.1 Phase I
Phase I is performed for the first full month of data collection, immediately after the
calibration of the system. This period will be used as a reference for subsequent monthly
quality checks, so the consistency of the data is validated with the aid of charts based on daily
averages and standard deviations in combination with their respective lower and upper control
limits. Due to natural variation in local traffic conditions, the daily subgroups of data vary in
size, so the calculations of the central lines and the control limits take that into consideration.
The formulations for obtaining the upper and lower limits in Phase I are shown in Table 1:

Table 1 – Phase I – Formulations for control charts

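Chart | Lower Control Limit | Upper Control Limit | Central Line
Averages ($\bar{x}$) | $\bar{\bar{x}} - \dfrac{k}{c_4\sqrt{n}}\,\bar{s}$ | $\bar{\bar{x}} + \dfrac{k}{c_4\sqrt{n}}\,\bar{s}$ | $\bar{\bar{x}}$
Standard deviations ($s$) | $\left(1 - \dfrac{k}{c_4}\sqrt{1 - c_4^2}\right)\bar{s}$ | $\left(1 + \dfrac{k}{c_4}\sqrt{1 - c_4^2}\right)\bar{s}$ | $\bar{s}$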

• $\bar{\bar{x}}$ and $\bar{s}$ refer to the mean of the subgroup averages and standard deviations,
respectively;
• n refers to the number of measurements per day;
• c4 is a control chart constant that depends on subgroup size:
$c_4 = \dfrac{4(n - 1)}{4n - 3}$ (1)
• k refers to the number of standard deviations to be drawn from the central line. Theory of
control charts states that for an industrial context with limited common cause variation,
the default value for k is 3. For control charts outside the industrial context with a larger
expected variation, a higher value for k can be chosen in order to minimize false warnings.
Thus, the calculation of the control limits in this methodology considers k = 4 and 5 for
the Axle Distance criteria and the 1st Axle Load criteria, respectively.
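
The Phase I formulations can be sketched in Python as shown below. This is an illustrative reading of Table 1 and Equation (1), not the prototype's code. Because the text does not state how the central lines account for varying daily subgroup sizes, the sketch assumes a size-weighted grand mean and a pooled standard deviation, which is a common textbook choice for variable sample sizes. The function names and the input format are hypothetical.

```python
import math


def c4(n):
    """Control chart constant for subgroup size n, as in Equation (1)."""
    return 4.0 * (n - 1) / (4.0 * n - 3.0)


def phase1_limits(daily, k):
    """Compute Phase I central lines and per-day control limits.

    daily: list of (n, mean, std) tuples, one per day (assumed format).
    k: number of standard deviations drawn from the central line.
    """
    total_n = sum(n for n, _, _ in daily)
    # Assumption: central lines weighted by subgroup size.
    x_bar_bar = sum(n * m for n, m, _ in daily) / total_n
    s_bar = math.sqrt(sum((n - 1) * s * s for n, _, s in daily)
                      / (total_n - len(daily)))
    limits = []
    for n, _, _ in daily:
        c = c4(n)
        half_width = k * s_bar / (c * math.sqrt(n))
        spread = (k / c) * math.sqrt(1.0 - c * c)
        limits.append({
            "x_lcl": x_bar_bar - half_width,
            "x_ucl": x_bar_bar + half_width,
            # Not clamped at zero, to mirror the tabulated formula as written.
            "s_lcl": (1.0 - spread) * s_bar,
            "s_ucl": (1.0 + spread) * s_bar,
        })
    return x_bar_bar, s_bar, limits


# Hypothetical first-axle loads (kg): two busy days and one low-volume day.
days = [(100, 5700.0, 600.0), (100, 5800.0, 600.0), (25, 5750.0, 600.0)]
center, s_center, day_limits = phase1_limits(days, k=5)
```

With these inputs the limits for the 25-vehicle day are roughly twice as wide as those for the 100-vehicle days, which matches the widening of the limits expected on days with few measurements.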

The establishment of the k values was based on empirical analyses and validation performed
over samples of WIM data in the scope of the project. These values were effective in
providing limits for the detection of unusual data variation without excessive false warnings.
Different values, however, may be adopted as further operation and validation takes place.
Figure 3 shows validation results of Phase I with the 1st axle load criterion and WIM data from the
Araranguá WIM test site managed by LabTrans in cooperation with DNIT. In this case, the
WIM site is composed of two lines of piezoquartz sensors:

Figure 3 – Phase I - Daily averages of 1st axle loads – E1/3S3 – Araranguá test site

In the chart, all points fall inside the established limits and the period is qualified as a
reference for the given criteria. The chart shows a widening of the control limits between
December 22nd and December 31st, which is due to the low volume of heavy vehicles during
the holiday season. The reliability of the checks becomes lower with a smaller amount of
measurements, so the method takes this into consideration and makes the control limits less
strict. Besides the quality check performed through control charts, the qualification of the
reference period in Phase I is done with an extra set of fixed absolute limits in order to
verify basic calibration of the system:
• 1st Axle Load criterion: x < 4000 kg or x > 7000 kg.
• Axle Distance criterion: x < 120 cm or x > 140 cm.
According to the established methodology, quality checks on the reference month of data will
generate a quality warning if one point falls outside the control limits.
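
A minimal sketch of the Phase I qualification step follows. It combines the control-limit rule (a warning if any daily point falls outside its limits) with the fixed absolute limits. The text does not specify which statistic the absolute limits apply to, so the sketch assumes the reference-period mean. The names, the input format and the units in the dictionary (kg and cm) are assumptions made for the example.

```python
# Fixed absolute limits from the text; units (kg, cm) are assumed.
ABSOLUTE_LIMITS = {
    "first_axle_load": (4000.0, 7000.0),
    "axle_distance": (120.0, 140.0),
}


def qualify_reference_period(criterion, daily_means, limits, reference_mean):
    """Return a list of reasons why the reference month is not qualified.

    daily_means: daily averages in calendar order.
    limits: list of (lcl, ucl) pairs, one per day.
    reference_mean: mean of the reference period (assumed target of the
    absolute limits). An empty list means the period is qualified.
    """
    reasons = []
    for day, (mean, (lcl, ucl)) in enumerate(zip(daily_means, limits), start=1):
        # In Phase I a single point outside the limits raises a warning.
        if mean < lcl or mean > ucl:
            reasons.append(f"day {day}: average {mean} outside [{lcl}, {ucl}]")
    low, high = ABSOLUTE_LIMITS[criterion]
    if reference_mean < low or reference_mean > high:
        reasons.append(f"reference mean {reference_mean} outside [{low}, {high}]")
    return reasons


# Hypothetical two-day reference period in which all points are within limits.
ok = qualify_reference_period(
    "first_axle_load", [5700.0, 5800.0], [(5400.0, 6100.0)] * 2, 5750.0)
```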

3.2 Phase II
Phase II involves quality checks on data collected in the months after the reference period.
The formulations for obtaining the upper and lower limits are shown in Table 2, where $\sigma = \bar{s}$
and $\mu = \bar{\bar{x}}$, as previously calculated. The values of k and the formulation for c4 remain the
same as in Phase I.

Table 2 – Phase II – Formulations for control charts

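Chart | Lower Control Limit | Upper Control Limit | Central Line
Averages ($\bar{x}$) | $\mu - \dfrac{k}{\sqrt{n}}\,\sigma$ | $\mu + \dfrac{k}{\sqrt{n}}\,\sigma$ | $\mu$
Standard deviations ($s$) | $\left(c_4 - k\sqrt{1 - c_4^2}\right)\sigma$ | $\left(c_4 + k\sqrt{1 - c_4^2}\right)\sigma$ | $c_4\,\sigma$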

The control charts for the subsequent months of WIM data collection will generate a quality
warning if a run of 3 consecutive points occurs outside of the control limits, which may
indicate a trend in the WIM data. Figure 4 shows the application of Phase II as part of the
validation tests of the methodology:

Figure 4 – Phase II - Daily averages of 1st axle loads – E1/3S3 – Araranguá test site

The figure shows a merge of control charts for the months of March and April, 2015. In the
given data collection process, the daily averages of the reference vehicle 1st axle loads
remained relatively stable until the end of March, but with a slight upward trend. In the first
three days of April a run of 3 consecutive points occurred outside the control limits,
generating a quality warning. An assessment of the performance of that WIM site with
hundreds of measurements from the low-speed enforcement site nearby confirmed an upward
shift in the system’s weight measurements. The mean difference in GVW measurements from
both systems went from 2.28% in March to 3.21% in April. These results indicate the
potential effectiveness of the quality check in pointing out trends in the WIM data.
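
The Phase II rule for the chart of averages can be sketched as below. The reference values mu and sigma are taken as inputs from the reference period, and the limits follow the averages row of Table 2. The function names and the input format are hypothetical, and the sketch covers only the run-of-three rule described in the text.

```python
import math


def phase2_limits(mu, sigma, n, k):
    """Lower and upper control limits for a daily average of n measurements."""
    half_width = k * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width


def has_run_outside(daily, mu, sigma, k, run_length=3):
    """Return True if run_length consecutive daily averages fall outside
    their control limits.

    daily: list of (n, mean) pairs in calendar order (assumed format).
    """
    run = 0
    for n, mean in daily:
        lcl, ucl = phase2_limits(mu, sigma, n, k)
        if mean < lcl or mean > ucl:
            run += 1
            if run >= run_length:
                return True
        else:
            # A point inside the limits interrupts the run.
            run = 0
    return False


# Hypothetical first-axle loads (kg): three consecutive days above the limit.
april = [(100, 5800.0), (100, 6100.0), (100, 6120.0), (100, 6200.0)]
warning = has_run_outside(april, mu=5750.0, sigma=600.0, k=5)
```

With mu = 5750, sigma = 600, n = 100 and k = 5 the limits are 5450 and 6050, so the last three days form a run of three points above the upper limit.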

3.3 Performance tests


The chosen quality checks and control charts were implemented in a prototype computer tool
and their performance was assessed using up to 12 months of data from PNCT and the
Araranguá WIM test site. The main conclusions drawn from these tests were:
• For checks on 1st axle loads, the control charts based on daily averages proved valid
for detecting shifts in the accuracy of the data collected. Control charts based on
standard deviations were also tested but need further evaluation before being implemented,
due to excessive generation of false warnings.
• For checks on axle distances, the validation tests indicate that control charts based on
standard deviations are the most suitable for detecting possible system
inconsistencies. Control charts based on averages were mostly stable and therefore unable
to detect inconsistencies in axle distance measurements, so further evaluation is
needed for this type of quality check.
• For checks on vehicle classification, it was verified that the percentage of heavy vehicles
classified as ‘other’ is a suitable criterion. It remains around 1% at sites with good
operating performance and can reach up to 59% at other sites. It was found that most sites
with high numbers of unclassified vehicles also presented inconsistencies in their axle
distances. Furthermore, there is an indication of a direct correlation between vehicles
not being classified and the detection of axle distances smaller than 1.00 m.
• For checks on traffic count, validations showed that the checks were able to identify gaps
in the data and are especially relevant for large WIM programs like the PNCT, where
significant loss of data can be prevented if detected at an early stage.

4. Development of a prototype computer tool

A prototype computer tool was developed in order to automate the application of the
developed quality checks and facilitate the overall management of data quality. The large
volume of data received motivated a careful decision on the programming language and the
database tool to be used.

The chosen programming language was Python. For the database, in order to prevent
performance issues due to the large volume of data processing, the solution was found in non-
relational databases, which are often faster than traditional databases (Nayak et al., 2013).

4.1 Data model and Flow


Figure 5 represents the basic flow of data inside the application, and names a few of the
processes involved in the transformation of this data. The whole process is presented in
general terms as follows:

Figure 5 – General data flow diagram within the application

• New data is given to the software via processing of XML files. In a first stage, the
software reads all records from the XML file and inserts them into the database under a
table named ‘Records’.
• After processing the XML file, the data that was inserted is read again to perform a
grouping operation that joins all the readings in periods of fifteen minutes, one hour and
one day (only daily groupings are of interest to the quality assurance criteria). When
performing such groupings, a summary containing the number of vehicles crossing the site
and the mean value of all the weight and axle distance readings is calculated and stored in
the corresponding tables. The summaries are also separated by vehicle class.
• Data in the flow tables is accessed to generate indicators per day; these indicators
represent the parameters used as quality criteria for the quality checks on each day.
• Having the indicators ready, the system analyses each full month of records and calculates
the quality criteria, checking if any of the sites in any month has violated the calculated
limits. If there is a violation, the software creates a record in the alerts_per_month table,
which stores data quality warnings. A warning is defined as one record of quality criteria
violation. More than one alert can exist for the same site in the same month if it violated
more than one criterion.
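
The daily grouping step of this flow can be illustrated with the in-memory sketch below. It does not reproduce the prototype's database or XML handling. The record fields (`timestamp`, `vehicle_class`, `first_axle_load`) and the function name are hypothetical, and only the standard library is used.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def daily_indicators(records):
    """Group records by day and vehicle class and summarize each group.

    records: iterable of dicts with the assumed keys 'timestamp',
    'vehicle_class' and 'first_axle_load'.
    Returns {(date, vehicle_class): {'count', 'mean', 'std'}}.
    """
    groups = defaultdict(list)
    for record in records:
        key = (record["timestamp"].date(), record["vehicle_class"])
        groups[key].append(record["first_axle_load"])
    summary = {}
    for key, loads in groups.items():
        summary[key] = {
            "count": len(loads),
            "mean": mean(loads),
            # A single reading has no sample standard deviation.
            "std": stdev(loads) if len(loads) > 1 else 0.0,
        }
    return summary


# Hypothetical records: two E1 trucks and one L1 vehicle on the same day.
sample = [
    {"timestamp": datetime(2015, 3, 2, 8, 15), "vehicle_class": "E1", "first_axle_load": 5600.0},
    {"timestamp": datetime(2015, 3, 2, 17, 40), "vehicle_class": "E1", "first_axle_load": 5800.0},
    {"timestamp": datetime(2015, 3, 2, 9, 5), "vehicle_class": "L1", "first_axle_load": 4900.0},
]
indicators = daily_indicators(sample)
```

The daily count, mean and standard deviation per class are the inputs that the control chart checks consume as subgroup size, subgroup average and subgroup standard deviation.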

4.2 Features of DQM system


In addition to the automatic application of standard quality checks, the prototype
Data Quality Management System developed for DNIT's WIM network includes other
features, such as automatic warnings for questionable data quality and generation of
executive diagnostic reports. Statistical and practical criteria are used for warnings, allowing
for further investigation with graphs and tables whenever needed by the user. Figure 7 shows
a screenshot from the developed computer tool for automated WIM data quality management:

Figure 7 – Prototype computer tool

As shown on the screenshot, the tool generates a list of quality warnings immediately after the
data is imported. These warnings may be directly accessed and visualized in the form of
control charts, as shown on the right side of the image. From the warning list the user may
also generate reports in editable format for comments and interaction with stakeholders.

5. Conclusions

• A prototype of a Data Quality Management System has been developed by DNIT and
LabTrans for efficient quality control of measurement data from DNIT’s WIM network.
• A first set of quality criteria was identified based on international research and adapted to
the Brazilian road traffic conditions based on data from the first operational PNCT sites.
• The selected quality criteria were converted to statistical control charts that were verified
using a few months of data from a number of PNCT sites and the Araranguá test site.
• The control charts were implemented in a prototype software tool for automated data
quality management and remote monitoring of all sites in DNIT’s WIM network.
• The tool has shown potential for supporting the management of the WIM network, improving
the quality of the collected WIM data, and providing a guarantee for its applications.
• The next stage of the project includes a large-scale test operation of the prototype computer
tool across all PNCT sites in order to fully assess the efficiency of the methodology and
the computer tool in managing the quality of a large WIM network.
• In the test operation, further assessment and adjustments in the methodology for quality
checks will be made, as well as an evaluation of the implementation for a full production
system.

6. References

• De Wet, G. (2012), Data-based WIM calibration and data quality assessment in
South Africa. In: 6th International Conference on Weigh-In-Motion, Dallas. Proceedings:
p. 209-218.
• Departamento Nacional de Infraestrutura de Transportes - DNIT (2012), Quadro de
Fabricantes de Veículos, 166p.
• Federal Highway Administration – FHWA (2010), WIM Data Analyst’s Manual,
Publication No. FHWA-IF-10-018, USA.
• Lees, A.; Van Loo, H. (2015), Standard Quality Checks for Weigh-In-Motion Data, ITS
World Congress 2015, Bordeaux, France.
• Nichols, A.; Bullock, D. (2004), Quality Control Procedures for Weigh-in-Motion data,
Federal Highway Administration – FHWA, Publication No. FHWA/IN/JTRP-2004/12.
• Slavik, M.; De Wet, G. (2012), Checking WIM axle-spacing measurements. In: 6th
International Conference on Weigh-In-Motion, Dallas. Proceedings: 450 p. 156-163.
• Telman, J.; Hordijk, J. (2013), Monitoring prestaties WIM systemen, Rijkswaterstaat
Dienst Verkeer en Scheepvaart, Delft, The Netherlands (in Dutch).
• Agência Nacional de Transportes Terrestres – ANTT (2015), RNTRC em Números,
http://appweb2.antt.gov.br/rntrc_numeros/rntrc_TransportadorFrotaTipoVeiculo.asp,
accessed in Jan. 2015.
• Nayak, A.; Poriya, A.; Poojary, D. (2013), Type of NOSQL Databases and its
Comparison with Relational Databases, Int. Journal of Applied Information Systems
(IJAIS), 5 (no. 4), p. 16-19.
• Montgomery, D.C. (2004), Introdução ao Controle Estatístico da Qualidade. Rio de
Janeiro: LTC.
• Walker, D.; Cebon, D. (2012), The Metamorphosis of LTPP Traffic Data. In: 6th
International Conference on Weigh-In-Motion, Dallas. Proceedings: p. 242-249.
