Assessment of Crash Occurrence Using
Abstract
This work identifies factors that influence crash occurrence within a traffic analysis zone (TAZ) by accounting for location-
specific effects and serial correlation in longitudinal crash data. This is accomplished by applying a random effect negative
binomial (RENB) model. Unlike commonly used count models such as Poisson and negative binomial (NB), RENB accounts
for heterogeneity and serial correlation in crash occurrence. An RENB was applied to 15 years of crash data in Arkansas with
1,817 TAZs. Four models were developed for total crashes and by severity (property damage only (PDO), injury, and fatal).
RENB-estimated impacts were measured using the incidence rate ratio (IRR). Significant factors found to increase
observed crashes include: (i) average precipitation (a one-unit increase in average precipitation results in a 134% increase
in total monthly crashes for a TAZ); (ii) average wind speed (16%); (iii) urban designation (7%); (iv) traffic volume (2%); and
(v) total roadway mileage (1% for each functional class). Snow depth and days of sunshine were found to decrease the number
of crashes by 15% and 2%, respectively. Employment and total population had no impact on crash occurrence.
Goodness-of-fit comparisons show that RENB fits better than the Poisson and NB formulations. Diagnostics for all four models
confirm the presence of over-dispersion and serial correlation, indicating the necessity of RENB model estimation.
The main contribution of this work is the identification of crash causal factors at the TAZ level for longitudinal data, which
supports data-driven performance measurement requirements of recent federal legislation.
Crash occurrence is an increasing public health concern, compromising the well-being of communities and resulting in serious social and economic losses. Globally, more than 1 million deaths and 20–50 million serious injuries are attributed to traffic crash incidents (1). In 2015, an analysis of road crashes per 100,000 population ranked Arkansas, a relatively rural state in the U.S., in the top five states with the highest fatality rate (17.8) (2). According to the Federal Highway Administration (FHWA), the rapidly growing population is increasing the number of drivers on the road each year, which can result in more exposure to traffic incidents (3). In response, traffic safety strategies have been implemented in the U.S., for example Vision Zero, safety belt use regulations, and distance-based charges (4, 5). Policy, operational, and infrastructure solutions for crash mitigation are based on identified risk factors including driver-related factors (i.e., impairment, fatigue, or distractions) and roadway infrastructure (6–8).

While most crash occurrences are in urban areas, the risk factors of crashes are higher in rural areas (9). Roadway design elements, narrow shoulders, and higher speed limits can make rural driving conditions more hazardous. Low population density and geographic isolation of rural areas limit the transferability of the identified crash risk factors from urban to rural areas. Therefore, there is a need for targeted studies of rural areas to identify crash-contributing factors (10).

With U.S. federal legislation requiring performance-based planning, it is increasingly important to be able to estimate safety performance measures, like crash occurrences, at the same spatial resolution used for mobility, accessibility, and other performance measures (11).

1 Department of Civil Engineering, University of Arkansas, Fayetteville, AR
Corresponding Author: Karla J Diaz-Corro, kjdiazco@uark.edu
2 Transportation Research Record 00(0)
While it is more common to develop crash-prediction models for corridors or specific sites, estimating or predicting crashes at the traffic analysis zone (TAZ) level makes it easier to tie safety performance measures into long-range transportation planning efforts (12–15). Naderan and Shahi included an aggregate macro-level crash prediction model to apply during trip generation in the four-step travel demand model to forecast the number of crashes in urban TAZs (16). The intention of that model is crash prediction, not the identification of causal factors necessary for countermeasure selection. The rationale for identifying crash causal factors at the TAZ level is that they can be used for the development of long-range safety plans. Hence the importance of a method that assesses crash occurrence using specialized methods to forecast crash frequencies.

A handful of studies focused on identifying risk factors of crash occurrence using historical data at the TAZ level (17–20). For instance, Wang et al. showed that at the macro-level (for example, TAZ and census tract) traffic flow influenced crash occurrence, by using Bayesian models to account for the spatial dependency (20). Peera et al. studied the relationship among land-use characteristics and traffic accidents at the TAZ level using a generalized linear regression model (17). This paper extends this approach by using a model that better captures interdependencies in the data.

Without accurate estimates of crash-influencing factors, transportation planning agencies will be unable to make informed transportation policy decisions to enhance safety. Therefore, the main contribution of this study is to help fill a critical gap in the transportation safety toolkit concerning the study of historical crash data through modeling approaches suitable for identifying crash causal factors at a macro-level (i.e., TAZ level). This paper applies a count model that allows for temporal autocorrelation in historical TAZ-level crash data with over-dispersion. Specifically, a random effect negative binomial (RENB) model is considered. RENB models do not require assumptions of homogeneity, autocorrelation, or statistical dispersion of the data (e.g., mean and variance). Instead, RENB models treat data as a longitudinal panel to account for heterogeneity and serial correlation.

Background

Measures of Safety
Crash occurrence is the most consistently referenced measure of safety for roadway safety analysis and is interpreted as a frequency, for example the number of crashes divided by a measure of exposure, such as traffic volume (specifically annual average daily traffic [AADT]) passing over a segment of roadway or through an intersection during a specified period. Crash rates are the simplest measure to assess the degree of safety of a roadway segment, intersection, or sites with similar characteristics and traffic volumes (21). Crash rates must be used with caution because of the non-linear relationship between crash rates and volumes. Thus, it is recommended to employ count-data models to identify crash risk factors (22–25).

Causal Factors Associated with Crash Occurrence
Causal factors of crash incidence are commonly associated with roadway network attributes (for example, lane width, median type, speed limit, and number of lanes), thus helping engineers and other traffic safety professionals design safer roads (12–14). Additionally, crash causal factors can be associated with drivers' sociodemographic characteristics (age, gender, household income levels) to determine risk factors and develop laws and policies (26, 27). Weather conditions (for example, rain, snowfall, and temperature) considered causal to a crash may aid in the specification of lighting, pavements, and other infrastructure or vehicle designs (28, 29). These elements are the most common causal factors identified in the literature, and can be grouped into broader categories as spatial variables (e.g., land use, population, roadway characteristics) and temporal variables (e.g., weather parameters) (13, 30, 31).

Methods to Identify Causal Factors for Crash Occurrence
The most common methods for estimating crash-contributing factors include multiple linear regression (MLR), multinomial logistic regression, and Poisson regression (32–36). These models estimate coefficients (parameters) for causal factors to identify and compare significant factors. Models differ in their assumptions about the structure of crash data, for example normal, random, discrete, and non-negative. Assumptions of observed heterogeneity and spatial correlation are often violated by real-world historical crash data (37–39). Li et al. developed a geographically weighted Poisson regression model and identified traffic patterns, road network attributes, and sociodemographic characteristics as causal factors (36). Of these, daily vehicle miles traveled (DVMT) had the strongest influence on crash occurrence, reflecting the general relationship between crashes and exposure. While this method captures the observed heterogeneity that exists in the relationship between crash occurrence and explanatory variables over the geographical extent of the study area, the calibrated model is not spatially transferable, since it does not account for spatial correlation in the dataset. Quddus et al. found that, in London, traffic flow and the resident population
Diaz-Corro et al 3
| Variable | Unit | Expected sign | Source | Time variant | Mean | SD | Min. | Max. |
|---|---|---|---|---|---|---|---|---|
| Dependent variable | | | | | | | | |
| Number of crashes per month per TAZ | count | na | ASP | Yes (m) | 1.75 | 2.50 | 0.00 | 24.00 |
| Explanatory variables: weather effects | | | | | | | | |
| Average precipitation | inches | + | NOAA | Yes (m) | 0.005 | 0.024 | 0.00 | 0.29 |
| Binary 1: if there is a day with snow depth > 1 in.; 0 otherwise | binary | + | NOAA | Yes (m) | 0.097 | 0.296 | 0.00 | 1.00 |
| Average wind speed | mps | + | NOAA | Yes (m) | 0.047 | 0.128 | 0.00 | 7.04 |
| Average percent of possible sunshine | % | - | NOAA | Yes (m) | 8.18 | 12.40 | 0.00 | 31.00 |
| Average temperature | Celsius | + | NOAA | Yes (m) | 60.84 | 15.01 | 24.90 | 89.60 |
| Minimum temperature | Celsius | + | NOAA | Yes (m) | 71.81 | 15.26 | 33.70 | 105.6 |
| Maximum temperature | Celsius | + | NOAA | Yes (m) | 49.86 | 14.87 | 14.40 | 77.20 |
| Explanatory variables: roadway characteristics | | | | | | | | |
| Road length of interstate | mi | + | ARDOT | No | 1.49 | 4.378 | 0.00 | 41.40 |
| Road length of freeways and expressways | mi | + | ARDOT | No | 0.36 | 2.471 | 0.00 | 53.40 |
| Road length of primary arterial | mi | + | ARDOT | No | 2.28 | 5.383 | 0.00 | 51.20 |
| Road length of minor arterial | mi | + | ARDOT | No | 4.49 | 6.787 | 0.00 | 52.30 |
| Road length of major collector | mi | + | ARDOT | No | 10.72 | 19.99 | 0.00 | 173.5 |
| Road length of minor collector | mi | + | ARDOT | No | 4.65 | 11.32 | 0.00 | 93.10 |
| Road length of local road | mi | + | ARDOT | No | 46.96 | 99.69 | 0.00 | 1344 |
| Ln (ADT), in thousands | count | + | ARDOT | Yes (t) | 6.77 | 1.369 | 0.00 | 10.21 |
| Ln (VMT), in hundred thousand | count | + | ARDOT | Yes (t) | 10.04 | 1.746 | 3.61 | 15.03 |
| Explanatory variables: built environment | | | | | | | | |
| Employment density, in hundreds | count/mi2 | + | ACS | Yes (t) | 0.591 | 0.699 | 0.00 | 2.75 |
| Total population, in thousands | count | + | ACS | Yes (t) | 1.503 | 3.640 | 0.001 | 48.51 |
| Binary 1: if TAZ is urban | binary | + | ARDOT | No | 0.561 | 0.496 | 0 | 1 |

Note: ACS = American Community Survey (from the U.S. Census); ARDOT = Arkansas Department of Transportation; ASP = Arkansas State Police; Ln (ADT) = natural log of ADT; Ln (VMT) = natural log of VMT; Max. = maximum; Min. = minimum; na = not applicable; NOAA = National Oceanic and Atmospheric Administration; SD = standard deviation; TAZ = traffic analysis zone.
(m) time variant by month.
(t) time variant by year.
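The summary statistics for the dependent variable already hint at why a Poisson model may be too restrictive. The following is a minimal sketch, using only the mean and SD of monthly crashes per TAZ reported in the table above, of an unconditional variance-to-mean check. It is a rough screening calculation and not the formal model-based diagnostics reported in the paper; the variable names are illustrative.

```python
# Summary statistics for monthly crashes per TAZ, as reported in the table above.
mean_crashes = 1.75
sd_crashes = 2.50

# Sample variance implied by the reported standard deviation.
variance = sd_crashes ** 2

# A Poisson variable has variance equal to its mean, so a ratio well above 1
# suggests over-dispersion (before conditioning on any covariates).
dispersion_ratio = variance / mean_crashes
is_overdispersed = dispersion_ratio > 1.0

print(f"variance = {variance:.2f}")
print(f"variance-to-mean ratio = {dispersion_ratio:.2f}")
print(f"over-dispersed relative to Poisson: {is_overdispersed}")
```

Because this check ignores covariates and the panel structure, it only motivates, and does not replace, the likelihood-based tests used to compare the Poisson, NB, and RENB formulations.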
Oceanic and Atmospheric Administration (NOAA)'s National Center for Environmental Information (NCEI) (46). The weather data is a monthly average (aggregate) and does not reflect the weather at the exact time of the crash as would be reported in the crash reports. The goal of the model is to allow aggregate analysis of crash causal factors; thus, it was deemed most appropriate to aggregate weather conditions to the TAZ level, and not the individual crash level.

One of the needs of the proposed model is to have continuous data for all study periods. Since physical weather stations often do not collect data each day of the year because of maintenance periods, gaps in temporal coverage may be observed. Besides, not every TAZ contains a weather station. Thus, the following process is necessary to convert the "raw" weather data from NCEI to monthly averages for each TAZ. For the weather station assignment to TAZ, in this paper, it was assumed that if there is at least one weather station within a TAZ, then the parameters for that weather station can be assigned to be representative of that TAZ. If more than one weather station was contained within a TAZ, then the average weather parameters were calculated and assigned to that TAZ. If there were no weather stations found within a TAZ, then a closest-neighbor analysis was performed to determine which of the neighboring stations was the closest to the centroid of the TAZ, using as the maximum threshold the cut-off distance proposed in Akter et al. (47). Weather can be considered homogeneous within a specified "cut-off" distance of 10–16 mi around the weather station (48).

Roadway Data. Roadway characteristics aggregated to the TAZ include: (i) the total road length by roadway functional class (e.g., interstate, other freeways and expressways, other principal arterials, minor arterials, major arterials, major collector, minor collector, and local road); (ii) average vehicle-miles traveled (VMT); and (iii) ADT. The most current (2017) roadway network database provided by Arkansas Department of Transportation (ARDOT) was used. Historical data on roadway characteristics was not publicly available and, thus, the most recent network file was used. However, within a TAZ, it is assumed that the total mileage by functional class did not change considerably. As evidence, according to the Highway Performance Monitoring System (HPMS) for Arkansas between 2000 and 2014, the total highway mileage increased from 97,600 to 102,595 mi, across the entire state (approximately 5% increase over 53,000 square mile area over a 15-year span). All roadway characteristic data used for the construction of this model was obtained from ARDOT. The road links contained in a TAZ were collected using GIS software (e.g., "sum lines length") to calculate the total length of roadways by functional class within a TAZ in miles. Pulugurtha et al. assumed that road links which overlapped with TAZ boundaries should not be considered in their analysis because there was no approach to attribute the segment to a single TAZ (13). Similar overlapping issues were found in the analysis in this study. Unlike Pulugurtha et al., this study used GIS software to identify links overlapping a TAZ boundary, and the total traffic volume was split among the TAZs spanned by the link in proportion to the link length within each TAZ (13). ADT and VMT were both collected from ARDOT for the 15-year study period. ADT and VMT were aggregated by year. ADT and VMT were not available at monthly aggregation as they come from annual traffic counts.

Built Environment Data. Built environment data including TAZ designation as urban or rural, population, and employment density can be obtained from the American
Community Survey 5-year estimate at the census tract level (50). The built environment (employment, population, and urbanicity) characteristics were included as planning variables. Data on employment and population were aggregated yearly. Data on the built environment was available for all 15 years of the study period.

For Arkansas, within the boundaries of a census tract, there are approximately one to five TAZs, and the majority of TAZ boundaries overlap with census tracts. To (dis)aggregate from the census tract to the TAZ level, it was assumed that socioeconomic traits are evenly distributed across the entire census tract. When a census tract intersected with a TAZ, the percent coverage of each census tract within the TAZ was calculated (Figure 2). Then, the percentage coverage was used to associate census-derived variables to the TAZ. The resulting values were the weighted average of the built environment variables aggregated by year.

Figure 2. Example of disaggregation from census tract to traffic analysis zone (TAZ) level.

Built environment characteristics were limited to what could be garnered from public data sources that were consistent across the state. Because land-use data were not available for Arkansas, employment density and population were used as proxies. Previous studies claim that land-use data can be captured by employment densities, which measure how efficiently land is being used per unit of employment in a geographical area (51).

Modeling Approach
This paper uses an RENB regression model to assess the relationship between the number of accidents within a TAZ and selected explanatory factors. A key contribution of this paper is the use of longitudinal data spanning 15 years, which combines cross-sectional and temporal characteristics at the TAZ level to assess realistic behavioral models that cannot be identified at the site-level. Traditional count-based models (for example, Poisson and NB), which assume that the number of crashes in a TAZ for period t is independent of other times, provide incorrect estimation in the presence of over-dispersion and serial correlation, respectively. Although the NB model accounts for over-dispersion conditions, it does not allow location-specific effects or serial correlation over time for TAZ-level crash counts. If over-dispersion and serial correlation exist in the data, RENB models are recommended, since group-specific effects (i.e., TAZ-specific effects) are believed to be randomly distributed across locations (52). Depending on how the effect deviates from the mean, and across time, serial correlation can be found to be positive or negative. Because of the heterogeneity of the TAZ group-location constraint, it can be assumed that the number of crashes is related to location-specific effects. Thus, in this study, the RENB model with n group locations (e.g., TAZs) and tm periods (e.g., t years with m months) appears appropriate for analyzing the crash frequencies with over-dispersion, while at the same time accounting for location-specific and temporal effects.

From the parent NB regression model, the RENB can be derived by introducing a random location-specific and time effect. The main benefit of using this approach is that the variance-to-mean ratio is not assumed equal or constant across the group locations since they are random. The relationship between the estimated number of crashes in a TAZ and the covariates of an observation unit i, in year t with month m, can be written as:

$$\hat{\lambda}_{itm} = \lambda_{itm}\,\delta_i \tag{1}$$

where
$\delta_i$ represents the random location-specific effect.
To ensure non-negativity, this can be rewritten as:

$$\hat{\lambda}_{itm} = \lambda_{itm}\,\delta_i = \exp(X_{itm}\beta + \eta_i) \tag{2}$$

where
$\beta$ is the coefficient vector to be estimated;
Table 2. Model Results (Dependent Variable: Number of Crashes per Month per Traffic Analysis Zone [TAZ])

| Variables | RE Poisson coefficient | NB coefficient | RENB coefficient (IRR) |
|---|---|---|---|
| Weather effects | | | |
| Average precipitation | 1.201*** | 1.979*** | 0.850 (2.34)*** |
| Average wind speed | 0.106** | 2.988*** | 0.152 (1.16)*** |
| Binary 1: if there is a day with snow depth > 1 in. | -0.163*** | -0.183*** | -0.166 (0.85)*** |
| Average percent of possible sunshine | -0.019*** | -0.041*** | -0.020 (0.98)*** |
| Roadway characteristics | | | |
| Road length of interstate | 0.003 | 0.001** | 0.007 (1.01)*** |
| Road length of freeways and expressways | -0.011*** | -0.013*** | 0.008 (1.01)*** |
| Road length of primary arterial | 0.013** | 0.011*** | 0.006 (1.01)*** |
| Road length of minor arterial | 0.01 | 0.007*** | 0.007 (1.01)*** |
| Road length of major collector | 0.005** | 0.004*** | 0.002 (1.00)*** |
| Ln (average daily traffic [ADT]) | 0.022 | -0.004*** | 0.021 (1.02)*** |
| Built environment | | | |
| Employment density | 0.003 | -0.064*** | 0.046 (1.00)*** |
| Total population | 0.001 | 0.006*** | 0.003 (1.00)*** |
| Binary 1: if TAZ is urban | 0.024 | 0.060*** | 0.066 (1.07)*** |
| Constant | 0.479* | 0.577*** | -0.255*** |
| Model diagnostic statistics | | | |
| Wald chi2 (DF1 = 13) | 431,546*** | 22,768*** | 12,447*** |
| Ln (alpha) | na | 0.353*** | na |
| LR test of alpha = 0 chi-squared (DF = 1) | 1.1E05*** | na | na |
| LR test versus pooled (chi-squared) | na | na | 2.5E04*** |
| Ln a | na | na | 2.667*** |
| Ln b | na | na | 3.308*** |
| a | na | na | 14.39*** |
| b | na | na | 27.33*** |
| Akaike information criterion (AIC) | 1,311,614 | 1,140,686 | 1,121,188 |
| Bayesian information criterion (BIC) | 1,311,775 | 1,140,846 | 1,121,359 |
| Log-likelihood (LL) | -655,792 | -570,328 | -560,578 |
| Breusch and Pagan Lagrangian multiplier test for random effects chi-square (01) | na | na | 1,373.11*** |
| Wooldridge test for serial autocorrelation F (1,1816) | na | na | 8,252.17*** |
| N (number of observations) | 327,060 | 327,060 | 327,060 |
| i (number of TAZs) | 1,817 | 1,817 | 1,817 |
| t (number of years) | 15 | 15 | 15 |
| m (number of months) | 180 | 180 | 180 |

Note: a = beta-distribution parameter a; b = beta-distribution parameter b; DF = degrees of freedom; IRR = incidence rate ratio; Ln a = inverse of 1 plus the beta-distribution parameter a; Ln b = inverse of 1 plus the beta-distribution parameter b; Ln (alpha) = natural log of the over-dispersion parameter; LR = likelihood ratio; na = not applicable; NB = negative binomial; RENB = random effect negative binomial; RE Poisson = random effect Poisson model.
* significance at 10%.
** significance at 5%.
*** significance at 1%.
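For readers who want to trace how the IRR values in Table 2 relate to the coefficients, the sketch below exponentiates a few of the RENB coefficients and converts them to percent changes. It assumes the standard log-link interpretation, IRR = exp(coefficient); the function and variable names are illustrative and not taken from the authors' workflow.

```python
import math

# RENB coefficients for total crashes, copied from Table 2.
renb_coefficients = {
    "Average precipitation": 0.850,
    "Average wind speed": 0.152,
    "Snow depth > 1 in. (binary)": -0.166,
    "Average percent of possible sunshine": -0.020,
    "TAZ is urban (binary)": 0.066,
}


def incidence_rate_ratio(beta):
    """IRR for a one-unit increase in a covariate of a log-link count model."""
    return math.exp(beta)


def percent_change(beta):
    """Percent change in expected crashes implied by the IRR."""
    return (incidence_rate_ratio(beta) - 1.0) * 100.0


# IRRs rounded to two decimals, the precision used in Table 2.
irr_table = {name: round(incidence_rate_ratio(b), 2)
             for name, b in renb_coefficients.items()}

for name, beta in renb_coefficients.items():
    print(f"{name}: IRR = {irr_table[name]:.2f}, "
          f"change = {percent_change(beta):+.0f}%")
```

Under this interpretation, an IRR above 1 corresponds to an increase in expected monthly crashes per TAZ and an IRR below 1 to a decrease, which is how the percentages quoted in the abstract are obtained.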
necessarily higher in snowy weather when compared with dry weather as indicated in previous studies (60–63).

The IRR value for the variable associated with percentage of possible sunshine indicates that a 1% increase in the maximum amount of sunshine possible from sunrise to sunset with clear sky conditions was associated with a 2% reduction in total accidents per month, indicating that, under clear sky conditions, drivers have better visibility. Note that this variable does not indicate sun glare, which could lead to a different interpretation in relation to crash occurrence. Average temperature was not found to be significant in the model, consistent with the findings of previous studies (64).

Roadway Characteristics Effects. The statistically significant variables in the model for roadway characteristics were total length of interstate, freeways and expressways, primary arterial, minor arterial, major collector, and ADT. The results showed a positive relationship between the number of crashes and the total length of the roadway by functional class in a TAZ; however, the magnitude of this impact is very low, around 1%. The highest effect among roadway variables was found for ADT: the IRR value indicates that a 1% increase in annual ADT was associated with a 2% increase in monthly accidents. The positive coefficient for all variables representing roadway characteristics was consistent with expectations
Table 3. Random Effect Negative Binomial (RENB) Model Results Comparison by Crash Severity Type

| Variables | Injury coefficient (IRR) | PDO coefficient (IRR) | Fatal coefficient (IRR) |
|---|---|---|---|
| Weather effects | | | |
| Average precipitation | 0.480 (1.62)*** | 0.714 (2.04)*** | 1.065 (2.90)** |
| Average wind speed | 0.050 (1.05)*** | 0.172 (1.19)*** | 0.111 (1.12)* |
| Binary 1: if there is a day with snow depth > 1 in. | -0.181 (0.84)*** | -0.159 (0.85)*** | -0.30 (0.74)*** |
| Average percent of possible sunshine | -0.013 (0.99)*** | -0.020 (0.98)*** | -0.014 (0.99)*** |
| Roadway characteristics | | | |
| Road length of interstate | 0.007 (1.01)** | -0.024 (0.98)*** | 0.088 (1.09)*** |
| Road length of freeways and expressways | 0.001 (1.00) | -0.020 (0.98)*** | 0.068 (1.07)*** |
| Road length of primary arterial | 0.008 (1.01)*** | -0.012 (0.99)*** | 0.075 (1.08)*** |
| Road length of minor arterial | 0.012 (1.00)*** | -0.007 (0.99)*** | 0.056 (1.06)*** |
| Road length of major collector | 0.003 (1.02)*** | 0.001 (1.00)** | 0.015 (1.02)*** |
| Ln (ADT) | -0.036 (1.04)*** | 0.017 (1.02)*** | -0.003 (1.00) |
| Built environment | | | |
| Employment density | -0.014 (0.99) | 0.077 (1.08)*** | -0.279 (0.76)*** |
| Total population | 0.003 (1.00) | 0.000 (1.00) | 0.003 (1.00) |
| Binary 1: if TAZ is urban | 0.046 (1.04) | -0.080 (0.92)*** | -0.218 (0.80)*** |
| Constant | 0.339*** | -0.327*** | -0.254 |
| Model diagnostic statistics | | | |
| Wald chi2 (DF1 = 13) | 1,976*** | 10,782*** | 1,167*** |
| LR test versus pooled (chi-squared) | 5.9E04*** | 2.6E04*** | 1.04E3*** |
| Ln a | 1.217*** | 2.062*** | 5.972*** |
| Ln b | -0.787*** | 2.784*** | 0.175** |
| a | 3.38*** | 7.86*** | 392.13*** |
| b | 0.455*** | 16.18*** | 1.19** |
| Akaike information criterion (AIC) | 354,218 | 1,006,690 | 34,052 |
| Bayesian information criterion (BIC) | 354,389 | 1,006,861 | 34,223 |
| Log-likelihood (LL) | -177,093 | -503,329 | -17,010 |
| Breusch and Pagan Lagrangian multiplier test for random effects chi-square (01) | 1.1E+06*** | 9.2E+05*** | 1,373.11*** |
| Wooldridge test for serial autocorrelation F (1,1816) | 407.971*** | 3,553.41*** | 3.354* |
| N (number of observations) | 327,060 | 327,060 | 327,060 |
| i (number of TAZs) | 1,817 | 1,817 | 1,817 |
| t (number of years) | 15 | 15 | 15 |
| m (number of months) | 180 | 180 | 180 |

Note: a = beta-distribution parameter a; b = beta-distribution parameter b; DF = degrees of freedom; IRR = incidence rate ratio; Ln a = inverse of 1 plus the beta-distribution parameter a; Ln b = inverse of 1 plus the beta-distribution parameter b; Ln (alpha) = natural log of the over-dispersion parameter alpha; LR = likelihood ratio; PDO = property damage only; TAZ = traffic analysis zone.
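The information criteria in Table 3 can be cross-checked against the reported log-likelihoods. The sketch below applies the textbook definitions AIC = 2k - 2LL and BIC = k ln(N) - 2LL to the three severity models, identified here only by column position. The parameter count k = 16 is an assumption (13 slope coefficients, a constant, and the two beta-distribution parameters); it is not stated in the paper and was chosen because it reproduces the reported values.

```python
import math

N_OBS = 327_060  # number of observations reported in Table 3
K_PARAMS = 16    # ASSUMPTION: 13 slopes + constant + beta parameters a and b

# Log-likelihood, AIC, and BIC as reported in Table 3, by column position.
reported = {
    "column 1": {"ll": -177_093, "aic": 354_218, "bic": 354_389},
    "column 2": {"ll": -503_329, "aic": 1_006_690, "bic": 1_006_861},
    "column 3": {"ll": -17_010, "aic": 34_052, "bic": 34_223},
}


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2LL."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n_obs):
    """Bayesian information criterion: k * ln(N) - 2LL."""
    return k * math.log(n_obs) - 2 * log_likelihood


# Differences between recomputed and reported criteria for each model.
differences = {}
for name, values in reported.items():
    aic_diff = aic(values["ll"], K_PARAMS) - values["aic"]
    bic_diff = bic(values["ll"], K_PARAMS, N_OBS) - values["bic"]
    differences[name] = (aic_diff, bic_diff)
    print(f"{name}: AIC difference = {aic_diff:.2f}, "
          f"BIC difference = {bic_diff:.2f}")
```

Small residual differences are expected because the table reports rounded log-likelihoods. Note that AIC and BIC compare models fitted to the same dependent variable, so they are suited to comparing the Poisson, NB, and RENB formulations in Table 2 rather than ranking the severity models against each other.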
Lastly, the coefficients for the built environment show a differing effect by crash severity. For injury crashes, none of the variables for the built environment were found statistically significant. For fatal crashes, employment density was associated with a 24% reduction in the number of fatal accidents, whereas a TAZ with urban characteristics was associated with a 20% reduction compared with its rural counterparts. For PDO, employment density was associated with an 8% increase in the number of PDO accidents, whereas a TAZ with urban characteristics was associated with an 8% reduction compared with its rural counterparts.

Sensitivity Analysis
A key assumption in the model framework was the designation of weather conditions to a TAZ. The study used data from on-the-ground weather stations in Arkansas supplied by NOAA. Other forms of weather data are available. For example, gridded weather data is available from other sources (NASA's Modern-Era Retrospective analysis for Research and Applications, Version 2 [MERRA-2]). Grid weather data could be an alternate source. A sensitivity analysis to determine the trade-off in accuracy of the station-based weather with increasing distance from the weather station was carried out. For Arkansas, the average distance between a weather station and a TAZ centroid was 22 mi. The MERRA-2 satellite weather data was available for grids of approximately 30 mi (50 km). This is a slightly lower resolution than the weather station data; thus, on-the-ground weather station data was used in this paper.

To incorporate this data, the following pre-processing method was performed. When two or more weather
roadway infrastructure (added lanes, expanded shoulders, etc.) likely occurred over time. It is recommended that future work incorporate infrastructure changes over time if data to do so are available. Likewise, ADT and VMT data were collected yearly and not monthly. Future studies can use monthly traffic counts from ITS systems or cellphone data, as these technologies are becoming increasingly common.

In the absence of continuous weather data, future studies should investigate temporal characteristics related to weather that may capture seasonal variation in crashes. For example, the percent of crashes in a month aggregated by day of the week (e.g., beginning of the week, weekday, end of the week, weekend), type of day (e.g., holiday, working day), season (e.g., winter, spring, summer, fall) could be used as indicators of crash occurrence without the requirement of detailed weather data. Additionally, when physical weather station data are not available, it is suggested that other forms of weather data, such as NASA's MERRA-2, are used as an alternative source. For Arkansas, this alternative source did not make a difference because the level of resolution was larger than the maximum distance of the on-the-ground weather data used in this paper.

By estimating crash causal factors at the TAZ level, several policy and planning decisions concerning safety performance can be generated. Considering that federal legislation in the U.S. requires performance-based planning, it is necessary to analyze safety at spatial levels of resolution that match those generated for mobility, accessibility, and other performance measures. The model framework presented in this paper identifies if there are weather, roadway, and built environment characteristics that can be associated with crash occurrences. The results from the model framework can be implemented by state transportation agencies to prioritize safety-related projects. For example, if crashes are frequently occurring on roadway types with extreme weather conditions (e.g., severe rainfall or precipitation producing hydroplaning), this combination of characteristics can be considered causal factors. Then, locations (e.g., TAZs) can be prioritized for implementing low-cost safety treatments, for example signage, surface treatments, and drainage improvements. The importance of identifying the causal factors for transportation planning organizations is that there is a requirement for performance-based planning at the same spatial resolution that matches other performance metrics related to mobility (travel times), accessibility, and so forth. Thus, the results of this model can improve the ways in which state transportation agencies prioritize safety-related projects.

Acknowledgments
The authors thank Arkansas State Police (ASP) Highway Safety Office, for providing the crash data used in this paper and Arkansas Department of Transportation (ARDOT) for providing the roadway characteristics data.

Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: S. Mitra, K. Diaz-Corro; data collection: K. Diaz-Corro, L. Coronel; analysis and interpretation of results: K. Diaz-Corro, S. Mitra, S. Hernandez; draft manuscript preparation: K. Diaz-Corro, S. Hernandez, S. Mitra. All authors reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs
Karla J Diaz-Corro https://orcid.org/0000-0002-1936-4547
Leyla Coronel Moreno https://orcid.org/0000-0002-4299-7334
Suman Mitra https://orcid.org/0000-0002-7776-5779
Sarah Hernandez https://orcid.org/0000-0002-4243-1461

References
1. World Health Organization, Department of Violence & Injury Prevention & Disability. Global Status Report on Road Safety: Time for Action. World Health Organization, Geneva, 2009.
2. Sivak, M., and B. Schoettle. Mortality from Road Crashes in the Individual US States: A Comparison with Leading Causes of Death in 2015. The University of Michigan, Sustainable Worldwide Transportation, Ann Arbor, MI, 2018, pp. 1–36.
3. Federal Highway Administration (US). Highway Statistics 2004. Federal Highway Administration, Washington, DC, 2006.
4. Yang, D., E. Kastrouni, and L. Zhang. Equitable and Progressive Distance-Based User Charges Design and Evaluation of Income-Based Mileage Fees in Maryland. Transport Policy, Vol. 47, 2016, pp. 169–77.
5. Evenson, K. R., S. LaJeunesse, and S. Heiny. Awareness of Vision Zero Among United States' Road Safety Professionals. Injury Epidemiology, Vol. 5, No. 1, 2018, pp. 1–6.
6. Dingus, T. A., F. Guo, S. Lee, J. F. Antin, M. Perez, M. Buchanan-King, and J. Hankey. Driver Crash Risk Factors and Prevalence Evaluation Using Naturalistic Driving Data. Proceedings of the National Academy of Sciences of the United States of America, Vol. 113, No. 10, 2016, pp. 2636–2641.
7. Papadimitriou, E., A. Filtness, A. Theofilatos, A. Ziakopoulos, C. Quigley, and G. Yannis. Review and Ranking of Crash Risk Factors Related to the Road Infrastructure. Accident Analysis & Prevention, Vol. 125, 2019, pp. 85–97.