Disruptive Innovation For Auto Insurance Entrepreneurs: New Paradigm Using Telematics and Machine Learning

N. Arun Kumar and Siva Yellampalli
Abstract Currently, motor insurers play a passive role in identifying risk incidents
for policy holders. Traditional insurance does not differentiate safe drivers from
unsafe drivers, since insurers do not have the vehicle telematics data of the policy
holders. Many insurance corporations are planning to utilize telematics data to build
a predictive model of policy holder risk and claim probability. They can reward safe
drivers with low premiums and/or a no-claim bonus. Likewise, unsafe drivers need to
pay an extra risk premium. This means drivers have a stronger incentive to adopt safer
practices. This chapter describes a black-box auto insurance predictive model that
utilizes basic telemetry, such as GPS sensor data, for usage based insurance. The
predictive model is developed using the binary logistic regression machine learning
technique. It is an informative chapter for entrepreneurs since it highlights the
business proposition from an insurer's perspective for gaining competitiveness in a
highly commoditized insurance market.
N. Arun Kumar (*) · S. Yellampalli
VTU Extension Centre, UTL Technologies Ltd, Yeshwanthpur, Bangalore, Karnataka, India
e-mail: arunkumar.nageswar@itcinfotech.com; siva.yellampalli@utltraining.com
© Springer International Publishing AG, part of Springer Nature 2018
D. Khajeheian et al. (eds.), Competitiveness in Emerging Markets, Contributions to
Management Science, https://doi.org/10.1007/978-3-319-71722-7_27
1 Introduction
Existing insurance billing models are based on policy data and claims data. Policy
data include vehicle cost, depreciation, first party or third party claims, and tenure
of the policy. Claims data include the number of claims and the claim amount.
Currently, market segmentation is based on claims, which yields no-claim and claim
customers. It encourages customers to drive cautiously by giving a no-claim bonus.
This descriptive model is general and the premium cost is fixed since it does not
consider usage. So customers who have low annual mileage and good driving behavior
subsidize bad drivers who have higher annual mileage and exhibit reckless driving. It
is unfair to good drivers and does not penalize bad drivers. So there is a need to
rationalize the premium cost based on miles driven and driving behavior and to provide
value addition to the insured. It has been observed that there is growing competition
among insurance companies to reduce the number of claims and to prevent fraudulent
claims. Usage based insurance (UBI) is seen as the next generation insurance model
which fosters a win-win for both insurers and insured.
In the proposed predictive model, the premium is rationalized based on usage and
is called the risk-adjusted premium. The model considers historical usage, predicts
future usage and categorizes customers accordingly as “Risk” or “No Risk”. This
customer segmentation is accomplished by leveraging the technological advances in
telematics and machine learning. A customer categorized as “Risk” needs to pay a
higher risk-adjusted premium than a “No Risk” customer. The proposed billing model
goes beyond customer expectation by proactively alerting a driver regarding his/her
behavior. Reducing risk is likely to reduce claims, which benefits the insurers. So
the model promotes road safety and adds more rigor to the already existing premium
calculation mechanism.
Insurance companies have started providing insurance policies which add value
to customers based on usage. These policies are classified into three classes based
on the parameter considered for determining the usage. Mile-based insurance
is the simplest form and considers only miles driven. Pay how you drive (PAYHD)
and Pay as you drive (PAYD) insurance policies measure driving behavior in
addition to distance travelled. PAYHD examines the type of usage more rigorously
than PAYD. Analytics related to driving behavior can be derived from telematics.
PAYD may use aggregated GPS data to measure excessive speed, travel time without a
break, speed and time-of-day information, usage of a mobile phone while driving, and
distance and time travelled. It may also include statistical information regarding
the historic riskiness of the road. Telematics provides an advantage to the user
through an immediate feedback loop to the driver regarding the risk involved and the
dynamic cost of the insurance. It thus deters the driver from reckless driving, for
the safety of self, co-passengers and pedestrians. Eventually the policy holder gains
a monetary benefit through lower premiums or bonus additions. PAYHD policies include
additional sensors like an accelerometer. PAYD has two schemes, namely the basic
scheme and the extra risk premium scheme. In the basic scheme only kilometers
travelled are considered. The extra risk premium scheme is applicable if someone
drives too long without a break or travels at an excess speed. In this chapter,
predictive modeling is used for the extra risk premium scheme (Tselentis et al. 2016).
The chapter describes the low level architecture of the logistic regression model used
in an auto insurance risk premium calculator.
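The two PAYD schemes described above can be illustrated with a minimal sketch. The rate per kilometer, the surcharge, and the thresholds for excess speed and for driving without a break are hypothetical values chosen only for illustration; the chapter does not specify them, and the `Trip` record is likewise an assumed data structure.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    distance_km: float          # kilometers travelled on the trip
    max_speed_kmh: float        # highest speed observed on the trip
    hours_without_break: float  # longest continuous driving stretch


# Hypothetical tariff parameters (not taken from the chapter).
RATE_PER_KM = 0.05
SPEED_LIMIT_KMH = 80.0
MAX_HOURS_WITHOUT_BREAK = 4.0
EXTRA_RISK_SURCHARGE = 0.25  # 25% on top of the basic charge


def basic_charge(trip: Trip) -> float:
    """Basic scheme: only kilometers travelled are considered."""
    return trip.distance_km * RATE_PER_KM


def trip_charge(trip: Trip) -> float:
    """Extra risk premium scheme: surcharge for excess speed or long driving."""
    charge = basic_charge(trip)
    risky = (trip.max_speed_kmh > SPEED_LIMIT_KMH
             or trip.hours_without_break > MAX_HOURS_WITHOUT_BREAK)
    if risky:
        charge *= 1.0 + EXTRA_RISK_SURCHARGE
    return charge
```

Under these assumed parameters, a 100 km trip costs the same under both schemes unless it exceeds the speed or break threshold, in which case the surcharge applies.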
1.1 Scope
The logistic regression model (LRM for short) is used in an auto insurance risk
premium calculator to predict risk from features extracted from GPS data. In the risk
premium calculator, the LRM is used to perform regression and binary classification
of risk. The accuracy of the LRM, defined as the ratio of correct predictions to the
number of predictions made, is computed using two features from GPS data.
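The accuracy measure defined above can be written as a short function. This is a minimal sketch; the function name and the encoding of “Risk” as 1 and “No Risk” as 0 are assumptions made for illustration.

```python
from typing import Sequence


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Ratio of correct predictions to the number of predictions made."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one prediction is required")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(predicted)
```

For example, three matching labels out of four predictions give an accuracy of 0.75.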
1.2 Existing Implementations
PAYD insurance policies are offered by many insurance companies across the
world, which collect data by a variety of methods. They differ in the level of
privacy provided to the users.
WGV, a German insurance company, gathers vehicle speed and location information
and verifies whether the speed limit is adhered to. If the speed limit is exceeded
on a given route, the policy holder earns “negative” points that have an impact on
the risk premium.
Progressive Casualty Insurance (US) and AVIVA (Canada) use proprietary devices
which connect to the OBDII (On Board Diagnostics II) port of the vehicle. The device
collects trip start and end time, miles driven, duration of trip, number of sudden
starts and stops, and the time and date of each connection to and disconnection from
the OBDII port. The data is reviewed by the user on a computer and can be exchanged
with the insurer.
In Germany, the Swiss Re and DVB Winterthur insurance companies have a similar
device to exchange data with the insurer. Route information, behavior of the user
and kilometers travelled are inferred by using GPS.
Hollard Insurance provides PAYD insurance based on GPS, which records all the
data related to location and time and stores it on a server. The policy holder can
access the policy details using the internet.
Progressive Insurance Corp. (US) registered US Patent US5797134 to capture the
necessary data using GPS and transmit it over the GSM network. The data includes the
safety equipment used (seat belts, turning signals, ...). It also includes driving
behavior like rate of acceleration, rate of braking, and observation of traffic
signals and speed.
Norwich Union (UK), owner of European patent (EP) number 0700009, and
Uniqa Group (Austria) follow the architecture using GPS and GSM. However, the data
is limited to time of day, riskiness of the road and kilometers driven.
MAPFRE (Spain) uses an architecture based on GPS and GSM. The data includes
percentage of night hours, average speed, time of day, type of road driven, average
length of trips and kilometers driven.
STOK (Netherlands) uses an architecture based on GPS with either a passive or an
active way to transmit data to the server: passively by USB, Bluetooth or wirelessly,
or actively by using the GSM network. The statistics and trip logs are accessible by
insurance companies and the user (Troncoso et al. 2011). A summary of all existing
PAYD implementations is given in Table 1.
1.3 Existing Predictive Models
Annual risk can be calculated as the product of per-mile risk and annual mileage. It
was found that a relationship exists between reduction in VMT (vehicle miles
travelled) and reduction in risk. Mileage is not the only important risk factor.
However, it has a substantial impact on risk: it influences risk prediction in
combination with other factors rather than alone (Litman 2006).

Table 1 Existing PAYD implementations

| Company | Country | Methodology for data gathering | Methodology for transmission of data | Patent |
|---|---|---|---|---|
| WGV | Germany | GPS | User submission | |
| Progressive Casualty Insurance | US | Device in the vehicle | User submission using internet | |
| AVIVA | Canada | Device in the vehicle | User submission using internet | |
| Swiss Re | Germany | Device in the vehicle | User submission using internet | |
| DVB Winterthur | Germany | Device in the vehicle | User submission using internet | |
| Hollard | South Africa | GPS | GSM | |
| Progressive Insurance Corp | US | GPS | GSM | US5797134 |
| Norwich Union | UK | GPS | GSM | EP0700009 |
| Uniqa | Austria | GPS | GSM | |
| MAPFRE | Spain | GPS | GSM | |
| STOK | Netherlands | GPS | Passive or active | |

Poisson and linear models were used to predict insurance risk using annual
mileage. It was shown that mileage contributed explanatory power when used along
with other risk factors (Ferreira and Minikel 2012).
A premium cost model based on mileage, location, time and driving behavior was
built. The premium was based on a fixed cost plus an additional cost given by a
linear combination of the above mentioned risk factors and coefficients (Boquete et al. 2010).
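The structure of such a premium cost model, a fixed cost plus a linear combination of risk factors, can be sketched as follows. The factor names and coefficient values in the usage note are hypothetical and are not taken from Boquete et al. (2010); only the additive form comes from the text above.

```python
from typing import Mapping


def premium(fixed_cost: float,
            risk_factors: Mapping[str, float],
            coefficients: Mapping[str, float]) -> float:
    """Fixed cost plus a linear combination of risk factors and coefficients."""
    variable_cost = sum(coefficients[name] * value
                        for name, value in risk_factors.items())
    return fixed_cost + variable_cost
```

With an assumed fixed cost of 100, 1000 km at 0.02 per km and a night-driving share of 0.2 weighted at 50, the premium would be 100 + 20 + 10 = 130.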
Exact knowledge of risk factors may be insufficient, and a large combination of
these factors is needed for the prediction of risk. In this case a fuzzy-linguistic
approximation apparatus is suitable for projecting the evaluation of the ride, which
is used to calculate the insurance premium (Kantor and Stárek 2014).
The performance of three models, namely logistic regression, random forests and
artificial neural networks, was compared. Three months of data was sufficient to
obtain the best risk estimations (Baecke and Bocca 2017).
1.4 Design Overview
A GPS device with GSM (GPRS enabled) and the required control board will be fixed
into the vehicle as shown in Fig. 1. This device will be powered from the vehicle’s
battery. The GPS capabilities provided include speed, idle time, and latitude and
longitude of the vehicle. The device shall have its own battery as a failsafe
mechanism if the vehicle battery has been disconnected or drained. The device shall
be able to store the GPS sentences in case of non-availability of the GSM network and
push them to the server on a First In First Out (FIFO) basis. The device shall
provide OTA (Over the Air) firmware upgrades. Device configuration settings like the
server address and other details should be configurable by SMS. The device shall be
configured to send GPS sentences as web requests. The HTTP based web requests will be
received by a server with a public IP address. The device shall talk directly to the
web server, which shall aggregate the telematics data and insurance data. The web
server hosts the machine learning algorithm and generates the reports regarding the
predictive risk (Husnjak et al. 2015).
Fig. 1 GPS deployment architecture
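The store-and-forward requirement for the device (buffer GPS sentences while the GSM network is unavailable, then push them in FIFO order) can be sketched as below. This is an illustrative model in Python rather than device firmware; the class name, the `send` callback and the buffer capacity are assumptions, not details given in the chapter.

```python
from collections import deque
from typing import Callable, Deque


class StoreAndForwardBuffer:
    """Holds GPS sentences while the network is down and flushes them FIFO."""

    def __init__(self, send: Callable[[str], bool], capacity: int = 1000):
        # `send` is a hypothetical transport hook: it returns True when the
        # server accepted the sentence and False when the network is down.
        self._send = send
        # With maxlen set, the oldest sentence is dropped once the buffer is full.
        self._pending: Deque[str] = deque(maxlen=capacity)

    def submit(self, sentence: str) -> None:
        """Queue a new sentence, then try to deliver everything pending."""
        self._pending.append(sentence)
        self.flush()

    def flush(self) -> None:
        """Send the oldest sentences first; stop at the first failure."""
        while self._pending:
            if not self._send(self._pending[0]):
                break  # network unavailable: keep the sentence for later
            self._pending.popleft()

    def __len__(self) -> int:
        return len(self._pending)
```

A sentence is removed from the buffer only after the send succeeds, so an outage in the middle of a flush does not lose data or reorder it.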
The LRM is a special case of the generalized linear model and is similar to
linear regression (McCullagh and Nelder 1989). It is based on the machine learning
architecture shown in Fig. 2. It is used to compute the probability of a dichotomous
outcome, “Risk” or “No risk”, based on one or more independent variables, which are
called predictors or features. The features extracted from GPS data are average
speed and average driving time.
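For the two features named above, the model can be written in the standard binary logistic regression form. The symbols are notation assumed here for illustration and do not appear in the chapter: x_1 is average speed, x_2 is average driving time, and \beta_0, \beta_1, \beta_2 are the coefficients learned from training data.

```latex
P(\text{Risk} \mid x_1, x_2) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2)}},
\qquad
\ln\frac{P}{1 - P} = \beta_0 + \beta_1 x_1 + \beta_2 x_2
```

The second expression is the logit (log-odds), which is linear in the features.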
The machine learning architecture involves a model, input, output and classifier.
Any model needs to be learnt using the offline system before deployment. In this
case, mapped GPS and insurance data are used as the training dataset. Typically, it
includes risk, average speed and average driving time for each insurance policy
holder. The classifier is specified during online learning of the model and is also
called the classification cut-off. The model is scalable, in that new features like
mobile usage by the driver and road condition can be added. The model can be
calibrated using the classifier and updated using the online learning mechanism.
Fig. 2 Machine learning architecture (offline training sub-system: the training data
{(x^(1), y^(1)), ..., (x^(N), y^(N))}, where x^(i) = (x_0^(i), ..., x_d^(i)), is used
to train the model; online system: an input object encoded with features
x = (x_0, ..., x_d) is passed to the model classifier, which outputs the final
prediction y, the response/dependent variable)
Risk depends on many factors, namely terrain, usage of a mobile phone while
driving, vehicle age, maintenance cycle, speed, road condition, travel time without
a break and driving behavior. “Risk” or “No Risk” is the dependent variable, and
speed, road condition, travel time without a break and driving behavior are
independent variables. The LRM predicts a discrete outcome and does not assume a
linear relationship between the independent variables and the dependent variable
itself; linearity is assumed only between the predictors and the log-odds of the
outcome.
2 Methodology
Predictive analytics provides three learning techniques, namely supervised,
unsupervised and reinforcement learning. In supervised learning, predictions are made
based on a set of examples. It looks at different patterns and chooses the best
pattern to make predictions. In unsupervised learning, the data is grouped in
clusters so that complex data appears simpler or organized. Reinforcement learning
chooses an action in response to each data point. In the current context, supervised
learning suffices to identify the behavioral pattern. Supervised learning can be used
for classification, regression and anomaly detection. In the current scenario, the
classification methodology is chosen to perform segmentation by dividing the
customers into subsets with common behavior and homogeneity. Consequently,
implementing strategies and making decisions become easier. The primary objective is
to quantify the risk-adjusted premium based on customer segmentation.
Classification may be two-class or multi-class. Since we are interested in “Risk”
or “No Risk”, it is two-class classification. There are five primary factors which
impact the choice of algorithm for classification, namely accuracy, training time,
linearity, number of parameters and number of features. A model may aim for exact or
approximate results, and the required accuracy has an impact on the processing time.
Training time is related to accuracy and is key when time is limited. Linearity is an
assumption made in the model that data trends follow a straight line and classes
can be separated by a straight line. The error tolerance and number of iterations of
the model are affected by the number of parameters. A large number of features may
slow down the processing time of the model.
Fig. 3 Data sources (device data, claims data and policy data feed the model, which
outputs the risk-adjusted premium)
Some of the prominent algorithms for two-class classification are logistic
regression, decision forest, decision jungle, boosted decision tree, neural network,
averaged perceptron, support vector machine, locally deep support vector machine
and Bayes’ point machine. In the current scenario, logistic regression is chosen
since it is simple and fast. It exhibits excellent accuracy and fast training times
and relies on the linearity assumption (Fig. 3).
2.1 Infrastructure
A risk-adjusted premium policy requires a GPS unit with GSM network connectivity to
be installed in the vehicle. Insurance companies need to host predictive analytical
software which receives the GPS data and processes it for business insights. The
GPS unit needs to have configuration and firmware upgrade capability, a battery
failsafe mechanism, in-built memory for storing historical data, and the intelligence
to check GSM availability and send the aggregated data. The device may be
procured from vendors, or insurance companies may develop their own product. In
either case the cost implication of the device needs to be considered. In addition,
this equipment needs to be certified under the data transfer laws for telematics
devices. Likewise, insurance companies need to adhere to the legal requirement that
the insured's telematics data is used for insurance purposes only. So there is an
overhead cost with regard to compliance with data protection laws. The deployment
and operational cost of the policy and the ROI need to be studied further. This
section provides an overview of the business entities and operational
challenges.
2.2 Algorithm of Logistic Regression
Aggregated sensor data are transformed into probabilities using the logistic function.
The observed outcome is discrete, so it cannot be used directly to perform
regression or predict future usage. So a link function called the logit is used to
convert probabilities to continuous values (log-odds), and regression is performed on
these. Consequently, the predictive model is built. However, it may not function with
the desired accuracy. So online learning is performed using the stochastic gradient
descent technique to optimize the model and to achieve the desired accuracy (Table 2).
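The steps named above (logistic function, logit link, and stochastic gradient descent) can be sketched in a few lines of plain Python. This is a generic textbook formulation under stated assumptions, not the authors' implementation: the learning rate, the number of epochs, the random seed and the function names are all illustrative, and the features are assumed to be scaled to a comparable range before training.

```python
import math
import random
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Logistic function: maps a real-valued score to a probability."""
    # Branching on the sign keeps math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit(p: float) -> float:
    """Link function: maps a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def predict_proba(weights: Sequence[float], x: Sequence[float]) -> float:
    """Probability of "Risk" for features x; weights[0] is the intercept."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return sigmoid(z)


def sgd_fit(samples: Sequence[Tuple[Sequence[float], int]],
            learning_rate: float = 0.1,
            epochs: int = 200,
            seed: int = 0) -> List[float]:
    """Fit [intercept, w1, ..., wd] by updating on one sample at a time."""
    rng = random.Random(seed)
    weights = [0.0] * (len(samples[0][0]) + 1)
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i]
            error = predict_proba(weights, x) - y  # log-loss gradient w.r.t. score
            weights[0] -= learning_rate * error
            for j, xi in enumerate(x):
                weights[j + 1] -= learning_rate * error * xi
    return weights
```

Because each update uses a single sample, the same loop can continue to run as new labelled data arrives, which is one common way to realize online learning.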
2.2.1 Processing of Claim and Policy Data
After claim settlement, insurance companies perform reconciliation of the claim
data in order to arrive at the next premium amount to be paid by the policy holder.
In general, the premium is based on vehicle group or type, year of registration and
previous claim amount. However, this information is not used for predicting the risk
involved during the policy period.
Historical vehicle insurance claim data was obtained from an insurance company
for a fleet of 418 different vehicles used in the distribution network of the FMCG
industry. Claim data consists of policy holder age, vehicle category, year of
registration, average claim amount and count of claims. This data is maintained by
the insurance company at the initiation of the policy and also during claim
settlement. The average claim amount is an indicator of the risk for the
corresponding policy holder. Policies which have an average claim amount above a
threshold value set by the insurance company are categorized as “Risk” and others are
categorized as “No risk”. This customer segmentation shall be used as the training
dataset so as to improve the accuracy of the model (Table 3).
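The labelling rule described above, “Risk” when the average claim amount exceeds a threshold set by the insurer, can be sketched as follows. The record type, field names and the threshold used in the usage note are hypothetical; the chapter does not state the threshold value.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class PolicyRecord:
    vehicle_id: str
    average_claim_amount: float


def label_risk(records: Iterable[PolicyRecord],
               threshold: float) -> Dict[str, int]:
    """Label each vehicle 1 ("Risk") or 0 ("No risk") by average claim amount."""
    return {r.vehicle_id: int(r.average_claim_amount > threshold)
            for r in records}
```

For example, with an assumed threshold of 10000, a vehicle with an average claim amount of 12000 is labelled 1 and a vehicle with 3000 is labelled 0.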
2.2.2 Processing of Device Data
Insurance claim data is mapped with telematics data of the fleet to create the
training dataset. Telematics data consists of aggregated GPS data, namely average
speed and average driving time. The training data has one categorical dependent
variable called risk and two continuous independent predictors, namely average speed
and average driving time, for each vehicle in the fleet. This training data is used
to learn the predictive model by computing the coefficients (Table 4).
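The mapping of claim data to telematics data amounts to a join on the vehicle ID. The sketch below assumes both datasets are already keyed by vehicle ID and that vehicles missing from either source are left out of the training set; the chapter does not say how unmatched vehicles were handled, so that choice is an assumption.

```python
from typing import Dict, List, Tuple

Features = Tuple[float, float]  # (average speed, average driving time)


def build_training_set(risk_labels: Dict[str, int],
                       device_data: Dict[str, Features]
                       ) -> List[Tuple[Features, int]]:
    """Join claim-based risk labels with telematics features on vehicle ID."""
    rows = []
    # Only vehicles present in both sources can form a training example.
    for vehicle_id in sorted(risk_labels.keys() & device_data.keys()):
        rows.append((device_data[vehicle_id], risk_labels[vehicle_id]))
    return rows
```

Each row pairs the two predictors with the dependent variable for one vehicle, which is the shape of training data the logistic regression model expects.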
Table 2 Algorithm steps

| Step | Procedure | Result |
|---|---|---|
| Processing of device data | Aggregating the data | Pre-requisite to predict the risk profile of customer |
| Processing of claim and policy data | Mapping with historical device data | Training data preparation based on risk profiles |
| Online learning | Training the model and configuration of learning rate | Accuracy improvement |
| Calibration | Configuration of classification cut-off | Model accuracy optimization based on external factors like road condition, terrain and route |
| Prediction | Segmentation of customers based on the risk profiles and mileage | Risk-adjusted premium is calculated for each customer |

Table 3 Claim and policy data

| Vehicle ID | Policy holder age | Vehicle group | Vehicle year of registration | Average claim amount | Claim count |
|---|---|---|---|---|---|

Table 4 Device data

| Vehicle ID | Average speed | Average driving time | Mileage | Date and time stamp |
|---|---|---|---|---|

3 Results
A bubble chart was plotted for a subset of vehicles, showing risk, which is linearly
proportional to average speed and average driving time. A bigger bubble indicates
risk and a smaller bubble indicates no risk (Fig. 4).
It is observed that segmentation of a customer as “Risk” or “No risk” depends
on GPS device data, insurance claim and policy data, and model classifier data. GPS
data is the actual data for which the prediction is computed. Insurance data is
used for offline and online learning of the model. Model classifier data is the
classification cut-off threshold used in the calibration of the model. It depends
on other factors like road condition, terrain and atmospheric condition, and is used
to fit the model.
The learned LRM was used to predict risk for a given GPS dataset of one
month. Consequently, reconciliation was performed with insurance claim data of
2 years to compute the segmentation. It was observed that the accuracy was 51%.
Fig. 5 shows the predicted outcomes p = 1 (Risk) and p = 0 (No Risk) against the
features (Fig. 6).
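The reconciliation of predictions against claim-based labels can be sketched as a count of agreements and disagreements at a given classification cut-off. The cut-off default of 0.5 matches the value mentioned later in the chapter; the function name, the category names and the convention that a probability equal to the cut-off counts as “Risk” are assumptions.

```python
from typing import Dict, Sequence


def reconcile(probabilities: Sequence[float],
              claim_labels: Sequence[int],
              cutoff: float = 0.5) -> Dict[str, int]:
    """Count how predicted segments agree with claim-based labels."""
    counts = {"true_risk": 0, "false_risk": 0,
              "missed_risk": 0, "true_no_risk": 0}
    for p, actual in zip(probabilities, claim_labels):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1 and actual == 1:
            counts["true_risk"] += 1
        elif predicted == 1 and actual == 0:
            counts["false_risk"] += 1
        elif predicted == 0 and actual == 1:
            counts["missed_risk"] += 1
        else:
            counts["true_no_risk"] += 1
    return counts
```

Accuracy is then the sum of `true_risk` and `true_no_risk` divided by the number of vehicles; the separate counts show whether errors come mainly from missed risk or from false alarms.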
Fig. 4 Distribution of risk (bubble chart; x-axis: travel time without break, y-axis: average speed)
Fig. 5 Relationship diagram (risk in relation to GPS data, insurance data and model classifier data)
3.1 Analysis of Results
In this chapter, telematics and insurance data of 418 vehicles were analyzed.
Consequently, a predictive model with 51% prediction accuracy was developed
using the binary logistic regression technique. This model has the key advantage of
calibration on the fly as per business needs. Any degradation of the model can be
circumvented by updating the model with online learning and tuning the
classification cut-off. The cut-off value can be configured based on the aging of the
vehicle, engine run hours, road condition and terrain, driving behavior and other
factors. It also helps the researcher to study the influence of one independent
variable, like driver behavior, on the outcome risk while keeping other predictors
constant. However, the predictive model drifts as the dataset size increases from
moderate to large and the number of features increases. Further comparative study is
needed on predicting risk in usage based auto insurance using SVM, decision trees
and other models. A benchmark would help insurance companies to choose the
appropriate tool.
Fig. 6 Customer segmentation of risk and no risk (plot of Prob(y=1) against x)
An attempt has been made to create a synergy or fusion of insurance claim data
and telematics data to predict risk. As of today, insurance companies do not
have business insight regarding potential insurance claims. This information is
vital for the business planning of insurance companies and also for alerting the
policy holder regarding imminent risk, so that the policy holder can take the
necessary precautions for safety. The following observations were made during
execution of the model:
1. On comparing the accuracy of the learned model (70%) with that of the executed
model (51%), there was a dip of 19 percentage points due to other factors like road
condition, terrain, vehicle maintenance cycle and mobile usage during driving,
which were not accounted for in the classification cut-off value of 0.5.
2. It is recommended to calibrate the model by performing online learning and
revisiting the classification cut-off based on the above mentioned factors for the
next execution in order to improve the accuracy.
3. An additional feature, namely jerk energy, can be included in the GPS dataset by
using an accelerometer to measure the driving behavior and improve the accuracy
of the model.
4. Prediction of risk can be planned on a fortnightly basis instead of a monthly
basis in order to improve the accuracy of the model.
5. Segregating insurance claim data as “Risk” or “No risk” based on average claim
amount alone may skew the training data. It is recommended to include
an additional feature like vehicle year of registration or policy holder age in the
training dataset.
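The recommendation to revisit the classification cut-off can be illustrated with a simple sweep over candidate values, keeping the one with the highest accuracy against known labels. This is a minimal sketch: the candidate grid and the function name are assumptions, and the chapter does not describe how the cut-off was actually tuned.

```python
from typing import Sequence, Tuple


def best_cutoff(probabilities: Sequence[float],
                labels: Sequence[int],
                candidates: Sequence[float] = (0.3, 0.4, 0.5, 0.6, 0.7)
                ) -> Tuple[float, float]:
    """Return (cut-off, accuracy) for the candidate with the highest accuracy."""
    best = (candidates[0], -1.0)
    for cutoff in candidates:
        correct = sum(1 for p, y in zip(probabilities, labels)
                      if (p >= cutoff) == bool(y))
        acc = correct / len(labels)
        if acc > best[1]:  # strict comparison keeps the first of tied cut-offs
            best = (cutoff, acc)
    return best
```

A cut-off chosen this way should be checked on data that was not used for the sweep, since tuning and evaluating on the same records overstates the accuracy.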
3.2 Application
Due to privacy concerns, personal vehicle owners are reluctant to embrace UBI.
However, owners of commercial vehicles, distribution networks and public transport
networks are eager to leverage the UBI benefits (Introducing Pay How you Drive
Insurance, 2016). We can list the following benefits:
1. Insurers can offer lower premiums to non-risk drivers to improve volume,
which is profitable (Lovick 2011)
2. Improve pricing accuracy based on risk profiles (Introducing Pay How you
Drive Insurance, 2016)
3. Enhance efficiency and effectiveness of claims processing by using telematics
data as evidence and automating the process (Lovick 2011)
4. Prevention of fraudulent claims and underwriting, and stronger customer
engagement (Introducing Pay How you Drive Insurance, 2016)
5. Stronger customer engagement and retention of profitable accounts (Digital
Insurance Telematics Solution, 2017)
6. Reduce claim costs
7. Differentiate the brand and de-commoditize (Introducing Pay How you Drive
Insurance, 2016)
8. Initiate new revenue generating, personalized and customized value added
services like geo-fencing, tracking, automated maintenance, stolen vehicle
recovery, route planning and reduction of fleet costs (Introducing Pay How you Drive
Insurance, 2016)
9. New entrepreneurs serving customers through smart phones or online touch
points (Top 10 Trends in Insurance in 2016, 2016)
4 Conclusion
Traditional segmentation for pricing auto insurance based on a descriptive model
shall be rendered obsolete by this new predictive model. By using market segmentation,
insurance companies can encourage low-mileage drivers with attractive rates and
appropriately price unprofitable high-mileage drivers. Risk-adjusted premiums
shall improve pricing accuracy, enhance safety, and reduce claims and fraud. By
market segmentation, insurers can focus and strategize policy changes and improve
their competitiveness, while insured customers obtain value addition in terms of
financial incentives according to usage.
Customers are concerned about privacy since the vehicle is tracked by GPS. So a
privacy friendly methodology is adopted, wherein only aggregated data is sent to the
insurer instead of instantaneous location information. Lack of transparency with
regard to the calculation of the risk-adjusted premium has led to mediocre
customer acceptance of UBI. This model is simpler and can be understood by any
common insured person.
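The privacy friendly approach of sending only aggregated data can be sketched as a trip summary computed from timestamped speed samples, with no coordinates retained. The sample format, the field names and the use of the trapezoidal rule to estimate distance are assumptions made for illustration; the chapter does not specify how the device aggregates its readings.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TripSummary:
    average_speed_kmh: float
    driving_time_h: float
    distance_km: float


def summarize_trip(samples: List[Tuple[float, float]]) -> TripSummary:
    """Aggregate (timestamp in seconds, speed in km/h) samples for one trip.

    Latitude and longitude are never part of the input or the output, so
    only aggregated values would leave the device.
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    driving_time_h = (samples[-1][0] - samples[0][0]) / 3600.0
    if driving_time_h <= 0:
        raise ValueError("timestamps must span a positive duration")
    distance_km = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        hours = (t1 - t0) / 3600.0
        distance_km += 0.5 * (v0 + v1) * hours  # trapezoidal rule
    return TripSummary(distance_km / driving_time_h, driving_time_h, distance_km)
```

A summary of this kind carries the two features used by the model (average speed and driving time) without revealing where the vehicle has been.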
Customer segmentation improves premium pricing accuracy since it is risk-adjusted.
It enables insurers to reduce the premium for profitable customers with low
risk and increase the premium for non-profitable customers with high risk. It also
enables customized services like contextual driving tips based on driving behavior,
which in turn reduce claims (Digital Insurance Telematics Solution, 2017). It also
enhances customer engagement by customizing products based on the segment. With the
geospatial location and behavioral data, it automates first notice of loss (FNOL)
and first report of injury (FROI) reporting (The Telematics Advantage: Growth,
Retention and Transformational Improvement with Usage-Based Insurance, 2012),
which in turn reduces latency in investigation and settlement and the downtime of
the vehicle. By supplementing claim data with crash data, fraud can be decreased. To
summarize, UBI is the key to improved customer experience and retention, efficiency
in claims settlement through automation of manual processes, improved product
pricing, and reduced losses due to claims and fraud. Insurance companies who embark
on the journey of UBI shall increase their market share and remain competitive in a
highly commoditized market.
Acknowledgments Authors are thankful to VTU Extension Centre, UTL Technologies Ltd
for providing the much needed infrastructure to conduct our research. We are also thankful to
Dr. B.S. Nagabhushana, Professor, Department of Electronics and Communication Engineering,
B.M.S College of Engineering, Bengaluru, India for his guidance.
References
Baecke, P., & Bocca, L. (2017). The value of vehicle telematics data in insurance risk selection
processes. Decision Support Systems, 98, 69–79.
Boquete, L., Rodríguez-Ascariz, J. M., Barea, R., Cantos, J., Miguel-Jiménez, J. M., & Ortega,
S. (2010). Data acquisition, analysis and transmission platform for pay-as-you-drive system.
Sensors, 10(6), 5395–5408.
Digital Insurance Telematics Solution. (2017). Tata Consultancy Services Limited.
Ferreira, J., & Minikel, E. (2012). Measuring per mile risk for pay-as-you-drive automobile
insurance. Transportation Research Record: Journal of the Transportation Research Board,
2297, 97–103.
Husnjak, S., Peraković, D., Forenbacher, I., & Mumdziev, M. (2015). Telematics system in
usage based motor insurance. Procedia Engineering, 100, 816–825.
Introducing Pay How you Drive Insurance. (2016). Ernst & Young Global Limited.
Kantor, S., & Stárek, T. (2014). Design of algorithms for payment telematics systems evaluating
driver’s driving style. Transactions on Transport Sciences, 7(1), 9–16.
Litman, T. (2006). Distance-based vehicle insurance as a TDM strategy. Victoria: Victoria
Transport Policy Institute.
Lovick, T. (2011). Insurance telematics understanding risk with technology.
https://www.actuaries.org.uk/documents/b02-insurance-telematics-understanding-risk-technology
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (Vol. 37). Florida: CRC Press.
The Telematics Advantage: Growth, Retention and Transformational Improvement with Usage-Based
Insurance. (2012). Cognizant.
Top 10 Trends in Insurance in 2016. (2016). Capgemini.
Troncoso, C., Danezis, G., Kosta, E., & Preneel, B. (2011). PriPAYD: Privacy-friendly pay-as-you-
drive insurance. IEEE Transactions on Dependable and Secure Computing, 8(5), 742–755.
Tselentis, D. I., Yannis, G., & Vlahogianni, E. I. (2016). Innovative insurance schemes: Pay
as/how you drive. Transportation Research Procedia, 14, 362–371.
Arun Kumar Nageswar is a student at Visvesvaraya Technological University,
Belagavi, India, pursuing a Master of Technology in the field of digital electronics.
He has a bachelor of engineering in electronics from B.M.S. College of Engineering,
Bengaluru, India. Business analytics and machine learning are his main areas of
interest. Arun has presented research papers in IEEE seminars and has published
research papers in the IEEE Xplore Digital Library.
Siva Yellampalli is currently working as a principal with VTU Extension Centre,
UTL Technologies Ltd. He obtained his M.S. and Ph.D. from Louisiana State University.
His research focuses on system level design for power optimization and encompasses
fields such as VLSI, mixed signal circuits/systems development, MEMS and CNT
sensors. He has published a book in the area of mixed-signal design and edited
two books on carbon nanotubes. He has published more than 70 international journal
papers and IEEE conference papers. In addition, he has delivered keynote speeches at
international conferences held in Canada, Dubai and Spain, as well as tutorials at
various IEEE International Conferences.
