Feature Articles

Auto Insurance Pricing Using Telematics Data: Application of a Hidden Markov Model

Published online: 02 Feb 2024
 

Abstract

This study develops a hidden Markov model (HMM)-based clustering framework to predict auto insurance losses using driving characteristics extracted from telematics data. Through a simulation experiment based on a proprietary telematics dataset, we show that the HMM can effectively classify driving trips using model-implied hidden states, and that HMM-based pricing methods provide better predictive power as measured by deviance statistics. Importantly, the proposed framework not only enables us to price usage-based insurance at a granular level but is also viable for estimating long-term insurance losses by utilizing the limiting properties of the HMM.
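The limiting properties mentioned in the abstract concern the long-run behavior of the hidden-state Markov chain. The following is a minimal sketch, not the authors' implementation: the three-state transition matrix `P` and the per-state claim frequencies `state_freq` are hypothetical values chosen only for illustration. It approximates the stationary distribution by repeatedly applying the transition matrix and then forms a long-run expected claim frequency as a weighted average.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution pi (pi = pi P) by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical transition matrix over three hidden driving states (rows sum to 1).
P = [
    [0.90, 0.08, 0.02],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]
# Hypothetical expected claim frequency associated with each hidden state.
state_freq = [0.02, 0.05, 0.12]

pi = stationary_distribution(P)
# Long-run expected claim frequency: state frequencies weighted by time spent in each state.
long_run_freq = sum(p * f for p, f in zip(pi, state_freq))
```

Power iteration is used here only because it is short and dependency-free; solving the linear system for the stationary distribution directly would serve equally well.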

ACKNOWLEDGMENTS

The authors thank two anonymous referees for their valuable suggestions that improved the article.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Notes

1 An observation is defined as a per-second record of the driving behavior variables.

2 In the United States, residential areas generally have speed limits ranging from 25 to 40 mph, whereas interstate roads have speed limits ranging from 50 to 80 mph. See https://www.uproad.com/blog/speed-limits-in-the-usa

5 There are multiple ways to select the optimal number of clusters. For example, the elbow method is a popular one, in which the explained variation is plotted as a function of the number of clusters, and the elbow of the curve is picked as the optimal number of clusters (e.g., Ketchen and Shook 1996; Bholowalia and Kumar 2014).
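The elbow method described in this note can be sketched as follows. This is an illustrative example under stated assumptions, not the procedure used in the article: the data are synthetic two-dimensional points drawn around three hypothetical centers, clustering uses a plain k-means with a deterministic farthest-first initialization, and the elbow is located with a simple heuristic (the largest drop in marginal gain of explained variation).

```python
import numpy as np

def farthest_first_init(X, k):
    """Deterministic initialization: start at X[0], then add the farthest point each time."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    return np.array(centers)

def kmeans_wcss(X, k, n_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(d.min(axis=1).sum())

# Synthetic data (assumption): three well-separated groups of 50 points each.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.standard_normal((50, 2)) for c in blob_centers])

ks = list(range(1, 7))
tss = float(((X - X.mean(axis=0)) ** 2).sum())  # total sum of squares
explained = np.array([1.0 - kmeans_wcss(X, k) / tss for k in ks])

# Heuristic elbow: the k after which the marginal gain falls the most.
gains = np.diff(explained)        # gain from k to k + 1
drop = gains[:-1] - gains[1:]     # how sharply the gain falls at each k
elbow_k = ks[int(np.argmax(drop)) + 1]
```

In practice the curve of `explained` against `ks` is usually inspected visually; the numerical heuristic above is only one of several ways to pick the elbow.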

6 Though the variables DB_j and cluster represent the information of the ith unit and depend on i, we omit the subscript for the unit on the right side of the equation for convenience. The same applies to Equations (6)–(8).

7 Traditional rating factors may be flexibly added to the Poisson GLMs when both telematic and nontelematic effects are considered. In this case, Equation (5) can be extended to

$$\ln(\lambda_i) = \beta_0 + \boldsymbol{\beta}_1 \mathbf{DB} + \boldsymbol{\beta}_2 \mathbf{TF} + \theta\,\mathrm{cluster},$$

where DB and TF represent the vectors of driving behavior and traditional rating factors for the ith unit, respectively.
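A Poisson GLM of this form can be fitted by Newton's method on the log-likelihood. The sketch below is a hedged illustration on simulated data and is not the article's code or data: it assumes a single driving behavior variable `db`, a single traditional rating factor `tf`, a binary `cluster` indicator, and hypothetical "true" coefficients. It also compares the Poisson deviance with and without the cluster term, since deviance is the fit measure named in the abstract.

```python
import numpy as np

def fit_poisson_glm(X, y, n_iter=50, tol=1e-10):
    """Fit ln(lambda) = X @ beta by Newton's method (equivalent to IRLS for the log link)."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start from the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu)              # score vector
        hess = X.T @ (X * mu[:, None])     # Fisher information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu)), with 0 * ln(0) taken as 0."""
    safe_y = np.where(y > 0, y, 1.0)
    term = np.where(y > 0, y * np.log(safe_y / mu), 0.0)
    return float(2.0 * np.sum(term - (y - mu)))

# Simulated data (all values are assumptions for illustration only).
rng = np.random.default_rng(1)
n = 5000
db = rng.standard_normal(n)                      # hypothetical driving behavior variable
tf = rng.standard_normal(n)                      # hypothetical traditional rating factor
cluster = rng.binomial(1, 0.4, n).astype(float)  # hypothetical binary cluster indicator
X_full = np.column_stack([np.ones(n), db, tf, cluster])
true_beta = np.array([-1.0, 0.3, 0.2, 0.5])      # beta_0, beta_1, beta_2, theta
y = rng.poisson(np.exp(X_full @ true_beta)).astype(float)

beta_full = fit_poisson_glm(X_full, y)
beta_base = fit_poisson_glm(X_full[:, :3], y)    # same model without the cluster term

dev_full = poisson_deviance(y, np.exp(X_full @ beta_full))
dev_base = poisson_deviance(y, np.exp(X_full[:, :3] @ beta_base))
```

Because the two models are nested, the in-sample deviance of the fuller model cannot exceed that of the base model; out-of-sample deviance would be the more informative comparison for pricing.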

8 Although both clusters and hidden states are information extracted from the same dataset, we find very weak multicollinearity between these two variables. The variance inflation factor between these two variables is 1.79, which indicates a weak correlation (a variance inflation factor between 1 and 5 implies a weak association; James et al. 2021). We also computed Cramér’s V (Cramér 1946); its value between the two variables is 0.142, which again implies a weak association.
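For reference, Cramér's V for two categorical variables is computed from the chi-squared statistic of their contingency table as V = sqrt(chi² / (n (min(r, c) − 1))), where r and c are the numbers of rows and columns. The sketch below illustrates only this statistic (not the variance inflation factor), and the label arrays `clusters` and `states` are hypothetical, not the article's data.

```python
import numpy as np

def contingency_table(a, b):
    """Cross-tabulate two arrays of categorical labels."""
    a_levels, a_idx = np.unique(a, return_inverse=True)
    b_levels, b_idx = np.unique(b, return_inverse=True)
    table = np.zeros((len(a_levels), len(b_levels)))
    np.add.at(table, (a_idx, b_idx), 1.0)
    return table

def cramers_v(table):
    """Cramér's V from a contingency table (no bias correction)."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    k = min(table.shape) - 1
    return float(np.sqrt(chi2 / (n * k)))

# Hypothetical labels: a cluster label and a hidden-state label per trip.
clusters = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 0, 1, 2])
states = np.array([0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0])
v_example = cramers_v(contingency_table(clusters, states))

# Two reference cases: independence gives 0, perfect association gives 1.
v_independent = cramers_v([[10, 20], [20, 40]])
v_perfect = cramers_v([[30, 0], [0, 30]])
```

Values near 0 indicate little association and values near 1 indicate strong association, which is the scale on which the reported 0.142 is read.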

Additional information

Funding

The authors are grateful for financial support from the Casualty Actuarial Society and the Committee on Knowledge Extension Research (CKER) of the Society of Actuaries Research Institute.
