Machine learning (ML) is all the rage, often described as the new frontier in predictive analytics. While its history dates back to the 1950s with pioneering research on simple algorithms, more recent developments in the 1990s paved the way for the new wave of applications in artificial intelligence. Most agree this technology will have a notable impact on industries and jobs of the future. From a predictive modeling perspective, incorporating new data into evolving models allows the predictive power to continually improve.
Insurance presents interesting and nuanced use cases when applying predictive models to assessing risk and pricing policies. Heavy regulation brings challenges, including the need to explain precisely how risk selection and pricing decisions are made. Also, consumers still want trusted advisors (agents, direct insurers, etc.) to help them navigate the right insurance coverage for their needs. That means continued human interaction in explaining coverage and pricing details. Particularly with the larger, more complex risks in commercial insurance, it's clear that insurance analytics must go well beyond a 'one-size-fits-all' modeling technique.
Veteran data scientists in the insurance industry use a variety of techniques to develop predictive variables and final model output. They understand that insurers are required to go beyond a simple understanding of model scores; regulators often require an explanation for the decisions and recommendations made by the model, as well as a deeper understanding of why it makes the recommendations it does. Underwriters and claims adjusters also need the same context and transparency to have confidence in incorporating model recommendations into their decision making.
As such, it’s imperative that the model doesn’t become a black box.
Since machine learning automatically changes the output as the model consumes more data, underwriters or claims adjusters can be left in a position where an explanation is impossible. The same policy may be treated differently in May than it was in February, and without a satisfactory explanation for the change, the insurer risks drawing the ire of regulators. In other words, the biggest benefit of ML is also its biggest problem.
Using multiple techniques is also important for variable selection. Different methods surface different signals in the data, which become key variables for a model. There are multiple forms of both univariate and multivariate modeling techniques to choose from, including tree-based methods, stepwise regression, and Lorenz curves. Stepwise regression seeks variables with linear relationships to the target, whereas tree-based methods capture both linear and nonlinear relationships.
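The distinction can be seen in a minimal sketch. Below, synthetic data (not any insurer's actual rating variables) includes one variable with a purely linear effect and one whose effect is nonlinear; a greedy forward stepwise search on a linear model reliably finds the linear variable, while a tree-based method also surfaces the nonlinear one.

```python
# Illustrative sketch on synthetic data: which variables does a forward
# stepwise linear search pick, versus a tree-based importance ranking?
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 4))  # x0..x3: hypothetical rating variables
# Target: x0 has a linear effect; x1 acts only through a nonlinear
# (squared) term, so it is nearly invisible to a linear correlation.
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=n)

# Greedy forward stepwise selection on a linear model (by R^2 gain).
selected, remaining = [], list(range(X.shape[1]))
for _ in range(2):  # pick the two best variables
    scores = {
        j: LinearRegression()
        .fit(X[:, selected + [j]], y)
        .score(X[:, selected + [j]], y)
        for j in remaining
    }
    best = max(scores, key=scores.get)
    selected.append(best)
    remaining.remove(best)

# Tree-based importances pick up the nonlinear x1 effect as well.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
importances = forest.feature_importances_

print("stepwise picks first:", selected[0])   # the linear variable x0
print("tree importances:", importances.round(2))
```

The stepwise search ranks the linear variable first, while the forest assigns high importance to both x0 and x1 — exactly the kind of disagreement that makes running multiple techniques worthwhile.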
The best way to approach analytics is to use multiple modeling techniques that support an insurer’s unique goals and position in the market.
Getting the right mix of modeling techniques
Linear Models or Generalized Linear Models (GLM) can be thought of as global models. That is, a single prediction equation is estimated and applied to the entire data set. While these techniques are the gold standard in insurance, and will likely continue to be, other nonlinear methods can be useful for understanding and predicting a target value of interest. Specifically, these methods can help when a data set contains heterogeneous groups that should be treated differently, such as splitting a data set by premium size or geographic region. Tree-based methods are often used in these cases to deal with inherent differences in the data set. By partitioning the data space, we can fit highly accurate models to these subpopulations (called leaves in tree-based modeling), which can often exceed the predictive power of a single global model. Beyond making accurate predictions, the very structure of the tree can be informative for the business as well. One can also feed these machine-learned insights back into other statistical techniques such as GLM.
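A minimal sketch of the partition-then-fit idea, on synthetic data with a hypothetical "segment" variable (the names are illustrative, not a real rating plan): when two segments have opposite relationships to an exposure variable, one global linear model averages them away, while a shallow tree split followed by per-leaf linear fits recovers both.

```python
# Sketch: heterogeneous subpopulations defeat a global linear model,
# but a shallow tree partition plus per-leaf fits captures them.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 3000
segment = rng.integers(0, 2, size=n)      # e.g. small vs large premium
exposure = rng.normal(size=n)
# The slope on exposure differs by segment (opposite signs).
y = np.where(segment == 0, 2.0 + exposure, -2.0 - exposure)
y = y + rng.normal(scale=0.3, size=n)
X = np.column_stack([segment, exposure])

# One global linear model over the whole data set.
global_r2 = LinearRegression().fit(X, y).score(X, y)

# A depth-1 tree partitions the data; fit a linear model per leaf.
tree = DecisionTreeRegressor(max_depth=1).fit(X, y)
leaves = tree.apply(X)
preds = np.empty(n)
for leaf in np.unique(leaves):
    mask = leaves == leaf
    preds[mask] = LinearRegression().fit(X[mask], y[mask]).predict(X[mask])
per_leaf_r2 = 1 - np.sum((y - preds) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"global R^2: {global_r2:.2f}, per-leaf R^2: {per_leaf_r2:.2f}")
```

The per-leaf fit scores substantially higher because the global model's single exposure coefficient averages two opposite effects toward zero. In practice the per-leaf model would typically be a GLM with an appropriate loss distribution rather than ordinary least squares.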
Keeping the human connection
A recent Wall Street Journal article pointed out that AIG, which has invested heavily in predictive modeling, still proudly uses underwriters to make judgment calls based on data. Valen's data supports this as well, showing that human judgment and analytics need each other to deliver the best results.
The graph above illustrates the lift of a predictive model. The greater the lift, the more effective an insurer's predictive model is. The blue line represents the loss ratio improvement based on a combination of the underwriter and the model when making decisions on pricing policies. There is a more significant lift here when compared to both the underwriter alone (green) and the predictive model alone (red).
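Lift of this kind is commonly measured by scoring policies, sorting them into deciles, and comparing loss ratios across buckets. A minimal sketch on synthetic data (a hypothetical scoring setup, not Valen's actual methodology):

```python
# Sketch: measuring model lift by comparing loss ratios across
# score-sorted deciles of a synthetic book of business.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
true_risk = rng.gamma(shape=2.0, scale=0.5, size=n)   # latent riskiness
premium = np.full(n, 1000.0)
losses = premium * true_risk * rng.uniform(0.5, 1.5, size=n)
score = true_risk + rng.normal(scale=0.3, size=n)     # imperfect model score

order = np.argsort(score)                  # best-scoring policies first
deciles = np.array_split(order, 10)
loss_ratios = [losses[i].sum() / premium[i].sum() for i in deciles]
overall = losses.sum() / premium.sum()
lift = loss_ratios[-1] / loss_ratios[0]    # worst decile vs best decile

print("decile loss ratios:", np.round(loss_ratios, 2))
print(f"overall loss ratio: {overall:.2f}, top-to-bottom lift: {lift:.1f}x")
```

A model with real predictive power produces loss ratios that climb steadily across the deciles; a flat curve means the scores carry no signal.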
Organizational culture is an important consideration in any analytics implementation. Many underwriters are still wary of technology taking over their jobs. GLM provides stable answers and a high degree of transparency into the final model output. It arms underwriters with the appropriate context and positions them as overseers of the model, with the ability to make adjustments as needed.
Successful analytics initiatives require that the technology be ingrained in the strategic principles of the business, and a predictive analytics approach must converge with the overall corporate strategy. That can't happen without executives who are confidently able to make important decisions and incremental improvements based on the insights from predictive modeling. There are many modeling techniques available, and each has its advantages and disadvantages when incorporated into a complete analytics strategy. The key is finding a way for them to work together to support the proper use case.
Dax Craig is CEO of Valen Analytics. To find out more, message Valen Analytics® at email@example.com.