Editor’s Note: The following article is a collective contribution of Kevin M. Bingham, Frank Zizzamia, Jim Guszcza and Kirsten Hernan, all of whom work at Deloitte.

John Wanamaker, the department store pioneer, is often credited with saying, “Half of my advertising budget is wasted; I just don’t know which half.”

Wanamaker might sympathize with insurance claims adjusters, who face even more pressing uncertainties. In the workers’ compensation domain, for example, it is commonly observed that roughly 20 percent of the claims ultimately account for 80 percent of the total claims costs. This is true of both lost-time and medical-only claims.

So Which 20 Percent?

Traditionally, it has been virtually impossible for the adjuster to know which claims make up the 20 percent early in the claim life cycle. But this has changed, thanks to advances in claims predictive modeling methods. Leading claim predictive models can serve as early warning systems, flagging at an early stage those claims most likely to grow large and complex. This enables an adjuster to prioritize his or her workflow and make effective use of special investigative unit (SIU) referrals, medical management, and return-to-work programs. A key advance has been a more effective use of injury groups in the claims predictive model design.

Claims predictive modeling in workers’ compensation has been around for many years, with first-generation modeling efforts typically focused on identifying, as early as possible in the life cycle of the claim, the 20 percent of claims that drive 80 percent of costs. Ideally, the insurer can do this at first notice of loss (FNOL). However, a major limitation of typical first-generation claim predictive models is that they tend to pool injuries together and segment claims only along fairly obvious lines such as injury type. For example, a sprain of a worker’s lower back is on average more severe than a contusion of a worker’s arm. While this is a valid approach, much more refined (and economically beneficial) solutions are possible.

It turns out that the 80/20 rule also applies within each injury type. For example, 20 percent of the workers who sprain their lower backs will typically drive 80 percent of the total claim costs for all sprains of lower backs. Herein lies the opportunity for more effective workers’ compensation predictive modeling.
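The within-injury-type 80/20 pattern can be checked directly on a claims file. The following is a minimal sketch, assuming each claim is reduced to an (injury group, ultimate cost) pair; the function names and the dollar figures are made up for illustration and do not come from the article's data.

```python
from collections import defaultdict

def top_share(costs, top_fraction=0.2):
    """Share of total cost carried by the costliest `top_fraction` of claims."""
    if not costs:
        raise ValueError("at least one claim is required")
    ordered = sorted(costs, reverse=True)
    n_top = max(1, round(len(ordered) * top_fraction))
    total = sum(ordered)
    return sum(ordered[:n_top]) / total if total else 0.0

def top_share_by_group(claims, top_fraction=0.2):
    """Apply `top_share` separately within each injury group."""
    by_group = defaultdict(list)
    for group, cost in claims:
        by_group[group].append(cost)
    return {g: top_share(c, top_fraction) for g, c in by_group.items()}

# Illustrative (invented) claims: (injury group, ultimate cost in dollars)
claims = [
    ("lower back sprain", 500), ("lower back sprain", 700),
    ("lower back sprain", 900), ("lower back sprain", 1_900),
    ("lower back sprain", 16_000),
    ("arm contusion", 100), ("arm contusion", 150),
    ("arm contusion", 150), ("arm contusion", 200),
    ("arm contusion", 2_400),
]
shares = top_share_by_group(claims)
```

In this toy data the costliest fifth of claims carries 80 percent of the cost in both groups, even though the groups differ greatly in average severity.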

Next, we will describe how International Classification of Diseases, Ninth Revision (ICD-9) codes can be classified into workers’ compensation-specific groups. This grouping – used in today’s more advanced predictive model designs – can lay the foundation for identifying the most severe claims within each injury grouping[i].

The Claims Lifecycle Before Leveraging Advanced Analytics

Over the life of a claim, the information available at a specific point in time has typically determined how claim professionals assess the complexity of workers’ compensation claims. At FNOL this information is quite limited, which is why more advanced workers’ compensation claim predictive models incorporate external data sources and synthetically created variables. For lost-time claims and more severe medical-only claims, the three-point contact with the injured worker, employer, and physician provides additional information that further enhances the claim professional’s understanding of claim complexity.

Even at this stage in the claim’s life cycle, it is difficult to distinguish the severity of one back sprain from another, except through a few differentiating claimant characteristics such as age, tenure, and prior claim history, and through specific details about the circumstances of the accident (for example, whether the injury occurred in the office, while using heavy machinery, or in a fall from a height). More often than not, it is still difficult for the claim adjuster to identify the high-severity back sprains without the use of external data and synthetic variables.

Over time, additional information such as the receipt of medical bills, pharmaceutical information, and the claimant’s medical history begins to paint a much fuller picture of the claim’s true complexity. However, the opportunity to favorably impact the claim through early intervention, SIU involvement, and experienced resource triage may already be lost if it takes nine to 12 months for the adjuster to correctly identify the true severity and complexity of a claim. 

This no longer needs to be the case, however. Employing workers’ compensation claim predictive models that build upon external data sources, text mining capabilities, and insightfully specified synthetic predictive variables changes the game.  

The Evolution of Claims Analytics

It is helpful to view the evolution of workers’ compensation claims predictive models by analogy with the evolution of workers’ compensation underwriting models. Early efforts to segment policies on the basis of expected profitability were heavily dominated by business class code, with other predictive dimensions making second-order contributions. This resulted in models that for the most part told the underwriter what he or she already knew: for example, that roofers tend to be less profitable than florists. While it is not a bad thing to refine and further quantify the underwriter’s prior knowledge, the real business value (and the real intent of predictive modeling in this domain) has been identifying the pockets of, for example, profitable roofer risks and, yes, unprofitable florist risks.

In short, these models enable the underwriter to evolve beyond primarily class-based underwriting to underwriting based on a refined, multidimensional view of each risk. This approach to insurance underwriting has been a major success story over the past dozen years, and today many major US commercial insurers have cultivated their ability to develop, deploy, and continually refine their underwriting models. Equations are now routinely used in a domain that had previously been largely judgment-driven as a matter of practical necessity.

Fast forward to today, and we see a similar untapped potential in the workers’ compensation claims modeling domain. Just as early underwriting models had been largely class-driven, today’s first-generation workers’ compensation claims models tend to be largely injury type-driven.

And analogously with the underwriting application, the real business benefit of claims modeling is not capturing the fact that (for example) contusions are on average less severe than lower back sprains. The adjuster already knows this. Rather, the intent of claims predictive modeling is to provide laser-like segmentation within each injury type. For example, certain lower back sprains would be good candidates for straight-through processing. On the other hand, certain contusions – given the specific combination of age, comorbidities, and other case-specific risk factors – might be sufficiently complex to warrant assigning to a claim adjuster.

In short, next-generation claim predictive models capture the variation in severity that exists within each injury type. From a modeling point of view, this is a technical challenge. From a business point of view, it is an opportunity. Failing to capture within-injury type variation is an instance of what the business statistician Sam Savage calls “the flaw of averages.” The benefit of large-scale predictive modeling initiatives is to move beyond group averages. Just as data-driven retailers now treat customers as individuals rather than as members of various consumer segments, claims predictive models should evolve to account for – but also see beyond – injury type.

To illustrate, the graph below displays the differing overall average severities for three injury groups. Claims involving shoulder sprains/strains are clearly more severe on average than those involving neck and back sprains/strains. Claims involving non-back and shoulder contusions have the lowest overall average severity of the three injury groups.

In order to move beyond telling claims adjusters what they already know, it is crucial to leverage the power of ICD-9 injury groupings to help identify which contusion is going to be worse than average, and similarly which sprain of the lower back is going to be better than average.

Injury Grouping: Evaluating Severity Within Like Injuries

The first step in developing Deloitte’s injury grouping methodology was to research and analyze thousands of the potential ICD-9 codes that pertain to workers’ compensation claims. This required a detailed analysis of medical severity, clinical reasonability and the resultant workers’ compensation outcome to help identify the diagnoses that cluster together with similar projected claim outcomes. By carefully analyzing a large volume of historical workers’ compensation data, we developed approximately 70 injury groups. The injury groupings were determined by combining similar diagnoses with sufficient statistical credibility to result in stable statistical patterns, aimed at driving enhanced segmentation in the claim predictive modeling process. 

The second step involved developing a high-level body part assignment process. This required assigning an initial body part for each injury grouping and ICD-9 combination. Part of this process involved reviewing the reasonability of the assignments based on medical science, actuarial peer review and the ability to capture variation in lost-time duration patterns within a physical part of the human body. We identified 28 different body part assignments.

The third step in our injury grouping methodology involved the identification, development, and assignment of medical specialty treatment profiles to each injury group. This required examining the reasonability of each treatment profile on the basis of medical science, and comparing the treatment profile assignments with the AMA Physician Desk Reference. We developed 16 different treatment profiles (for example, chiropractic, physical therapy, orthopedics, and x-ray) to help us better understand whether a claimant is likely being over-treated or under-treated at a specific point in time, based on the number of actual office visits compared with medical protocols for their specific injury group.

Finally, with our ICD-9 methodology thus defined, we realized that an approach was needed to help identify the prevalent diagnosis on a claim, since there can often be multiple ICD-9 codes referenced in the claim file. The process of identifying the prevalent diagnosis, or the condition that is primarily responsible for driving the claimant’s claim outcome, is not normally a straightforward exercise. ICD-9 diagnosis information is typically received and stored in the detailed medical bill data. Often there are multiple ICD-9 codes on a single bill and the codes can change over time, reflecting current information about the claimant’s diagnosis. Our ICD-9 injury grouping methodology includes a method that leverages the complete set of diagnosis information available on a claim to determine the prevalent injury group at the individual claimant level.
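The article does not disclose how the prevalent injury group is actually chosen, so the following is only a minimal sketch of one plausible rule: map every ICD-9 code on the medical bills to an injury group, total the paid amounts per group, and pick the group with the largest total, breaking ties in favor of the group seen most recently. The mapping table, the group labels, the function name, and the bill data are all illustrative assumptions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical mapping from ICD-9 diagnosis codes to injury groups.
# A production table would cover thousands of codes.
ICD9_TO_GROUP = {
    "847.0": "Sprains and strains of the neck and back",   # neck sprain
    "847.2": "Sprains and strains of the neck and back",   # lumbar sprain
    "840.9": "Shoulder sprains and strains",
    "923.10": "Non-back, non-shoulder contusions",          # forearm contusion
}

def prevalent_injury_group(bill_lines):
    """Pick one injury group for a claim from all of its bill lines.

    bill_lines: iterable of (service_date, icd9_code, paid_amount).
    Assumed rule: largest total paid wins; ties go to the group with
    the most recent service date. Unmapped codes are ignored.
    """
    paid = defaultdict(float)
    last_seen = {}
    for service_date, code, amount in bill_lines:
        group = ICD9_TO_GROUP.get(code)
        if group is None:
            continue
        paid[group] += amount
        last_seen[group] = max(service_date, last_seen.get(group, service_date))
    if not paid:
        return None
    return max(paid, key=lambda g: (paid[g], last_seen[g]))

# Invented example: the diagnosis shifts from contusion to lumbar sprain.
bill_lines = [
    (date(2012, 1, 5), "923.10", 200.0),
    (date(2012, 1, 19), "847.2", 450.0),
    (date(2012, 2, 2), "847.2", 300.0),
    (date(2012, 2, 2), "840.9", 150.0),
]
group = prevalent_injury_group(bill_lines)
```

Weighting by paid amount is a design choice made here for simplicity; weighting by visit count or by recency alone would be equally easy to substitute.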

Enhancing Claim Predictive Models Leveraging Injury Groupings

With this injury grouping methodology in hand, it was time to improve upon first-generation claim predictive models by better segmenting claims within each of the defined injury groupings. We used a normalization technique to remove the variation in claim outcomes due to injury group, and analyzed patterns in large databases relating claim characteristics observable at FNOL with the ultimate claim outcomes. We followed a rigorous modeling methodology leavened with common sense and insurance claims domain knowledge, resulting in a predictive model that produces significant segmentation very early in the life of a workers’ compensation claim, literally at FNOL. Such predictive models prospectively segment claims within such injury groups as Sprains and Strains of the Neck and Back. The results are dramatic: the highest-scoring claims are 25 or more times as costly as the lowest-scoring claims.
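The normalization technique is not specified in the article. One simple way to remove between-group variation, shown here purely as an assumed illustration, is to divide each claim's ultimate cost by the average cost of its injury group, so that the modeling target becomes "how much worse or better than a typical claim of this kind." The field names and figures below are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def normalize_by_group(claims):
    """Return {claim_id: relative severity within its injury group}.

    claims: list of dicts with 'claim_id', 'injury_group', 'ultimate_cost'.
    Relative severity = claim cost / mean cost of the claim's injury group,
    so 1.0 means "average for this injury group".
    """
    costs = defaultdict(list)
    for claim in claims:
        costs[claim["injury_group"]].append(claim["ultimate_cost"])
    group_mean = {g: mean(v) for g, v in costs.items()}
    for g, m in group_mean.items():
        if m == 0:
            raise ValueError(f"injury group {g!r} has zero average cost")
    return {
        c["claim_id"]: c["ultimate_cost"] / group_mean[c["injury_group"]]
        for c in claims
    }

# Invented data: two groups with very different average severities.
claims = [
    {"claim_id": "A1", "injury_group": "back sprain", "ultimate_cost": 1_000},
    {"claim_id": "A2", "injury_group": "back sprain", "ultimate_cost": 3_000},
    {"claim_id": "B1", "injury_group": "contusion", "ultimate_cost": 100},
    {"claim_id": "B2", "injury_group": "contusion", "ultimate_cost": 300},
]
relative = normalize_by_group(claims)
```

After normalization, the $300 contusion and the $3,000 back sprain both score 1.5: each is 50 percent costlier than is typical for its group, which is the kind of within-group signal the model is meant to predict.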

The graph below displays actual predictive model results based on real historical workers’ compensation claims. The bar marked “1-10” represents the average ultimate claim severity for the best-scoring 10 percent of claims. Similarly, the bar marked “91-100” represents the average ultimate severity for the worst-scoring 10 percent of claims. Note also that this graph is based on blind-test “holdout” data that was not used to build the model. This display demonstrates that (a) the model offers a high degree of segmentation power and (b) this segmentation is not driven simply by injury type. Within each injury type, the injury group-enhanced models identify as early as FNOL those claims destined to be the ones with high severity.
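A decile display of this kind can be reproduced on any scored holdout sample. The sketch below assumes that a higher score means a worse predicted outcome and that each holdout claim is an (score, ultimate cost) pair; the function name and the synthetic data are illustrative only and do not reproduce the article's results.

```python
def severity_by_decile(scored_claims, n_bins=10):
    """Average ultimate cost per score bucket, best-scoring bucket first.

    scored_claims: list of (score, ultimate_cost) from holdout data,
    where a higher score is assumed to mean a worse predicted outcome.
    """
    n = len(scored_claims)
    if n < n_bins:
        raise ValueError("need at least one claim per bucket")
    ordered = sorted(scored_claims, key=lambda sc: sc[0])
    averages = []
    for b in range(n_bins):
        lo = b * n // n_bins          # integer bounds keep buckets near-equal
        hi = (b + 1) * n // n_bins
        bucket = ordered[lo:hi]
        averages.append(sum(cost for _, cost in bucket) / len(bucket))
    return averages

# Synthetic holdout sample: 20 claims whose cost rises with score.
holdout = [(score, 100 * (score + 1)) for score in range(20)]
averages = severity_by_decile(holdout)
lift = averages[-1] / averages[0]   # worst decile vs. best decile
```

The ratio of the last bucket to the first is the "times as costly" figure quoted for such charts; computing it separately within each injury group shows whether the segmentation survives once injury type is held fixed.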


The use of injury groups further enhances the model development process by helping the claim adjuster interpret in excess of 50 claim characteristics all at once. For example, the use of specific medical procedures can vary greatly based on injury type. Would five physical therapy treatments be higher or lower than indicated by medical protocols? And depending on the answer, would this information drive the outcome of the claim? (Hint: it will vary by injury group.) For a sprain or strain, five treatments may be in line with expectations, or even lower than indicated by medical protocols, depending on the age of the claim. But if the injury was a simple contusion, five treatments would likely exceed expectations. Used in this way, injury groups and medical protocols can help the adjuster draw sounder conclusions regarding claim complexity, especially when combined with other variables such as comorbidities, external demographic and synthetic variables.
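The physical therapy question above amounts to comparing an observed visit count with a protocol range that depends on the injury group. Here is a minimal sketch; the ranges are invented placeholders rather than real medical protocols, and the dependence on the age of the claim is deliberately left out to keep the example short.

```python
# Hypothetical (low, high) expected physical therapy visit ranges per
# injury group. These numbers are placeholders, not medical guidance.
EXPECTED_PT_VISITS = {
    "sprain/strain": (6, 12),
    "contusion": (0, 2),
}

def treatment_flag(injury_group, actual_visits, protocols=EXPECTED_PT_VISITS):
    """Compare actual visits with the assumed protocol range for the group."""
    low, high = protocols[injury_group]
    if actual_visits > high:
        return "above protocol"
    if actual_visits < low:
        return "below protocol"
    return "within protocol"

# The same five visits read very differently by injury group.
sprain_flag = treatment_flag("sprain/strain", 5)
contusion_flag = treatment_flag("contusion", 5)
```

A flag like this could then be fed to the model as a synthetic variable alongside the raw visit count.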

The Benefits

Injury grouping, prevalence methodology, and normalization-based injury group models constitute the cornerstone of today’s workers’ compensation claim predictive modeling. Models that use these components can provide key insights beyond what is already known by claims professionals.

At claim intake, a newly reported workers’ compensation claim is scored by the claim predictive model, resulting in immediate and more effective routing to claims adjusters and case managers with the appropriate level of experience. Similarly, these predictive models can help identify cases for auto-adjudication or fast track processes, as well as those that may require the involvement of the SIU resources. The business impact of such predictive models also extends to supervision and oversight, helping to trigger when it is appropriate to involve a more experienced supervisor.
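Score-based routing at intake can be as simple as a pair of thresholds. The sketch below assumes a 1 to 100 percentile score in which higher means more complex; the cut-offs, queue names, and function name are hypothetical and would in practice be set by each claims organization.

```python
def route_claim(percentile_score, fast_track_max=20, senior_min=80):
    """Map a 1-100 model score to an intake queue (assumed thresholds)."""
    if not 1 <= percentile_score <= 100:
        raise ValueError("score must be between 1 and 100")
    if percentile_score <= fast_track_max:
        return "fast track"
    if percentile_score >= senior_min:
        return "senior adjuster and case manager"
    return "standard adjuster"

# Three newly reported claims with different scores.
queues = [route_claim(s) for s in (7, 55, 93)]
```

SIU referral and supervisor escalation could be added as further rules on top of the same score.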

The business impact on a claims organization is breakthrough performance that results from arming the right resources with the information and insights they need to take the appropriate actions more quickly, thereby accelerating the claims life cycle. The results are real. The drivers of this improvement range from more effective resource allocation to more focused claims management strategies, resulting in better claims outcomes and bottom-line loss cost savings of up to 10 percent of an organization’s annual claims spend.

In Conclusion

Using an injury group-based approach to developing state-of-the-art claim predictive models is a reality today. Insurers, self-insureds, and third-party administrators (TPAs) are using this approach to help them better manage their claim exposures. These models are enabling organizations to assign the right resource to the right claim at the right time for targeted intervention. The benefits are being measured, and leading claims organizations are helping their injured claimants get better faster and return to work earlier.

KEVIN M. BINGHAM is a principal at Deloitte Consulting LLP in Hartford, CT and leader of Deloitte’s Claim Predictive Modeling and Medical Professional Liability practices. FRANK ZIZZAMIA is a director at Deloitte Consulting in Hartford, CT and founder of Deloitte’s Advanced Analytics & Modeling practice. JIM GUSZCZA is the national predictive analytics lead in Deloitte’s Actuarial, Risk & Analytics practice and is an assistant professor of Actuarial Science, Risk Management, and Insurance at the University of Wisconsin-Madison School of Business. KIRSTEN HERNAN is a senior manager at Deloitte Consulting LLP in Philadelphia, PA, and a leader in Deloitte’s Claims and Risk Management practice.

This publication contains general information only and is based on the experiences and research of Deloitte practitioners. Deloitte is not, by means of this publication, rendering business, financial, investment, or other professional advice or services. This publication is not a substitute for such professional advice or services, nor should it be used as a basis for any decision or action that may affect your business. Before making any decision or taking any action that may affect your business, you should consult a qualified professional advisor. Deloitte, its affiliates, and related entities shall not be responsible for any loss sustained by any person who relies on this publication.

About Deloitte

As used in this document, “Deloitte” means Deloitte Consulting LLP, a subsidiary of Deloitte LLP. Please see www.deloitte.com/us/about for a detailed description of the legal structure of Deloitte LLP and its subsidiaries. Certain services may not be available to attest clients under the rules and regulations of public accounting.

Copyright © 2012 Deloitte Development LLC. All rights reserved.

[i] Deloitte Consulting LLP has a patent pending with the United States commissioner of patents titled Injury Group Based Claims Management System and Method, Patent No. 61/199,226 filed on November 12, 2008.