An insurer’s data is much more than bits and bytes to be assembled for required regulatory reporting. It is the lifeblood of the insurance organization: the raw material for the insight that drives action and improves operational results.

Three convergent technology trends are opening dramatic new opportunities for insurers seeking competitive and operational advantage, and creating a potential crisis for their IT leaders. Predictive analytics, social networks, and location intelligence are on a rapid adoption curve in the insurance industry, but few companies are fully prepared to handle the crashing wave of exponential data growth that accompanies these new tools. How does the modern insurer ensure that it doesn’t sink or, worse, drown?

Insurers vary widely in the maturity of their database and data warehouse implementations. However, these innovation trends make good database performance a business imperative. As data gets bigger and its uses become more complex, demands on data structure grow exponentially. It is imperative that businesses maintain focus on data quality, relevance, and security.

Data Quality

Data quality depends on how data is acquired and managed. As data availability explodes in the organization, quality initiatives must focus particularly on accuracy, completeness and cost-benefit. Master data management initiatives are essential to organize and sustain the most important enterprise data in a formalized way that ensures compliance with corporate-wide requirements. These initiatives should include careful consideration of new analytical and social technology trends to anticipate their ultimate uses and the role of the data they require and create.

Data Relevance

New technologies are extremely data intensive, but few companies can practically manage all the data potentially available for collection. The first key to managing big data is to prioritize it, starting with the data that explains relationships among customers, their products and policies, providers, and claims.

Understanding these relationships is critical to helping IT model them as meaningful relationships in their database schemas. Second, user-generated content, external data streams, and digital content fundamentally overwhelm the capacity of traditional databases. Now, more than ever, insurers need a judicious strategy for deciding what data are important and what levels of detail are required to meet business needs. These strategies need to encompass both business requirements for meaningful information and IT requirements for manageable costs, storage and performance.

Security and Trust

Expanding access to data creates new risk: data must be secured for appropriate use by the appropriate audience. As data becomes more complex, data authenticity and credentials become more vital. Data streams must be managed with a systematized structure encompassing security, quality and compliance.

Unless an organization centralizes its enterprise data to create a “single view of the truth,” it requires truth rules: shared, enforceable rules for handling missing and dirty data, tracking document authority, and defining cardinality for shared information such as budgets and forecasts.

These issues impose a degree of schema complexity beyond what many insurers’ current environments can realistically support.
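One way to picture shared truth rules is as a small rule set that every system applies identically to incoming records. The following is a minimal sketch only; the field names, the validity checks, and the `TruthRule` and `apply_rules` names are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class TruthRule:
    """One shared rule: how to validate a field and what to do when it fails."""
    field: str
    is_valid: Callable[[object], bool]
    default: Optional[object] = None  # None means "flag it, do not guess"


# Hypothetical rule set; a real one would come from data governance.
RULES = [
    TruthRule("state", lambda v: isinstance(v, str) and len(v) == 2 and v.isalpha()),
    TruthRule("paid_amount", lambda v: isinstance(v, (int, float)) and v >= 0),
    TruthRule("currency", lambda v: v in {"USD", "CAD"}, default="USD"),
]


def apply_rules(record, rules=RULES):
    """Return a cleaned copy of the record and the list of fields that failed."""
    cleaned = dict(record)
    issues = []
    for rule in rules:
        value = cleaned.get(rule.field)
        if value is None or not rule.is_valid(value):
            issues.append(rule.field)
            cleaned[rule.field] = rule.default
    return cleaned, issues


if __name__ == "__main__":
    dirty = {"state": "california", "paid_amount": -5, "currency": None}
    print(apply_rules(dirty))
```

The point of the sketch is that missing and dirty values are handled the same way everywhere, and that every substitution is recorded rather than applied silently.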

Technology Impact

The importance of data quality, relevance and security is magnified considerably when we turn our attention to new technology trends. Consider the data structure issues associated with three innovations.

Predictive Analytics

While analytical software makes adjustments for missing or skewed data, the best predictive models are built from high quality data. One quality dimension is granularity. Highly granular transactional data are usually expensive and cumbersome for analytics. Summarize the data too much, however, and they lose predictive power.

Pre-categorizing continuous quantitative data saves storage and processing costs, but modeling efforts derive predictive power from grouping or “binning” data variables in segments most relevant to the analysis. For example, using age to predict automotive accident probabilities might show that fine groupings of the youngest and oldest drivers are useful but drivers in the middle years can be lumped together.
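The age example can be sketched in a few lines of code. The bin edges and labels below are assumptions made up for illustration, not values from any actual model; the point is only that fine bands at the extremes and one wide band in the middle are easy to apply when the raw age is still available.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges: narrow bands for young and old drivers,
# one wide band for the middle years.
AGE_EDGES = [18, 21, 25, 65, 70, 75]
AGE_LABELS = ["<18", "18-20", "21-24", "25-64", "65-69", "70-74", "75+"]


def age_band(age):
    """Map a raw age to its band by locating it among the bin edges."""
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


if __name__ == "__main__":
    raw_ages = [17, 19, 23, 40, 52, 64, 66, 72, 80]
    print(Counter(age_band(a) for a in raw_ages))
```

If only a pre-categorized field had been stored (say, ten-year bands), these modeling-specific bands could not be recovered later, which is the trade-off the paragraph above describes.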

IT leaders should proactively engage the full range of business analysts, statisticians, and business subject matter experts to identify key modeling variables and explore the required granularity. Without insight into the nature of these predictor variables, the data value may ultimately be quite limited.

Social Networks

Understanding relationships among constituents is critical to creating meaningful relationships in the database environment. The advent of digital social networks requires the linking of a much broader array of participants in the insurance ecosystem from a variety of public, commercial and proprietary sources.

For example, claims organizations are finding the ability to link and track relationships among claimants, providers, corporations and attorneys especially useful for identifying fraud and excessive treatment. Creating and maintaining these linkages, however, is a complex data undertaking.

Both the constituents in the network and their relationships must be managed as data entities and continuously refreshed. In most cases, insurers are likely to find that social networks linking data outside their organization are best licensed from specialized commercial providers. 

Data inside their organization, however, will likely prove a significant competitive differentiator, and insurers should manage it as proprietary intellectual property.
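To make the idea of managing constituents and relationships as data entities concrete, here is a minimal sketch that stores links as records and walks them to find everyone connected to a given claimant. The entity identifiers, relationship names, and function names are all hypothetical, and a production system would more likely use a graph database or a dedicated library.

```python
from collections import defaultdict, deque

# Hypothetical link records: (entity, relationship, entity).
LINKS = [
    ("claimant:A", "treated_by", "provider:P1"),
    ("claimant:B", "treated_by", "provider:P1"),
    ("claimant:B", "represented_by", "attorney:X"),
    ("claimant:C", "represented_by", "attorney:X"),
    ("claimant:D", "treated_by", "provider:P2"),
]


def build_graph(links):
    """Index the links in both directions so they can be traversed."""
    graph = defaultdict(set)
    for a, _relationship, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_group(graph, start):
    """Breadth-first search: every entity reachable from the start entity."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


if __name__ == "__main__":
    graph = build_graph(LINKS)
    print(sorted(connected_group(graph, "claimant:A")))
```

In this toy data, claimants A, B and C end up in one group through a shared provider and a shared attorney, while claimant D stands apart. A cluster like that is only a prompt for review, not evidence of fraud in itself.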

Location Intelligence 

Insurers have long known that geo-spatial relationships can help explain a range of outcomes. Provider treatment patterns can vary regionally based on regional medical training and presence of specialized facilities.

Fraud tends to cluster geographically. While measuring these relationships has historically been cumbersome, innovations in location intelligence technology make it far easier to collect and analyze geographic data.

These technologies include free geographic information system (GIS) tools, digital data and imagery as well as mashups (combinations of data from two or more sources to create new data). For instance, digital images from accident scenes may soon be instantly transmitted to an insurer, their contents digitized and codified as predictive attributes and immediately used to improve severity scoring and claims assignment at first notice of loss. 

To prepare for location data use, IT leaders must understand the unique nature of geo-spatial data and its retrieval. Users of mapping and GIS tools from search engines must be aware of differences in geocoding accuracy. When the precise location of a house or building is required, for example, parcel geocoding delivers more accurate results than street-level geocoding.
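One simple way to check geocoding accuracy is to measure how far apart two geocodes for the same address fall. The sketch below uses the standard haversine great-circle formula; the two coordinate pairs are invented for illustration and do not correspond to any real geocoder output.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Hypothetical geocodes for one address (illustrative values only).
STREET_LEVEL = (36.1447, -86.8027)  # interpolated along the street segment
PARCEL_LEVEL = (36.1450, -86.8021)  # centered on the property parcel

if __name__ == "__main__":
    offset = haversine_m(*STREET_LEVEL, *PARCEL_LEVEL)
    print(f"Offset between geocodes: {offset:.0f} m")
```

An offset of even a few tens of meters can place a property on the wrong side of a boundary such as a flood zone, which is why the choice of geocoding method matters when a precise location is needed.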


Taken together, the rising tide of data availability combined with new technology innovations creates a potential boon for insurance business leaders and new challenges for those tasked with managing the data. Insurers can profitably ride this wave by developing a clearer sense of what data has meaning, creating flexibility to add new types of data into existing environments, and imposing simplicity wherever possible in the structure of their data schemas and management practices.

(Keith Peterson is Vice President of Advanced Analytics and Consulting for Mitchell Auto Casualty Solutions. Keith is responsible for developing Mitchell’s extensive data assets and predictive solutions to assist clients with identifying performance insights and best practice opportunities that improve claims operations. Keith has held executive positions with Nielsen and Equifax. He is a frequent speaker and writer on business analytics. He has a Ph.D. from Vanderbilt University.)