When the topic of text analysis arises, claim notes come to mind first: those lengthy, unstructured records that contain the raw material for claims insights.

But insurers are rife with text across the board — medical notes for life and disability policies; property descriptions for real estate underwriting; press releases, legal notices and news articles for commercial policies.

The relatively recent addition of comments, complaints, praise and opinions from social media gives unprecedented access to the voices of customers.

This mass of raw information — by most estimates at least 75 percent of an organization’s data — floats under the waterline, beneath the structured tip of the data iceberg.

If you could tap into that mass, you could:

  • Process claims and applications faster, by automating the most time-consuming portions of the review process.
  • Improve your organization’s ability to spot suspect claims or fraudulent activity.
  • Quantify key elements of policyholders’ experiences and continuously monitor drivers of customer satisfaction and dissatisfaction.
  • Reduce risk by detecting potential noncompliance with regulations or corporate standards.

Utilizing unstructured data 

Insurance companies generate and collect thousands of pieces of free-form textual content each day: call center notes, claims adjuster comments, emails, open-ended responses in customer surveys and social media communications. But while organizations are storing these documents, many are not harnessing the full potential these rich, yet complex, data sources can provide.

Unlike the neatly structured data that sits in our warehouses and data marts, text-based insights are buried in free-form fields that are often challenging to analyze at scale. However, with more storage and processing options than ever before, and with increasingly sophisticated analytical tools, the time is right to seize on the benefits of this resource.


It’s easy to see how much qualitative and descriptive value unstructured data sources offer. Often, they contain the critical “why” factor that explains someone’s past actions or future intent, a factor that may be missing from the transactional or structured variables. Why did this policyholder decide to cancel their policy? What were the conditions leading up to the claim? How was their customer service experience?

With text analytics and natural language processing, answering these types of questions becomes possible, and, more important, at a scale far beyond what we could accomplish with manual review. But how can we replace human intuition with a machine? There are many techniques we can apply to these text documents; which we choose depends on the type and characteristics of the data, the use case and the goals of the analysis.

Unstructured data sources can provide the “why” factor that explains a policyholder’s past actions or future intent. (Photo: Thinkstock)

Practical examples of text analytics

One goal may be to shorten the claims and application review process. We can apply categorization to sort the claims into logical bins based on the injury or accident descriptions from the claims notes. These categories can be used to more finely focus the existing routing patterns. Contextual extraction can pull out key entities and data elements such as dates, names, vehicles, dollar amounts or body parts injured.

For example, the names of vendors, attorneys and other third parties sometimes exist only in claim notes. Extracting these from the text creates new structured fields that serve both operational efficiency and analytical pursuits. These new structured fields become potential predictors for models that identify claims likely to be fraudulent or to result in high payouts.
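As a minimal sketch of the contextual extraction described above, regular expressions can pull dates, dollar amounts and injured body parts out of a claim note and into new structured fields. The patterns, field names and sample note below are illustrative assumptions, not a production entity extractor:

```python
import re

# Toy patterns for a few entity types mentioned in claim notes.
# Real extraction would handle many more formats and entity types.
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")
AMOUNT_RE = re.compile(r"\$\d[\d,]*(?:\.\d{2})?")
BODY_PART_RE = re.compile(r"\b(neck|back|shoulder|knee|wrist|ankle)\b", re.I)

def extract_fields(note: str) -> dict:
    """Turn a free-form claim note into new structured fields."""
    return {
        "dates": DATE_RE.findall(note),
        "amounts": AMOUNT_RE.findall(note),
        "body_parts": [m.lower() for m in BODY_PART_RE.findall(note)],
    }

note = ("Claimant seen 03/14/2016 for left shoulder injury; "
        "estimate from vendor was $2,450.00.")
print(extract_fields(note))
```

The resulting dictionary maps directly onto new columns in a structured table, which is what makes the extracted values usable as model predictors.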

Or let’s say you’re trying to use free-form comments from a customer survey or call center agent notes to predict an outcome (e.g., likelihood to recommend, policy cancellation, future claim). If you have sufficient training data, you might employ some of text analytics’ statistical and machine learning techniques.  
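One common form of the supervised approach just described is a bag-of-words classifier: vectorize the comments, then fit a model against the known outcome. The sketch below uses scikit-learn with a handful of invented comments and cancellation labels purely for illustration; a real model needs substantial training data:

```python
# Hypothetical example: predicting policy cancellation from free-form
# customer comments. Comments and labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

comments = [
    "agent was rude and the claim took months",
    "premium increase with no explanation, very unhappy",
    "quick settlement, friendly service",
    "easy renewal process, would recommend",
]
cancelled = [1, 1, 0, 0]  # 1 = policy was later cancelled

# TF-IDF turns each comment into a sparse numeric vector; logistic
# regression then learns which terms are associated with cancellation.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(comments, cancelled)

print(model.predict(["slow claim and unhelpful agent"]))
```

The same pipeline shape works for other outcomes from the article (likelihood to recommend, future claim); only the labels change.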

Oftentimes, incorporating text topics or clusters into a conventional data mining flow yields better model performance than the structured variables alone. One major health insurance carrier told us, “The verbatim are a treasure trove of useful information  — we find them more valuable than all the rest of the survey combined.”
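The "text topics as extra model inputs" idea can be sketched as follows: derive a few topic scores from the verbatims, then append them to the structured variables before fitting a conventional model. The verbatims, structured columns and topic count below are illustrative assumptions; non-negative matrix factorization (NMF) stands in for whichever topic or clustering technique the toolchain provides:

```python
# Sketch: topic scores from survey verbatims joined to structured data.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

verbatims = [
    "billing error on my statement again",
    "statement fee charged twice, billing issue",
    "adjuster was helpful after the storm damage",
    "storm claim handled fast, great adjuster",
]
# Invented structured variables, e.g. tenure in years and complaint count.
structured = np.array([[3, 1], [2, 1], [5, 0], [4, 0]])

tfidf = TfidfVectorizer().fit_transform(verbatims)
# Each row of `topics` holds the document's weight on each discovered topic.
topics = NMF(n_components=2, random_state=0).fit_transform(tfidf)

# Combined design matrix: structured columns plus per-document topic weights.
X = np.hstack([structured, topics])
print(X.shape)  # → (4, 4)
```

A downstream model fit on `X` can then weigh the topic signals alongside the structured variables, which is where the performance lift described above comes from.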

Customer sentiment, emerging issues

With the advent of sophisticated, automated text analytics, gone are the days of waiting weeks for results to come back from an ad hoc analysis; we can easily trend customer sentiment or detect emerging issues with almost no latency. Adding that element of time gives additional context to your analysis.
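At its simplest, trending sentiment over time means scoring each mention and aggregating by date. The toy lexicon and sample mentions below are illustrative assumptions, far cruder than a real sentiment model, but they show the shape of the computation:

```python
# Minimal lexicon-based sentiment trend. Word lists and data are invented.
from collections import defaultdict
from datetime import date

POSITIVE = {"great", "fast", "helpful", "recommend"}
NEGATIVE = {"slow", "rude", "unhappy", "error"}

def score(text: str) -> int:
    """Positive word count minus negative word count."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

mentions = [
    (date(2016, 6, 1), "great service, fast claim"),
    (date(2016, 6, 1), "rude agent, slow response"),
    (date(2016, 6, 2), "billing error, unhappy"),
]

# Average sentiment per day, ready to plot or to monitor for shifts.
daily = defaultdict(list)
for day, text in mentions:
    daily[day].append(score(text))
trend = {day: sum(s) / len(s) for day, s in sorted(daily.items())}
print(trend)
```

Running the aggregation continuously, rather than as an ad hoc analysis, is what makes the near-zero latency described above possible.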

Are customers promoting our brand more or less than last quarter? How about versus last week? Do those trends correlate to specific website changes, advertising or other marketing activities? Is there an emerging pattern in the text that may be an indication of systemic fraud?

In this spring’s inaugural “Forrester Wave: Big Data Text Analytics Platform,” Forrester asserted that “only the enterprises that are obsessed with winning, serving, and retaining customers will thrive in this highly competitive, customer-centric economy.” Text analytics provides a compelling opportunity for insurance carriers to listen to and serve customers in a personalized, proactive way that was unattainable in years past. Its potential has barely begun to be tapped within the insurance industry.

For carriers and payers who want to gain a competitive advantage and achieve deeper insights into their customers, the possibilities are vast. The tools are available, and the storage and processing options are cheaper than ever. The time is now.


Christina Engelhardt is a text analytics consultant within the Global Technology Practice at SAS with a passion for helping customers extract the maximum value and insights from their unstructured data. 

Elizabeth Dykstra is a principal systems engineer with SAS. For 17 years she has worked with insurance companies to adopt, embrace and capitalize on the benefits of advanced analytics.