Using Concept Detection (with Word Embeddings) to Improve Our Claims Handling Process


• Brief note on concept detection and word embeddings
• Problem statement
• Detecting claims related to the concept “fracture”
• Detecting accidents that happened in a parking lot AND where our insured was hit by another vehicle
• RnD
• Helpful Resources

Brief note on concept detection and word embeddings

By now, I’m sure you’ve heard of word embeddings and all the magic they bring to data science, specifically within natural language processing. They have become a staple of the industry, with many applications. The technique I’ll focus on here is one that The General® uses quite often, especially in the claims space, where we find a trove of text data: concept detection. Concept detection is how we map words to concepts based on the semantic similarity of those words, given the context in which they appear.

These mappings are created by a word embedding algorithm known as Word2Vec, which places words that co-occur with similar context words close together in a vector space, so semantically similar words have similar vector representations. In short, Word2Vec captures the similarity of words based on their surrounding words. For instance, Word2Vec can detect that “car” and “automobile” refer to essentially the same idea based on the co-occurring words it has seen in the training corpus. Continuing with this example, if we wanted to detect the concept “car” to determine which claims are semantically related to that idea, we would query our Word2Vec model and get the following results:

[('vehicle', 0.6949392557144165),
 ('veh', 0.6695335507392883),
 ('truck', 0.5504242777824402),
 ('cars', 0.539861261844635),
 ('van', 0.53106689453125),
 ('house', 0.5298298597335815),
 ('vehicles', 0.5075646042823792),
 ('two_cars', 0.5022691488265991),
 ('guy', 0.4957280158996582),
 ('vech', 0.491909384727478)]
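
To make the idea behind such a query concrete, here is a minimal sketch of a nearest-neighbor lookup by cosine similarity. It is not our production code: the three-dimensional vectors are invented for illustration (real Word2Vec vectors typically have hundreds of dimensions), and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(word, vectors, topn=3):
    """Rank every other word in `vectors` by cosine similarity to `word`."""
    query = vectors[word]
    scored = [(other, cosine(query, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]

# Hypothetical toy vectors, invented for illustration only.
toy_vectors = {
    "car":     [0.9, 0.1, 0.0],
    "vehicle": [0.8, 0.2, 0.1],
    "veh":     [0.7, 0.3, 0.0],
    "house":   [0.1, 0.9, 0.2],
}

neighbors = most_similar("car", toy_vectors)
```

With these toy vectors, “vehicle” and the shorthand “veh” rank well ahead of “house”, which mirrors how a trained model can place a misspelling or abbreviation next to the word it stands for.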

Now, this particular example isn’t really significant since we are in the business of auto insurance and thus 99% of claims will refer to a car. But I will point out how important it is for us to be able to capture misspellings and shorthand variants of words. Some corpora contain clean data with very few misspellings (if any). Unfortunately, we are not that lucky. Our corpus is littered with misspellings and shorthands, but Word2Vec remains robust to them.

Problem statement

Let’s move on to a more important example. On a project we recently finished at The General®, we wanted to predict which claims would eventually involve attorney representation. This is important for our customers because when a claimant hires an attorney, they usually pay a significant amount in legal fees, which directly reduces the claimant’s take-home settlement amount after a loss. In addition, The General® views personal, direct contact with all customers as an opportunity to provide excellent service, which in theory could improve business retention. By understanding why customers might choose to seek representation, The General® can begin to address those reasons and create a more intentional and direct customer experience.

Predicting whether an attorney would become involved was, it turned out, the easy part. The more difficult, and more important, piece was narrowing down the claims that we could impact, based on specific business rules. One business rule we found impactful was whether the accident took place in a parking lot (or in a garage, at a stop sign, etc.) and, if so, whether our insured or the other party was at fault. Another impactful business rule was based on injury severity: if the injury was severe (fractured or broken bones, brain injury, death, etc.), the likelihood of impacting attorney involvement and severity was too low for this area of focus.

Detecting claims related to the concept “fracture”

Let’s start with the scenario where we want to find all the instances of a fracture. One elementary way to approach this problem is to simply look for all occurrences of the word “fracture”. But this fails to capture any shorthands, misspellings, or inflected word-forms.

Another way to solve this is to use a lexeme-based approach. In linguistics, a lexeme is an abstract unit of meaning that roughly corresponds to the set of word-forms of a single word. For example, the words fracture, fractured, and fracturing all belong to the same lexeme, whose lemma (or dictionary form) is “fracture”. Thus, “a lexeme is a lemma plus its inflected forms” (as one Stack Exchanger eloquently put it). To do this, we pass all known words through a lemmatizer, which maps each word to its lemma (a stemmer does something similar by truncating words to a shared stem such as “fractur”). Therefore, fractured and fracturing both become “fracture”. This is good practice for many NLP tasks, but doesn’t get us all the way there for this specific application.

We don’t just want a set of words that share the same root form; we want entire concepts. Usually this involves finding multiple lexemes, multiple misspellings, multiple shorthands, and even bigrams. For example, our claim adjusters want to know how semantically similar a section of claim notes is to the very abstract concept “fracture”. Using word embeddings, which help us find semantically similar words regardless of which lexeme they belong to, we can query our Word2Vec model for words similar to the word “fracture” and we get back the following:

[('fx', 0.6720505356788635),
 ('fractures', 0.6105263233184814),
 ('fractured', 0.5950684547424316),
 ('acute_fracture', 0.5238978862762451),
 ('nondisplaced_fracture', 0.5142300724983215),
 ('fxs', 0.5074974894523621),
 ('compression_fracture', 0.5024744272232056),
 ('serious_injury', 0.49984192848205566),
 ('displaced_fracture', 0.49904364347457886),
 ('surgery', 0.4895785450935364)]
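
One way to turn a neighbor list like this into a concept vocabulary is to keep every term at or above a similarity cutoff. The sketch below is illustrative only: the scores are rounded copies of the values above, and the helper name `concept_terms` and the two cutoffs are assumptions made for the example.

```python
# Neighbor list copied from the query result above (scores rounded to 4 places).
fracture_neighbors = [
    ("fx", 0.6721), ("fractures", 0.6105), ("fractured", 0.5951),
    ("acute_fracture", 0.5239), ("nondisplaced_fracture", 0.5142),
    ("fxs", 0.5075), ("compression_fracture", 0.5025),
    ("serious_injury", 0.4998), ("displaced_fracture", 0.4990),
    ("surgery", 0.4896),
]

def concept_terms(query, neighbors, threshold):
    """Return the query plus every neighbor scoring at or above the threshold."""
    return {query} | {word for word, score in neighbors if score >= threshold}

# Two hypothetical cutoffs, to show how the vocabulary shrinks as the bar rises.
loose = concept_terms("fracture", fracture_neighbors, threshold=0.49)
strict = concept_terms("fracture", fracture_neighbors, threshold=0.50)
```

Here the looser cutoff drops only “surgery”, while the stricter one also drops “serious_injury” and “displaced_fracture”. Which of those terms belong in the concept is a judgment call for the people who use the results.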

We can adjust the similarity threshold to filter out any words we don’t want. The following table shows that this approach works very well.

IncidentDescription | fracture_concept
back wpossible t7 fracture | 1.000000
fracture left cheek bone and cuts that could possible scaring | 1.000000
pain in left knee fracture of unspecified metatarsal left foot | 1.000000
bleeding in brain\r\nneck fracture\r\nfracture in lower back | 1.000000
soft tissue backneck left side body headachesnauseaswelling possible fracture getting another xray | 1.000000
road rash hand and thumb\r\nsprain left wrist nondisplaced fracture | 1.000000
left foot fracture | 1.000000
6 fx ribs on right side 40k hosp bill | 0.672051
fx to hip and wrist hospital admission | 0.672051
l heel fx 4 cm lac behind ear | 0.672051

We can see that the first few records mention the word “fracture” explicitly. Then we see many occurrences of the very common shorthand “fx”.
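
The scores in the table are consistent with a simple scoring rule: a description that mentions the query word scores 1.0, and otherwise it takes the highest similarity of any of its tokens to the query. The sketch below implements that rule as an assumption; it is not necessarily how our ConceptDetector works internally, and the function name and whitespace tokenization are hypothetical.

```python
def concept_score(text, query, similarities):
    """Highest similarity between any token in `text` and the query word.

    An exact mention of the query scores 1.0; unknown tokens score 0.0.
    """
    best = 0.0
    for token in text.lower().split():
        if token == query:
            return 1.0
        best = max(best, similarities.get(token, 0.0))
    return best

# Similarities to "fracture", taken from the query result earlier in the post.
fracture_sims = {
    "fx": 0.6720505356788635,
    "fractures": 0.6105263233184814,
    "fractured": 0.5950684547424316,
}

exact = concept_score("left foot fracture", "fracture", fracture_sims)
shorthand = concept_score("l heel fx 4 cm lac behind ear", "fracture", fracture_sims)
unrelated = concept_score("soft tissue neck pain", "fracture", fracture_sims)
```

Under this assumed rule, the “fx” description scores 0.672051 after rounding, matching the table, and a description with no related term scores 0.0.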

After each claim is scored by the AttorneyPredictor model, we run those that score above a certain threshold through a ConceptDetector filter that contains the query “fracture”. Through trial and error, we set the concept threshold at 0.49, so all records whose fracture_concept score is above this value are filtered out.
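
The two-stage flow described above can be sketched as follows. This is a simplified illustration under stated assumptions: the record fields, the function name, and the attorney-score cutoff of 0.7 are hypothetical (only the 0.49 concept threshold comes from our project), and the scores are invented.

```python
ATTORNEY_THRESHOLD = 0.7   # hypothetical; the post does not state this value
FRACTURE_THRESHOLD = 0.49  # the concept threshold quoted in the post

def select_claims(claims):
    """Keep likely-attorney claims that do not match the fracture concept."""
    selected = []
    for claim in claims:
        likely_attorney = claim["attorney_score"] > ATTORNEY_THRESHOLD
        severe_injury = claim["fracture_concept"] > FRACTURE_THRESHOLD
        if likely_attorney and not severe_injury:
            selected.append(claim["claim_id"])
    return selected

# Invented records for illustration; the field names are assumptions.
claims = [
    {"claim_id": "A", "attorney_score": 0.91, "fracture_concept": 0.00},
    {"claim_id": "B", "attorney_score": 0.88, "fracture_concept": 0.672051},
    {"claim_id": "C", "attorney_score": 0.35, "fracture_concept": 0.00},
    {"claim_id": "D", "attorney_score": 0.95, "fracture_concept": 1.00},
]

kept = select_claims(claims)
```

Only claim A survives: B and D are removed by the fracture filter, and C never reaches it because its attorney score is too low.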

Detecting accidents that happened in a parking lot AND where our insured was hit by another vehicle

Another concept we want to detect is “parking lot”. Through much conversation with end users (claim adjusters, business owners, and other stakeholders), we discovered that accidents in a parking lot or similar area (driveway, parking garage, carport, etc.) are generally not good candidates for this process. Identifying these claims is a bit more involved. First, we follow the same approach that we used for the concept “fracture”. Querying our Word2Vec model for n-grams related to the bigram “parking lot”, we get the following results:

[('parking_space', 0.8050099611282349),
 ('parking_spot', 0.7695923447608948),
 ('driveway', 0.7656093239784241),
 ('gas_station', 0.7567848563194275),
 ('shopping_center', 0.712405264377594),
 ('private_driveway', 0.6738905906677246),
 ('private_drive', 0.6313923001289368),
 ('gas_pump', 0.6188895106315613),
 ('restaurant', 0.5581246614456177),
 ('spot', 0.5539008378982544)]

All of these results are locations that we’d hope to get from this query. Now, we need to run these claim notes through another model that was trained on this data: a model that predicts who hit whom. We won’t go into the details of that model (maybe in a future post?), but it does a really good job of determining whether our insured was hit by someone or vice versa. Viewing these results, we can see that our ConceptDetector filter works really well.

AccidentDescription | IVWasHit | parking_lot_concept
CV hit IV in parking lot. | 1 | 0.849936
OV rear ended IV in parking lot. | 1 | 0.849936
IV was rear ended by CV in an unoccupied parking garage. | 1 | 0.849936
Iv was rear ended by OV in a parking lot. | 1 | 0.849936
IV was attempting a left turn into a parking area when CV merged into the turning lane and rear ended IV. | 1 | 0.849936
IV was parking in the front of the store to go in and OV was pulling out and hit the left side of IV. | 1 | 0.849936
CV rear-ended IV when IV slowed down for OV turning into a parking lot. | 1 | 0.849936
IV just backed out of parking space and started to go forward when rear ended by CV, in parking lot. | 1 | 0.849936
CV struck IV while merging into traffic from a parking lot. | 1 | 0.849936
The IV was making a right turn into a parking lot, when the OV rear ended the IV. | 1 | 0.849936

From the AccidentDescription field we can see that all of these records are instances where our insured was hit by another party in or near a parking area, whether it was a “parking area”, “parking garage”, “parking lot”, or “parking space”. The IVWasHit field tells us whether our insured was hit by another party, and the parking_lot_concept field gives the semantic similarity of these descriptions to the original query “parking_lot”.
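
Two pieces of this step lend themselves to a short sketch: joining known bigrams into single underscore tokens (so that “parking lot” can be looked up as “parking_lot”), and combining the two signals into one flag. Everything here is illustrative and assumed: the function names, the hand-written bigram set, the punctuation handling, and the 0.5 threshold are hypothetical, and in practice the bigrams would be learned from the corpus instead of listed by hand.

```python
def tokenize(text):
    """Lowercase and strip surrounding periods and commas from each token."""
    stripped = (token.strip(".,") for token in text.lower().split())
    return [token for token in stripped if token]

def merge_bigrams(tokens, known_bigrams):
    """Join adjacent tokens with an underscore when the pair is a known bigram."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in known_bigrams:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2  # skip the second half of the bigram
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def flag_parking_lot_hit(iv_was_hit, parking_lot_concept, threshold=0.5):
    """True when our insured was hit AND the description matches the concept."""
    return iv_was_hit == 1 and parking_lot_concept >= threshold

# Hand-written bigram set, for illustration only.
known = {("parking", "lot"), ("parking", "space"), ("gas", "station")}
tokens = merge_bigrams(tokenize("OV rear ended IV in parking lot."), known)

flagged = flag_parking_lot_hit(iv_was_hit=1, parking_lot_concept=0.849936)
```

After merging, the description ends in the single token “parking_lot”, which is the form the similarity query above expects.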

We’ve experienced remarkable success by using concept detection for this project and plan to make further improvements.

RnD


We’re conducting ongoing research to improve these results. First, we think deep semantic role labeling (SRL) can help us improve our who-hit-whom predictor by tagging the semantic roles that tell us who did what to whom, and where. Consider the following example:


This is an example we see quite often in our data and one we hope SRL will help us solve more easily.

Another area we’re looking into is context embeddings. A weakness of Word2Vec is that it doesn’t differentiate between similarity and relatedness. In short, semantic relatedness includes any relationship between two terms, while semantic similarity only includes “is a” relations. So far, we haven’t distinguished semantic similarity from semantic relatedness, and in some NLP tasks we may not want to, but for many we do.

As an example of how Word2Vec doesn’t make this distinction, when we query our Word2Vec model for the word “boy”, we get back the following:

[('girl', 0.6435362100601196),
 ('man', 0.5221080780029297),
 ('little_boy', 0.5177751183509827),
 ('dog', 0.5172868967056274),
 ('kid', 0.49748730659484863),
 ('guy', 0.48987632989883423),
 ('sheila_beavers', 0.46082741022109985),
 ('alive', 0.45668825507164),
 ('puppy', 0.45001357793807983),
 ('kin', 0.4441930055618286)]

For the most part, these results make sense. But a few of these, such as “dog” and “puppy”, although thematically and syntactically similar to “boy” (they are nouns that co-occur with the same words), are conceptually very different. We’re hoping that context embeddings can help us improve our models by distinguishing between similarity and relatedness.

Helpful Resources

• Original Word2Vec paper by Tomas Mikolov et al.
• Foundations of Statistical Natural Language Processing by Christopher Manning and Hinrich Schütze, specifically section 3 on Linguistic Essentials
• Deep Semantic Role Labeling: What Works and What’s Next by Luheng He et al.
• Learning Generic Context Embedding with Bidirectional LSTM by Oren Melamud et al.
• Querying Word Embeddings for Similarity and Relatedness by Fatemeh Torabi Asr et al.