Extracting information from customer requests


Given a free-form vet appointment reason, we extract symptoms, diseases and requested services.

Input

A short text giving the appointment reason

Output

Any symptoms, diseases or required vet services that can be inferred from the text

Goal

Extract structured data from appointment reasons


Introduction


Our client offers home vet visits that can be booked online. Customers can enter free-form text as the appointment reason. To know which time slots are available, it is important to be able to estimate the duration of each appointment. Our goal was to develop a model that could extract symptoms, diseases and services from the appointment reasons, so that our client could then use their experience and expertise to improve their processes.


Starting Point


Our client, felmo, offers vet appointments at home for cats and dogs across Germany. To book an appointment online, customers can enter a short text as the appointment reason.

The goal of the project was to extract all symptoms, diseases and requested services from the appointment reasons.

To help train our model, felmo provided us with a dataset of thousands of appointments and their appointment reasons, each labeled with up to 10 symptoms, diseases or services which veterinarians had inferred from the text beforehand.



Challenges


One of the most challenging aspects was that there were more than 500 symptoms, diseases and services to distinguish, and many symptoms differed only subtly from one another.

Another challenging aspect was the variation in writing style. Most users gave a list of the symptoms they had observed. Some also speculated about the underlying cause or listed symptoms they were certain their pet did not have. Others wrote whole sentences describing their pet’s behavior over the last couple of days.


Solution


As a first step, we wrote a simple algorithm that finds exact or almost exact matches (to account for typos) of the symptoms, services and diseases. This already worked pretty well when users provided a list of single words or short expressions.
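The matching step can be illustrated with a minimal sketch. This is not the algorithm used in the project. It assumes a small hypothetical vocabulary, uses the similarity ratio from Python's standard `difflib` module as the typo tolerance, and picks a threshold of 0.85 purely for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical vocabulary for illustration; the real label set had 500+ entries.
VOCABULARY = ["vomiting", "diarrhea", "vaccination", "limping", "claw trimming"]


def similarity(a, b):
    """Similarity between two strings, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, a, b).ratio()


def find_matches(text, vocabulary=VOCABULARY, threshold=0.85):
    """Return the vocabulary terms that appear in the text, tolerating small typos.

    The threshold is an assumed value, not one taken from the project.
    """
    tokens = re.findall(r"\w+", text.lower())
    found = []
    for term in vocabulary:
        n = len(term.split())
        # Compare the term against every window of n consecutive words.
        windows = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
        if any(similarity(window, term) >= threshold for window in windows):
            found.append(term)
    return found


print(find_matches("Vomitting and diarhea since monday, also needs vacination"))
```

A matcher of this kind has no notion of context: an input such as "no vomiting" still yields "vomiting", which is the kind of case that motivated the language model described next.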

To handle speculations, negations and more descriptive sentences, we used a pre-trained language model (a model trained on a large publicly available dataset to gain some general language understanding), which we then fine-tuned on felmo’s data. We also worked together with felmo to improve the dataset, which ultimately led to an improvement in the model’s predictions.

This new model was as accurate on the list-like user inputs as our initial algorithm, but could now also recognize negations and infer symptoms from longer descriptions.
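Since each appointment reason can carry several labels at once, one common way to frame such a task is multi-label classification. The sketch below is an assumption about how this could look, not a description of the project's architecture: the label list, the 0.5 threshold and the function names are hypothetical, and the raw scores stand in for the output of a fine-tuned model.

```python
import math

# Hypothetical label list for illustration; the real one had 500+ entries.
LABELS = ["vomiting", "diarrhea", "vaccination", "limping"]


def encode_targets(annotated, labels=LABELS):
    """Turn a set of annotated labels into a multi-hot training target."""
    return [1.0 if label in annotated else 0.0 for label in labels]


def decode_predictions(logits, labels=LABELS, threshold=0.5, max_labels=10):
    """Turn one raw score per label into a list of predicted labels.

    Each score is squashed independently with a sigmoid, so any number of
    labels can be active. The cap of 10 mirrors the dataset's annotation
    limit; the threshold is an assumed value.
    """
    probabilities = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return [label for label, p in ranked if p >= threshold][:max_labels]


print(encode_targets({"vomiting", "limping"}))
print(decode_predictions([2.0, -3.0, 0.5, -0.1]))
```

Scoring each label independently, rather than picking a single best class, is what lets one text map to several symptoms, diseases or services at the same time.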

In the month-long proof of concept project we were able to:

  • show that a deep neural network could outperform algorithms using regular expressions on this task

  • present a model that already delivered production-ready results

Now that the essential information can be extracted from the appointment reasons, felmo can combine it with their experience and expertise to improve their processes, saving their vets precious time!
