© unsplash/@campaign_creators

Retail, E-commerce & Marketplaces

Product feature extraction and standardization

Context

Customers on e-commerce websites search for products by specific features, e.g. blue jeans or the camera resolution of a smartphone. The ability to filter and quickly find the right product can be a major differentiator in the market and drives purchases and conversions for e-tailers and marketplaces.

Challenges

The feature information exists, but it is often spread across different input sources and stored in different formats. For example, a fashion retailer wants customers to be able to filter by material, dominant colour and garment type. However, fashion brands and e-tailers that market their products on platforms such as Amazon Marketplace provide differing information, sometimes even for the same product. This information needs to be unified so that it is searchable and users can filter by a fixed terminology within a website.

Potential solution approaches

Approximate string/pattern matching algorithms have delivered good results on similar tasks in the past. The features or attributes can be hand-crafted or automatically generated and then used in rule-based approaches that match similar items and categorize them accordingly. This approach can work where descriptions are relatively similar and their meaning does not depend on context. If the context (e.g. product category) changes the semantics of a description, deep neural networks such as CNNs or RNNs are alternatives, as are contextual language models such as BERT.
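As a rough illustration of the approximate string matching idea, the following minimal sketch maps free-text attribute values onto a fixed vocabulary using Python's standard `difflib` module. The vocabulary `CANONICAL_COLOURS`, the helper `standardize` and the cutoff of 0.7 are assumptions made for this example only; they are not taken from an actual project, and a real system would need a tuned threshold and a much larger term list.

```python
from difflib import get_close_matches

# Hypothetical controlled vocabulary that a shop might offer as filter values.
CANONICAL_COLOURS = ["blue", "navy", "black", "white", "red", "green", "grey"]


def standardize(raw_value, vocabulary, cutoff=0.7):
    """Return the closest term from `vocabulary`, or None if nothing is similar enough.

    `cutoff` is the minimum similarity ratio (0..1) accepted by difflib;
    the default of 0.7 is an illustrative assumption, not a recommended value.
    """
    cleaned = raw_value.strip().lower()
    matches = get_close_matches(cleaned, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Differently spelled supplier values for the same attribute.
    for value in ["Blu", "GREY ", "gray", "chartreuse"]:
        print(repr(value), "->", standardize(value, CANONICAL_COLOURS))
```

Values with no sufficiently similar vocabulary entry come back as `None`, so they can be routed to manual review or to a context-aware model instead of being forced into a wrong category.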

Related webinars

Text recognition (OCR) - The first step on the way to a successful implementation of an NLP project

In this talk we will deal with the topic of text recognition.

Ewelina Fiebig

Machine Learning Scientist

Fabian Gringel

Machine Learning Scientist

Labeling Tools - The second step on the way to the successful implementation of an NLP project

The success of an NLP project depends on a series of steps from data preparation to modeling and deployment. Since the input data are often scanned documents, the data preparation step initially involves text recognition tools (OCR for short) and later also so-called labeling tools. In this webinar we will deal with the topic of selecting a suitable labeling tool.

Ewelina Fiebig

Machine Learning Scientist

Fabian Gringel

Machine Learning Scientist

Semantic search and understanding of natural text with neural networks: BERT

In this webinar you will get an introduction to the application of BERT for Semantic Search using a real case study: Every year millions of citizens interact with public authorities and are regularly overwhelmed by the technical language used there. We have successfully used BERT to deliver the right answer from government documents with the help of colloquial queries - without having to use technical terms in the queries.

Konrad Schultka

Machine Learning Scientist

Jona Welsch

Machine Learning Scientist

Recurrent neural networks: How computers learn to read

The webinar will give an introduction to the functioning of RNNs and illustrate their use in an example project from the field of legal tech.

Fabian Gringel

Machine Learning Scientist