Recorded talks


Open NLP meetup: Ethics in Natural Language Processing


Marty Oelschläger

November 5th, 2023


This talk covers two main topics. The first part delves into ethical considerations in Natural Language Processing (NLP): how language models can be developed and used responsibly, with attention to data privacy, algorithmic bias, and the societal impact of automated language systems. The second part is a hands-on introduction to image retrieval, explaining the techniques and algorithms that make it possible to search for images by content, metadata, or descriptive tags. This could include demonstrations of image indexing, feature extraction, and the use of search queries to navigate large image databases effectively.

Detecting Convective Clouds in Geostationary Satellite Data


William Clemens

November 4th, 2023


Detecting convective clouds is crucial for weather forecasting and climate studies. In his work, William Clemens, a Machine Learning Scientist at dida, leverages Convolutional Neural Networks (CNNs) to analyze geostationary satellite data for this purpose. CNNs are particularly adept at image recognition tasks, making them suitable for identifying the complex patterns and structures characteristic of convective clouds. Clemens's approach likely involves training the CNNs on large datasets of satellite imagery labeled with the presence of convective clouds, enabling the model to learn the distinguishing features of these clouds.

Information extraction: from graph neural networks to transformers


Augusto Stoffel

October 23rd, 2023


This talk aims to compare two prominent classes of models used in information extraction from semi-structured documents: Graph Neural Networks (GNNs) and specialized transformer-based architectures. While transformers are renowned for their text processing capabilities and come with pretrained weights, GNNs have the benefit of requiring much less computational power. The objective is to evaluate how these two types of models perform in practical scenarios, based on both project experience and internal research.

Graph neural networks for information extraction with PyTorch


Augusto Stoffel

October 23rd, 2023


In this talk, Augusto Stoffel introduces graph neural networks (GNNs) by comparing them to convolutional neural networks (CNNs). He shows how an image can be represented as a graph, which leads naturally into the basics of GNN architecture. The talk then covers Python implementations, particularly in the PyTorch framework, and focuses on applications of GNNs to information extraction from tabular documents in NLP.
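The idea of treating an image as a graph can be sketched in a few lines. The example below is an illustration only and is not taken from the talk: the function names `image_to_graph` and `mean_aggregate` are hypothetical, the 4-neighbour connectivity is an assumption, and plain Python lists stand in for the tensors one would use in PyTorch.

```python
def image_to_graph(height, width):
    """Build the edge list of a grid graph: one node per pixel,
    one undirected edge between horizontally or vertically adjacent pixels.
    (4-neighbour connectivity is an assumption made for this sketch.)"""
    def node_id(row, col):
        return row * width + col

    edges = []
    for row in range(height):
        for col in range(width):
            if col + 1 < width:   # edge to the right-hand neighbour
                edges.append((node_id(row, col), node_id(row, col + 1)))
            if row + 1 < height:  # edge to the neighbour below
                edges.append((node_id(row, col), node_id(row + 1, col)))
    return edges


def mean_aggregate(features, edges):
    """One simplified message-passing step: replace each node's value
    with the mean over itself and its neighbours."""
    sums = list(features)          # start with each node's own value
    counts = [1] * len(features)
    for u, v in edges:             # edges are undirected, so update both ends
        sums[u] += features[v]
        sums[v] += features[u]
        counts[u] += 1
        counts[v] += 1
    return [s / c for s, c in zip(sums, counts)]


if __name__ == "__main__":
    edges = image_to_graph(2, 2)
    print(edges)
    print(mean_aggregate([1.0, 0.0, 0.0, 0.0], edges))
```

On a regular pixel grid this neighbour averaging behaves much like a small fixed convolution filter, which is the sense in which a CNN can be viewed as a special case of a GNN. On an irregular graph, such as text boxes in a document, the same step works unchanged because it only relies on the edge list.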

Semantic search and understanding of natural text with neural networks: BERT


Konrad Schultka and Jona Welsch

September 17th, 2020


In this webinar you will get an introduction to applying BERT to semantic search, based on a real case study: every year, millions of citizens interact with public authorities and are regularly overwhelmed by the technical language used there. We have successfully used BERT to return the right answer from government documents in response to colloquial queries, without requiring technical terms in the queries.


Labeling Tools - The second step on the way to the successful implementation of an NLP project


Ewelina Fiebig and Fabian Gringel

May 19th, 2020


A successful NLP project consists of a series of steps, from data preparation to modeling and deployment. Since the input data are often scanned documents, data preparation initially involves text recognition (OCR) tools and, later on, so-called labeling tools. In this webinar we will discuss how to select a suitable labeling tool.