You are invited to join us for a day full of machine learning.
A great space for learning and networking.
The conference will take place at B-Part, Berlin. The exact program will be announced soon. Like last year, food will be provided throughout the day. Since capacity is limited, we encourage you to register early.
This year we introduce the concept of a thematic stream: in 2024, the Public Sector stream.
In light of our recent uptick in public sector machine learning projects and of legislative changes, we're launching a stream dedicated to the public sector. The details of the two relevant program points are outlined below; these program points are marked with a public sector badge.
Please register to reserve a place. Limited places are available. We look forward to seeing you in May!
Conference Schedule
Doors open
Arrive and connect with others.
Introduction
Welcome to the dida conference.
Decision Process Automation with Large Language Models
Large Language Models impress with their adeptness in context-aware text generation and reasoning. Downstream models fine-tuned on chat data have the remarkable ability to be directed towards solving tasks described in natural language, without further explicit weight adaptation. In relevant applications, interesting use cases often relate multiple external data sources to each other and are characterized by a complex, multi-step decision process. In this talk, we discuss how predefining decision steps and integrating external data filtering can break down multifaceted problems into manageable, self-contained language processing tasks that can readily be solved by LLMs.
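The decomposition idea can be sketched as a pipeline of predefined steps, each a self-contained language task whose answer feeds into later steps. This is an illustrative sketch only: the `llm` function is a hypothetical rule-based stand-in for a real chat-model call, and the step names and prompts are invented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

def llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to a chat-tuned LLM.
    A toy rule-based stub keeps the sketch self-contained and runnable."""
    if "classify" in prompt:
        return "invoice"
    if "extract" in prompt:
        return "2024-05-01"
    return "approve"

@dataclass
class DecisionStep:
    name: str                            # key under which the answer is stored
    build_prompt: Callable[[Dict], str]  # turns the shared context into a prompt

def run_pipeline(steps: List[DecisionStep], context: Dict) -> Dict:
    """Run the predefined decision steps in order; each step is a
    self-contained language task whose answer is written back into
    the shared context, so later steps can build on earlier answers."""
    for step in steps:
        context[step.name] = llm(step.build_prompt(context))
    return context

steps = [
    DecisionStep("doc_type", lambda c: f"classify this document: {c['text']}"),
    DecisionStep("due_date", lambda c: f"extract the due date: {c['text']}"),
    DecisionStep("decision", lambda c: f"decide, given a {c['doc_type']} due {c['due_date']}"),
]

result = run_pipeline(steps, {"text": "Invoice, payable by 2024-05-01."})
print(result["decision"])  # → approve
```

The same pattern extends naturally to external data filtering: a step's `build_prompt` can query a database or document store before constructing its prompt.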
Pretraining AI models for earth observation: transfer-learning and meta-learning
Pretraining involves training an AI model on a large dataset to learn general features, which can then be fine-tuned on specific tasks with smaller datasets. This decreases the need for time-intensive dataset acquisition and training efforts for each new use case, reducing the costs of application development. While pretrained models are widely used in computer vision and natural language processing, their adoption for satellite data and earth observation applications remains limited. Our investigation focuses on comparing the capabilities of transfer-learning and meta-learning approaches for the pretraining of AI models in earth observation tasks, particularly crop type classification, and their potential to generalize insights across different geographical regions.
Coffee break
Anomaly Detection in Track Scenes
Within the sector initiative “Digitale Schiene Deutschland”, our client Deutsche Bahn is developing an automated driving system for trains. As part of the efforts towards such a system, we developed, together with Deutsche Bahn, a machine learning solution that detects anomalous and hazardous objects on and around the tracks using onboard RGB cameras. The system is intentionally required not simply to detect objects within a given collection of classes (such as people, signals or vehicles), but to be able to detect any object and rank objects by how anomalous they are. This presentation explains the challenges encountered, presents several approaches explored, and provides an overview of the final solution: in order to detect objects of possibly unknown classes, we developed a unique pipeline containing multiple machine learning components, including a monocular depth estimation model, a segmentation stage, image embedding models and an anomaly detection model. As a dataset, Digitale Schiene Deutschland provided us with OSDAR23, an open dataset that contains 45 scenes. Each scene contains images taken by several RGB and infrared cameras, together with radar and lidar data. The dataset contains annotations for twenty classes of objects, which we use both for fine-tuning our model and for evaluating the final results. In addition, we were granted access to a larger amount of unannotated data, which was used for self-supervised learning.
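As a toy illustration of the embedding-based anomaly ranking idea, consider scoring each detected object by the distance of its embedding to the nearest embedding of ordinary scene content. The 2-D embeddings and object names below are invented for the sketch; the real pipeline uses high-dimensional learned image embeddings.

```python
import math

# Reference embeddings of ordinary scene content (invented 2-D toy values).
references = [(0.0, 0.0), (1.0, 0.0)]

def anomaly_score(embedding, reference_embeddings):
    """Score an object by the distance of its embedding to the nearest
    reference embedding: far from everything ordinary means anomalous."""
    return min(math.dist(embedding, ref) for ref in reference_embeddings)

# Invented detections from a track scene.
objects = {"signal": (0.9, 0.1), "stray object": (4.0, 3.0)}

# Rank detected objects from most to least anomalous.
ranked = sorted(objects, key=lambda name: anomaly_score(objects[name], references),
                reverse=True)
print(ranked)  # → ['stray object', 'signal']
```

Because the score is a distance rather than a class probability, objects of entirely unseen classes can still surface at the top of the ranking.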
Lunch Break
Food and drinks will be available for all guests.
Diffusion Models for Speech Enhancement
Diffusion models have emerged as a distinct class of generative models with an impressive ability to learn complex data distributions such as those of natural images, music, and human speech. In the context of speech enhancement, diffusion models can be used to learn the conditional distribution of clean speech given the noisy mixture. Following this idea, we have proposed the method “Score-based Generative Models for Speech Enhancement” (SGMSE), a continuous-time diffusion model based on an Ornstein-Uhlenbeck process. In our experiments, we show speech enhancement performance competitive with predictive baselines, and better generalization when evaluated in a mismatched training scenario. Subjective listening tests show that, on average, the enhanced speech is preferred over the predictive baselines and is often perceived as natural-sounding. However, for very challenging inputs, the model tends to hallucinate and generates speech-like sounds without semantic meaning. To address this problem, we have combined predictive and generative approaches and conditioned the model on visual input of the speaker’s lip movements. Moreover, to improve robustness and address the slow sampling speed of diffusion models, we have used a Brownian bridge as the underlying stochastic process and proposed a two-step training scheme for diffusion-based speech enhancement that enables single- and few-step generation.
Data Extraction in the Age of LLMs
In recent years, the advent of Large Language Models (LLMs) has changed the landscape of data extraction. These LLMs boast unparalleled text processing capabilities and come pre-trained on vast amounts of data, rendering them effective for information retrieval tasks. However, traditional methods such as graph neural networks and extractive models have historically been favored for their efficiency in resource utilization. Despite this, the question persists: how do LLMs compare with those models in practical data extraction applications? This presentation aims to delve into this inquiry, providing a comprehensive examination of LLMs' advantages and disadvantages compared to extractive models. Drawing from our project experiences and internal research, we aim to elucidate the practical implications of utilizing LLMs for data extraction, offering insights into their efficacy, resource requirements, and overall performance in real-world scenarios. Through this exploration, attendees will gain a deeper understanding of the role of LLMs in modern data extraction workflows and the considerations involved in their implementation.
Coffee break
On the Geometry of Images in Human and Neural Network Representation Spaces
The relationship between human cognition and neural network interpretation of the world is one of the most intriguing questions in modern AI research. This talk explores a recent line of inquiry into this question using images and addresses three key issues: How can we derive a geometric object that encapsulates human understandings of images? How does the geometry of these image representations compare to that derived from neural network representations? Can we realign a neural network representation to achieve a more human-like understanding of the world, thereby improving neural network performance?
New Opportunities: Applications of Machine Learning Technologies in the Public Sector
This presentation dives into the transformative potential of machine learning (ML) technologies in the public sector, highlighting opportunities for efficiency, transparency, and improved service delivery. Using a case example from practice, it illustrates how ML applications have already found their way into public processes.
Ambient DJ Set: Karolina
Check out Karolina on SoundCloud.
Poster Session
Jonas Golde (HU Berlin): fabricator - An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs
Fabio Barth (HU Berlin): Occiglot: A research collective for open-source development of Large Language Models by and for Europe
Jacek Wiland, Max Ploner (HU Berlin): BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models
Mark Budgen (dida): Defect detection and classification for unroasted coffee beans
Maximilian Trescher (dida): Artificial intelligence for detecting respiration in dairy cows
William Clemens (dida): LaserSKI - Defect Detection in Laser Diodes
Konrad Mundinger (Zuse Institute Berlin): Neural Parameter Regression for Explicit Representations of PDE Solution Operators
Julius Richter (Universität Hamburg): Diffusion Models for Audio-Visual Speech Enhancement
Jakob Wagner (appliedAI): Neural Operators: Solutions with the Continuity Framework
Faried Abu Zaid (appliedAI): Applications of Uniformly Scaling Flows
Robert Müller (appliedAI): Imitation Actor Critic
Dinner & Networking
Food and drinks will be available for all guests.
Workshops: Room 1
Generative AI and legal implications
Generative AI is the talk of the town at the moment. Every week, a new model is released that can be used to generate videos, images, music or text. Many companies are currently faced with the question of whether and how they can utilise such AI. The legal framework plays an important role here.
The EU is currently working on new rules for artificial intelligence, but existing laws also contain guidelines and specifications for the use of AI. It is important to keep the entire life cycle of AI in mind, from training to use in real life. Each of these steps can give rise to complex problems, some of which require creative solutions. Christian Dürschmied will introduce you to the potentials and risks associated with the use of AI. Learn all about the do's and don'ts in relation to Generative AI.
Who develops AI? Who benefits from AI?
We want to take a closer look at who the minds behind world-changing ML models are, and who actually benefits from the mined data and the deployed models. We will look at some worrying developments as well as some ideas on how to design participatory AI.
Fusion of Innovation and People – How are Intrapreneurship and Psychological Safety interconnected in modern work dynamics?
In the contemporary work landscape, the interaction between Intrapreneurship and Psychological Safety is crucial, as both concepts play integral roles in fostering an environment conducive to innovation and risk-taking within organizations. Intrapreneurship empowers employees to embrace entrepreneurial roles within the company, driving creativity and initiative, whereas Psychological Safety ensures that work teams feel safe taking interpersonal risks and sharing their ideas without fear of reprisal. We want to explore together which insights and methods lead to a fusion of innovation and the well-being of the people who define the organisation and its performance.
Ideation Workshop to explore potentials of computer vision and NLP in the public sector
Join us for a 45-minute workshop where we will explore the fields of application and potentials of computer vision and NLP in the public sector. We will connect with each other, co-work and share ideas on existing best practices and future ML ideas tailored to stakeholder needs.
Workshops: Room 2
Flexible training pipelines with Pytorch Lightning and Hydra
Training machine learning models requires rapid iteration and experimentation with hyperparameters, architectures and even entire approaches. At the same time, an extensive training codebase is difficult to maintain. In this workshop we introduce a modular training framework powered by PyTorch Lightning and Hydra that enables components to be swapped seamlessly, as a potential solution to these challenges.
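The core pattern such frameworks build on is Hydra's `_target_`-based instantiation, where a config entry names the class to construct and the remaining keys become constructor arguments. The sketch below re-implements that idea in a few lines of plain Python so it runs without the hydra-core dependency; a real setup would use `hydra.utils.instantiate` on an OmegaConf config.

```python
import importlib

def instantiate(cfg: dict):
    """Minimal re-implementation of the Hydra `_target_` pattern: the
    config names a class by its import path, and all other keys are
    passed to the constructor. (Real code: hydra.utils.instantiate.)"""
    module_name, cls_name = cfg["_target_"].rsplit(".", 1)
    cls = getattr(importlib.import_module(module_name), cls_name)
    kwargs = {k: v for k, v in cfg.items() if k != "_target_"}
    return cls(**kwargs)

# Swapping a component then only requires changing the config, e.g.
#   python train.py model.optimizer._target_=torch.optim.AdamW
# Here we instantiate a stdlib class so the sketch runs anywhere:
config = {"_target_": "collections.Counter", "red": 2, "blue": 1}
counter = instantiate(config)
print(counter["red"])  # → 2
```

Because the model, datamodule and trainer are all created this way, an experiment variant is just a different config file, not a code change.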
LangChain Expression Language for building LLM production pipelines
LangChain is widely known as a prominent Python library for interfacing with LLMs. However, it was primarily used for constructing proofs of concept (POCs), as it lacked the capability to build intricate, scalable applications. In this workshop, we provide an overview of the capabilities of the LangChain Expression Language (LCEL) to enhance the efficiency and flexibility of constructing chain components.
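The heart of LCEL is composing runnables with the `|` operator, as in `chain = prompt | model | output_parser`. The toy `Runnable` class below mimics that composition pattern without the LangChain dependency; the three components are invented stand-ins for a prompt template, a chat model and an output parser.

```python
class Runnable:
    """Toy stand-in for LCEL's composable runnables: `|` chains
    components so the output of one becomes the input of the next."""
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        return Runnable(lambda x: other.invoke(self.invoke(x)))

# Invented stand-ins for a prompt template, a chat model and an output parser.
prompt = Runnable(lambda topic: f"Tell me a joke about {topic}.")
model = Runnable(lambda p: f"  model answer to: {p}  ")
parser = Runnable(lambda s: s.strip())

chain = prompt | model | parser  # LCEL-style composition
answer = chain.invoke("bears")
print(answer)  # → model answer to: Tell me a joke about bears.
```

In LangChain itself, chains built this way also gain batching, streaming and async execution for free, which is what makes LCEL suitable for production pipelines rather than just POCs.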
Paper reading group
dida has a weekly internal reading group where our ML scientists discuss recent papers and how they can help with our projects. In this session we'll hold a live reading group meeting. We will discuss the papers LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images and KAN: Kolmogorov–Arnold Networks.
Ideation Workshop to explore potentials of computer vision and NLP in the public sector
Join us for a 45-minute workshop where we will explore the fields of application and potentials of computer vision and NLP in the public sector. We will connect with each other, co-work and share ideas on existing best practices and future ML ideas tailored to stakeholder needs.
Fabian Dechent
Fabian studied theoretical physics at the Humboldt University of Berlin and is a machine learning scientist at dida, currently specializing in LLMs.
Dr. Jan Macdonald
Jan holds a PhD in mathematics (TU Berlin), focusing on applied topics in optimization, functional analysis, and image processing. At dida he works as a machine learning scientist.
Dr. Maximilian Trescher
Max obtained his PhD in theoretical quantum and solid state physics from FU Berlin. At dida, he works as a machine learning scientist and project lead.
Julius Richter
Julius Richter is a PhD student in machine learning at Universität Hamburg, focusing on audio processing with diffusion-based generative modeling.
Axel Besinger
At dida, Axel works at the intersection of business development and customer engineering. He is the product lead of smartextract.
Dr. Augusto Stoffel
Augusto holds a PhD in mathematics (University of Notre Dame, USA) and did research in the field of algebraic topology and its application as a foundation of quantum field theory. At dida he works as a machine learning scientist.
Holger Pannhorst
Holger is an economist with a background in statistics. With more than 15 years of experience in the data and analytics world, he now leads the data and analytics department at the Bundesdruckerei.
Dr. Robert Vandermeulen
Robert is a machine learning postdoc at TU Berlin, focusing mainly on deep anomaly detection and nonparametric statistics.
Arne Doll
After graduating in psycholinguistics, Arne gained experience in various management and leadership positions. He loves working with people and has a passion for successful communication depending on specific contexts and situations.
Christian Dürschmied
Christian works as an attorney for Eversheds Sutherland, primarily focusing on privacy and data protection law and their connections to cybersecurity and data law in general. He is a regular speaker on these topics and an author in professional publications.
Dr. Liel Glaser
Liel holds a PhD in theoretical physics from the Niels Bohr Institute in Copenhagen. At dida, Liel works as a machine learning scientist.
Bela Baganz
Bela studies organizational and behavioral psychology. At dida, he supports all areas of personal and organizational development.
Ksenia Nuykina
Ksenia studies innovation and entrepreneurship at IU International University of Applied Sciences in Berlin. At dida, Ksenia supports business development, market analysis, and customer outreach.
Anton Shemyakov
Anton studied applied mathematics and focuses on building robust machine learning systems. At dida, he works as a machine learning scientist.
Thanh Long Phan
Long studied mathematics (HU Berlin) with a focus on differential geometry and functional analysis. At dida, he works as a machine learning scientist, with a special focus on LLMs.
Dr. William Clemens
Will holds a PhD in string theory and quantum chromodynamics from the University of Southampton. At dida, he works as a machine learning scientist.
Julius Lauenstein
Julius has a background in engineering and management, is interested in the bridge between technology and its application, and is motivated by digitally competent governance. Julius is responsible for dida’s activities in the public sector.