The 4th annual dida conference
After three incredible years of insights and innovation, dida conference 2026 is officially on the horizon — deeper, broader, more focused than ever.
After three successful editions, the dida conference returns in 2026. Join us at the frizzforum in Berlin Kreuzberg for a day of applied and technical machine learning talks, hands-on workshops, networking, and good food. As the number of participants is limited, we recommend early registration.
Impressions from last year's event
This year, we're at frizzforum, Berlin Kreuzberg. Whether you're here for the deep-tech sessions, the hands-on workshops, or the high-level networking, this is a day you won't want to miss.

The confirmed dida conference 2026 lineup — research, industry, scientific machine learning, explainability, tabular foundation models, cryptography and PDEs.
Jonas Köhler
Member of Technical Staff · CuspAI
Simulating Molecules in the Age of AI
TALK PREVIEW
Many open problems in carbon capture, clean water, and the future of compute are fundamentally materials problems. Because the relevant chemical space is too large for trial and error, machine learning is now used throughout materials discovery. But predictions only become observables through statistical mechanics, so simulation cannot be removed from the pipeline. The infrastructure connecting these pieces was built decades ago for CPUs and offers little composition between methods, potentials, and ensembles, which leaves modern accelerators underused. In this talk, I present kUPS, an open-source JAX-native simulation engine developed at CuspAI in which Monte Carlo, molecular dynamics, relaxation, and arbitrary combinations thereof compose freely with classical or learned potentials in a single JIT-compiled program, with order-of-magnitude throughput improvements on production workloads. More broadly: as AI handles more application code, the engineering work shifts to designing small, well-typed primitives that humans and agents can compose. Molecular simulation, with its strict demands on physical accuracy, performance, and extensibility, is a useful test for that claim.
Gerhard Koch
Chief Software & Product Officer · Siemens Energy
Industrial & Energy AI challenges “solved” with a new structural and explainable AI approach
TALK PREVIEW
The energy transition demands precise control, yet in practice, many AI solutions fail due to unpredictable signal noise and a lack of reliability. This presentation introduces an innovative structural approach that overcomes these instabilities through a new principle of signal stability and adaptive classification logic. Instead of relying on raw computing power, we focus on the efficiency, transparency, and traceability of AI models to prevent critical errors in sensitive grid environments. We demonstrate how AI for the energy sector becomes truly reliable, sustainable, and explainable through situational awareness. Discover a solution designed to guarantee security of supply even under extreme real-world conditions.
Noah Hollmann
Co-Founder & CTO · Prior Labs
The Revolution in Tabular Data Prediction
TALK PREVIEW
Foundation models have transformed NLP and computer vision - but tabular data prediction still largely works the way it did ten years ago: pick an algorithm, engineer features, tune hyperparameters, repeat for every new dataset. In this talk, I present TabPFN, a foundation model that brings the same paradigm shift to tabular data. TabPFN is a transformer trained on billions of synthetic prediction problems that can be applied to new datasets in-context - replacing what used to be hours of pipeline work with a single API call that returns predictions in seconds. I will show how this works under the hood, how it compares to gradient-boosted trees and other baselines across hundreds of benchmarks, and what it means in practice: a simpler, faster, and more homogeneous workflow for tabular prediction. I will also discuss where we are headed next - from time series foundation models to agentic data science - and what we have learned bringing this from research into production.
Anastasia Borovykh
Professor, Imperial College London
Beyond left-to-right text generation with diffusion models
TALK PREVIEW
Diffusion models underpin image and video generation, using iterative refinement to produce realistic and original compositions. Today’s frontier LLMs, by contrast, mostly rely on autoregressive generation: predicting text left-to-right, one token at a time. But is this sequential paradigm really the best we can do for language, or can diffusion models also prove useful for discrete text data? In this talk, I will introduce diffusion-based approaches to language generation and explain how they differ from autoregressive LLMs. Their potential advantages include faster inference, more parallelizable generation, and new ways to trade off quality, latency, and compute. Could diffusion models eventually outperform autoregressive systems or is the future more likely to be hybrid, and what does this tell us about the search for architectures beyond today’s LLM paradigm?
Maria Bergmann
Development Engineer · Innomotics
Automated Vibration Prediction for Electric Motors via PDE Approximation
TALK PREVIEW
Finite element (FE) simulations are used to computationally predict the vibration behavior of an electric motor. They show how sensitive the structure is to the excitation mechanisms that occur and thus enable the prediction of the vibrations expected during operation of the motor. Design variants are tested, and ultimately a design is selected that ensures reliable operation with low vibration levels. We show how these simulations can be automated for a product line of large electric motors using open-source software. This automation significantly speeds up the design process but also reaches certain limits. Using this use case, we demonstrate the limits and difficulties that arise with conventional FE calculations. Looking ahead, we discuss whether AI could be used to improve the process.
Gabriel Nobis
Research Associate · Fraunhofer HHI
Fractional Noise in Diffusion-Based Generative Models
TALK PREVIEW
Most continuous-time diffusion models rely on Brownian motion as the source of randomness in the noising process, a memoryless mechanism characterized by independent increments. While this choice offers mathematical tractability, it constrains the expressivity of the noising dynamics, limits the richness of their temporal structure, and excludes noise processes with long-range temporal dependencies. After reviewing continuous-time diffusion models, this talk explores diffusion-based generative models driven by an approximation of fractional Brownian motion, a non-Markovian process with long-range temporal dependence. We further present applications of this novel generative paradigm to image generation, sampling, and the prediction of protein conformations.
Markus Düttmann
Managing Director · dida
Trusting the Agent: Citation-Backed Responses & LLM-as-a-Judge Evaluation
TALK PREVIEW
Further details about this talk will be announced soon.
Annette Rudolph
Professor · TU Berlin
How to feed Machine Learning algorithms with proper metrics?
TALK PREVIEW
The majority of multivariate statistics and machine learning algorithms expect Euclidean metrics on unconstrained data spaces. In practice, however, most variables are strictly positive and capped by physical constraints, which makes ordinary arithmetic measures meaningless. Ignoring these constraints may obscure meaningful patterns, produce spurious correlations, or yield senseless measures of model quality. Useful recipes to overcome common pitfalls in multivariate statistics and machine learning for (a) common physically constrained and (b) compositional data spaces will be presented with hands-on examples.
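To illustrate the kind of pitfall this abstract describes, here is a minimal Python sketch of one standard recipe from compositional data analysis, the centered log-ratio (clr) transform. It is an assumption that this is among the recipes the speakers will present; the function names and the example compositions are hypothetical and chosen only for illustration.

```python
import math


def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    if any(x <= 0 for x in composition):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log removes the arbitrary overall scale.
    return [v - mean_log for v in logs]


def euclidean(a, b):
    """Plain Euclidean distance on the raw values."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def aitchison_distance(a, b):
    """Euclidean distance between the clr-transformed compositions."""
    return euclidean(clr(a), clr(b))


# Hypothetical three-part compositions (shares that sum to 1).
a = [0.01, 0.49, 0.50]
b = [0.02, 0.48, 0.50]  # the trace component doubles
c = [0.30, 0.20, 0.50]
d = [0.31, 0.19, 0.50]  # a major component changes by about 3 percent

# The raw Euclidean distance treats both changes as (almost exactly) equal,
# while the log-ratio distance reflects the much larger relative change in a -> b.
print("raw:      ", euclidean(a, b), euclidean(c, d))
print("log-ratio:", aitchison_distance(a, b), aitchison_distance(c, d))
```

The log-ratio distance is also unaffected by rescaling a composition, for example from shares to percentages, which is one reason it suits data that only carries relative information.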
Peter Kristel
Research Officer, Key Technologies · Cyberagentur
AI-Methods in Symmetric Cryptography
TALK PREVIEW
A block cipher is a cryptographic primitive that takes a key and a message, both of a fixed length, and turns them into a ciphertext. Ideally, one can only recover the message from the ciphertext by using the key. However, a theorem of Shannon from 1949 says that perfect secrecy can only be achieved if the key is at least as long as the message, which is a completely impractical constraint. In practice, the key is much shorter than the ciphertext. This means that statistical analysis of the ciphertext reveals some information about the message (or even the key). Thus, when developing a block cipher, the goal is to make this statistical analysis computationally infeasible. While there are design principles for block ciphers, the proof is in the pudding, and the only way to determine if a cipher is secure is to try to break it. I will explain some basics of block cipher design, and then review some recent work applying machine learning to the analysis of block ciphers. After this overview, I would encourage the participants to brainstorm and discuss what kind of AI methods could be promising for future research.
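The perfect-secrecy constraint mentioned in this abstract can be illustrated with the one-time pad, the textbook cipher that attains perfect secrecy by using a fresh random key as long as the message. The following is a minimal Python sketch, not material from the talk; the helper name `xor_bytes` and both example messages are hypothetical.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key that is at least as long as the data."""
    if len(key) < len(data):
        raise ValueError("one-time pad key must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(message))  # fresh random key, used only once

ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)  # XOR with the same key decrypts
print(recovered == message)

# Why the ciphertext alone reveals nothing: for ANY candidate message of the
# same length there is a key that maps it to this exact ciphertext.
candidate = b"RETREAT AT TEN"
alternative_key = xor_bytes(ciphertext, candidate)
print(xor_bytes(ciphertext, alternative_key) == candidate)
```

A block cipher gives up this guarantee in exchange for a short, reusable key, which is why its security has to be argued computationally and tested by cryptanalysis.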
Leonard Schenk
Senior Analyst & Engineer · SPRIND
The European AI Playbook - Why, What and How
TALK PREVIEW
Europe’s reliance on US and Chinese AI models creates systemic strategic vulnerabilities. This workshop outlines a playbook for establishing sovereign Frontier AI within Europe. Rather than replicating US-centric LLM scaling, we prioritize AI that aligns with European industrial strengths. The session details the strategic imperative (“Why”), specific focus domains like robotics or energy-efficient architectures that fit the local landscape (“What”), and implementation via distributed training and semiconductor sovereignty (“How”). The outcome is a roadmap for deployable, “continent-aligned” models for long-term competitiveness.
Benjamin Ulbrich
Director Global Digital Processes · Innomotics
Automated Vibration Prediction for Electric Motors via PDE Approximation
Ma Li
ML Scientist · dida
dida Reading Group
TALK PREVIEW
dida has a weekly internal reading group where our ML scientists discuss recent papers and how they can help with our projects.
Papers presented will be announced soon.
Tom Freudenberg
Postdoctoral Researcher · Uni Bremen
TorchPhysics and Beyond: Making Deep Learning for PDEs Accessible
TALK PREVIEW
Deep learning has become a promising tool in surrogate modelling, offering flexible alternatives to classical numerical methods for PDEs. However, practical application remains challenging due to the complexity of incorporating physical constraints, selecting suitable architectures, and designing effective training strategies. In this talk, we first give an introduction to physics-informed and operator learning, discussing advantages and challenges based on our experience in industrial applications. We then present our work on developing software tools to simplify and structure this process. This includes a modular framework designed to support experimentation with physics-informed models, as well as ongoing work on automated guidance through algorithm suggestions and structured design choices.
Nick Heilenkötter
Research Associate · Uni Bremen
TorchPhysics and Beyond: Making Deep Learning for PDEs Accessible
Kai Hartmann
Senior Lecturer · FU Berlin
How to feed Machine Learning algorithms with proper metrics?
Lorenz Richter
Managing Director & CTO · dida
Moderation
Johannes Müller
Postdoctoral Researcher · TU Berlin
Resolving Ill-Conditioning in Scientific Machine Learning
TALK PREVIEW
We introduce an “optimize-then-project” framework for scientific machine learning that addresses numerical instability caused by differential operators in training objectives. The core concept is to first formulate optimization algorithms in a continuous (infinite-dimensional) setting, and then discretize them within the tangent space of the chosen neural network model. We demonstrate how this approach can be applied to practical settings, including physics-informed neural networks and variational Monte Carlo methods for quantum many-body systems - areas where neural network-based models are increasingly gaining traction. We also discuss key considerations for scaling these methods to larger, real-world problems.
Philipp Trunschke
Postdoctoral Researcher · PTB
Tensor Trains for High-Dimensional PDEs
TALK PREVIEW
Many industrial PDE models become computationally challenging once parameters, uncertainty, or multiscale resolution are included. This talk introduces tensor trains as a low-rank representation for PDE solutions, coefficients, and operators, offering a structured alternative to physics-informed neural networks. We sketch the tensor train format and the basic numerical toolbox, highlight where these methods can pay off in practice, and contrast them with neural networks.
Bela Baganz
People & Communications Manager · dida
AI Is Making Some Human Factors More Salient at Work - How Do We Work with Them?
TALK PREVIEW
AI is taking over a wide variety of tasks and changing how we work at a pace that makes adaptation necessary, as more and more tasks are identified as automatable. But it is the parts of work that AI cannot take over that show us which processes might remain exclusively human, or, in a way, what makes us human after all. These factors deserve attention because they will become more salient and more important to understand. If that is true, how can we diagnose and develop these "human factors", and is this actually a new problem? In this 90-minute participative session, we'll work through the hypothesis that AI is showing us what being human at work means, review the empirical evidence on "human factors", and consider which parts of being human we might want to focus on in the future of work.
Felix Zeh
Exec. Director Marketing Intelligence · Ketchum
The Strategic Question: What Are We Actually Measuring - and For Whom?
TALK PREVIEW
Brand visibility in generative AI systems is not a technical problem dressed up as a business problem. It is a strategic problem that happens to require technical infrastructure to answer. The question of whether a brand appears in an LLM response only becomes meaningful when it is tied to a specific person, a specific situation, and a specific moment in a decision process. Without that context, a visibility score is a number without a reference frame.
Felix's part of the talk addresses the upstream questions that determine whether a GEO (generative engine optimization) measurement is actionable or merely interesting. Starting from the S-O-R (stimulus-organism-response) model, in which the LLM is the organism, the response is probabilistic, and the prompt is the only controllable stimulus, he will argue that the real methodological challenge is not running the measurement, but defining what should go into it.
Which audiences matter? What decision stages are relevant? What does the competitive landscape actually look like inside an LLM, as opposed to in a traditional market definition? And when the data comes back, what does it mean for content strategy, earned media, and third-party validation?
Drawing on cross-industry findings from live GEO analyses, Felix will walk through the eight systematic patterns that hold across markets, languages, and LLM configurations — including the branded-to-unbranded visibility collapse, the role of independent authorities as citation anchors, and the structural difference between what a brand publishes and what an LLM actually cites.
He will close with the strategic implications: which levers work, why most GEO discourse is solving the wrong problem, and where the methodology still has unresolved questions about external validity and persona construction.
Gustav Block
Head of Data Analytics · Ketchum
The Technical Setup: Building a Measurement System for Probabilistic Outputs
TALK PREVIEW
Measuring brand visibility in LLMs requires infrastructure that does not yet exist off the shelf. LLMs produce no impression logs, no rankings, no defined selection mechanism. Every measurement is an inference from a sample of prompted outputs, which means the validity of the measurement depends entirely on the validity of the prompts, and the reliability depends on how stochastic variance is controlled across runs.
Gustav's part of the talk covers the technical architecture his team built to run GEO analyses at scale: direct API access across ChatGPT, Gemini, Claude, and Perplexity with full model version pinning; a parallel execution pipeline with configurable batch sizes and retry logic; a parsing pipeline for brand extraction, source classification, and domain categorisation; and a proprietary visibility index that combines exposure share and run-consistency into a single comparable score.
He will explain why each of these design decisions matters for the integrity of the output — and what breaks if they are skipped.
The core of the talk is the prompt as a high-dimensional variable. Persona, phrasing, knowledge state, funnel stage, and model choice each produce measurable variance in output. Gustav will show empirically that the phrasing delta between two semantically equivalent prompts regularly exceeds the persona effect — and that removing the brand name from a prompt reduces share of voice by 85 to 95 percent, a finding that holds across both cases in the dataset.
He will then describe the methodological response: psychometrically grounded synthetic personas, structured funnel definitions, prompt families validated for human plausibility before every API call, and a minimum of three runs per query to make stochastic variance tractable. The talk closes with an honest account of what this setup still cannot solve — and an open question to the audience about where the methodology needs to go.
Irena Bojarovska
Applied Scientist · Zalando
Beyond Accuracy: Agentic Workflows and Foundation Models in Demand Forecasting
TALK PREVIEW
Training and maintaining accurate forecasting models is a significant operational hurdle. At Zalando, this means managing 700+ specialized time-series models predicting demand and other target variables across 25+ markets. Recent foundation models offer a potential all-in-one solution, raising the critical question of whether pre-trained architectures can replace years of specialized tuning. Yet, even if a pre-trained model dominates standard benchmarks, accuracy alone is not enough. To be genuinely useful for downstream business decisions, forecasts must be fundamentally stable and reliable when exposed to the noisy realities of retail data.
In this talk, we share our experience testing these foundation models in a production-like environment. Moving beyond standard accuracy metrics, we demonstrate how we used Agentic AI workflows to rapidly translate academic research on forecast stability into interactive, reliable decision systems. This session offers data professionals a real-world roadmap for delivering high-quality forecasts without sacrificing operational trust.
Jannis Chemseddine
Research Associate · TU Berlin
Advances in Continuous Diffusion Language Models
TALK PREVIEW
In images and video, continuous diffusion models are the dominant generative approach. In language the situation is different. Autoregressive models lead, and discrete diffusion has become the most promising alternative. Continuous diffusion saw some early attention, which then faded in favor of discrete diffusion. That is beginning to change.
We will introduce flow-based generative modeling and show how it can be applied to discrete data such as language. We will outline the key design decisions this involves, survey the recent work behind the renewed interest and present our own contribution at this frontier, which builds flows on the sphere using von Mises–Fisher distributions.
Arrive, check in and settle in before the programme starts.
Welcome to the dida conference.
Food and drinks will be available for all guests.
Poster details will be announced soon.
Scientific machine learning, simulation, PDEs, physics-informed AI and engineering applications.
Room 1
Human factors, sovereign AI, applied AI methods and practical industry perspectives.
Room 2
Interactive discussions, peer exchange and focused workshop sessions throughout the day.
Room 3
Informal networking before and after the conference programme. Evening gathering with dinner, drinks and networking.
Rooftop
Morning breakfast gathering for invited guests.
Dinner, drinks and networking on the rooftop terrace.
We recommend early registration as seats are limited. Attendance is free for this invitation-based event.