Thesis topics

Master’s and Bachelor’s Thesis Projects

Please contact the co-supervisor if you are interested in a thesis.


Metabolic Network Reconstruction and Optimization

Motivation:

Inferring and manipulating the metabolic network of single-cell organisms holds great promise for synthesizing novel chemical substances. Together with the Fraunhofer-Institut für Grenzflächen- und Bioverfahrenstechnik IGB, we want to contribute to methodologies for inferring the metabolic network of organisms in a bioreactor (and for proposing possible interventions). As a first step, we want to develop and solve toy problems that are close to real-world data.
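
As a rough illustration of the kind of toy problem we have in mind, below is a minimal sketch of a flux-balance-style linear program (steady-state stoichiometric constraints, maximizing a target flux). The network, bounds, and objective are made up for illustration and do not come from real data.

```python
# Minimal flux-balance-style toy problem (illustrative only; the network,
# bounds, and objective are made up and not taken from real data).
import numpy as np
from scipy.optimize import linprog

# Stoichiometric matrix S (rows: metabolites A, B; columns: reactions v1..v4)
#   v1: uptake -> A,  v2: A -> B,  v3: A -> B (alternative route),  v4: B -> product
S = np.array([
    [ 1, -1, -1,  0],   # metabolite A: produced by v1, consumed by v2 and v3
    [ 0,  1,  1, -1],   # metabolite B: produced by v2 and v3, consumed by v4
])

# Steady state: S @ v = 0; flux bounds per reaction
bounds = [(0, 10), (0, 5), (0, 3), (0, 10)]

# Maximize the product-forming flux v4 (linprog minimizes, so negate)
c = np.array([0, 0, 0, -1])

res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
print("optimal fluxes:", res.x)        # v2 and v3 saturate at 5 and 3
print("max product flux:", -res.fun)   # 8
```

Constraint-based strain optimization (see the related work below) then asks which interventions, e.g. knocking out a reaction by forcing its flux to zero, shift this optimum in a desirable direction.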

Challenges: formalizing the problem setting together with domain experts; proposing an algorithm; efficient implementation

Prerequisites: ideally, some background in probability theory and combinatorial optimization; potentially also deep learning

Co-Supervisor: Gerrit Großmann (with support of Jonathan Fabarius)

Related Work: Unsupervised Relational Inference Using Masked Reconstruction; Genome-scale reconstruction and system level investigation of the metabolic network of Methylobacterium extorquens AM1; In Silico Constraint-Based Strain Optimization Methods: the Quest for Optimal Cell Factories.


Learning in Deep Weight Spaces

Motivation:

Predicting properties of neural networks (both deep and shallow) from their weights is an intriguing new research direction with many unanswered questions. Our goal is to investigate which properties can be inferred from the weights of neural networks and to understand the characteristics of these weights. Key questions of interest include:

  • What does the manifold of deep weight spaces look like?
  • Which network architectures (and invariants) are useful for predicting properties?
  • What kinds of properties can be predicted?
  • How does training shape the manifold (e.g., can we identify phase transitions)?
  • Can we pinpoint modules dedicated to specific subtasks?
  • Is it possible to enhance training or alter aspects of a neural network’s behavior entirely through the use of a weight predictor network?
  • What is the relationship between activation and weight space?

Challenges: The main challenges involve identifying a suitable research question and task class, training neural networks for the chosen task, and then developing a secondary neural network for weight-space prediction. They also include the empirical examination of various facets of these predictions.
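
To make this pipeline concrete, here is a minimal sketch under purely illustrative assumptions: tiny MLPs trained on random regression tasks for varying numbers of steps, naively flattened weights as features, and the final training loss as the property to predict.

```python
# Minimal weight-space prediction sketch (all choices here -- task, sizes,
# property to predict -- are illustrative assumptions, not a fixed setup).
import torch
import torch.nn as nn

torch.manual_seed(0)

def train_small_mlp(steps=200, lr=1e-2):
    """Train a tiny MLP on a random regression task; return its flattened
    weights and its final training loss (the 'property' we try to predict)."""
    x = torch.randn(64, 4)
    y = x @ torch.randn(4, 1) + 0.1 * torch.randn(64, 1)
    net = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((net(x) - y) ** 2).mean()
        loss.backward()
        opt.step()
    flat = torch.cat([p.detach().flatten() for p in net.parameters()])
    return flat, loss.item()

# Build a dataset of (flattened weights, property) pairs
weights, props = zip(*[train_small_mlp(steps=s)
                       for s in torch.randint(10, 300, (200,)).tolist()])
W = torch.stack(weights)
y = torch.tensor(props).unsqueeze(1)

# Secondary network operating on weight space
predictor = nn.Sequential(nn.Linear(W.shape[1], 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)
for epoch in range(500):
    opt.zero_grad()
    loss = ((predictor(W) - y) ** 2).mean()
    loss.backward()
    opt.step()
print("weight-space predictor MSE:", loss.item())
```

Note that naively flattening the weights ignores the permutation symmetries of the hidden units; finding representations and architectures that respect these invariances is exactly one of the questions listed above.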

Prerequisites: A strong foundation in deep learning and an interest in mechanistic interpretability.

Co-Supervisor: Gerrit Großmann

Related Work:


Diffusion Models for Graph and Molecule Generation

Motivation: Diffusion models have shown great promise for generating discrete structures such as graphs and molecules, but many open problems remain (see our seminar’s paper list for more information). We offer several topics in this area, for instance:

  • latent space (or multi-resolution) graph generation
  • exploring novel types of forward (and the corresponding backward) processes and formalisms (e.g., based on CTMCs; a minimal sketch of such a forward process follows this list)
  • investigating different guiding algorithms for the forward and backward process (e.g., discriminative or rejection-based methods)
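
As a minimal sketch of what a forward process on graphs can look like, the following corrupts an adjacency matrix step by step toward a uniform edge distribution (the discrete-time analogue of a simple uniform-kernel CTMC). The noise schedule and example graph are placeholder assumptions, not a proposed method.

```python
# Minimal discrete forward (noising) process on an adjacency matrix:
# each entry is independently resampled from a uniform Bernoulli prior with
# probability beta_t per step. Schedule and example graph are illustrative.
import torch

def forward_step(adj, beta_t):
    """One noising step: with prob. beta_t an entry is resampled uniformly from {0, 1}."""
    resample = torch.rand_like(adj.float()) < beta_t
    noise = torch.randint(0, 2, adj.shape)
    noisy = torch.where(resample, noise, adj)
    # keep the matrix symmetric with an empty diagonal (simple undirected graphs)
    noisy = torch.triu(noisy, diagonal=1)
    return noisy + noisy.T

# Toy graph: a 6-node ring
n = 6
adj = torch.zeros(n, n, dtype=torch.long)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1

betas = torch.linspace(0.02, 0.3, steps=20)   # placeholder noise schedule
x_t = adj
for beta in betas:
    x_t = forward_step(x_t, beta.item())
print(x_t)  # after enough steps, close to an Erdos-Renyi-like sample with p = 0.5
```

A backward (denoising) model would then be trained to invert these steps; the topics above concern, for example, richer forward formalisms, latent or multi-resolution variants, and guidance of the backward process.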

Challenges: derivation of the corresponding equations; efficient implementation

Prerequisites: background in probability theory and deep learning, PyTorch

Co-Supervisor: Gerrit Großmann

Related Work: See the Deep Generative Diffusion Models Seminar website.


Complex System Interaction Inference Through Deep Symbolic Regression

Motivation: Inferring the underlying (i.e., hidden) interaction structure of complex systems from time-series data is a significant challenge. Recent deep learning techniques have made remarkable progress in identifying these interaction structures by jointly learning the structure and a prediction model from the time-series data. In this thesis, the student aims to improve this process by replacing the conventional prediction model with a symbolic regression layer.

Deep symbolic regression leverages discrete optimization to derive mathematical equations that accurately represent the given observations. By integrating a symbolic regression layer into the deep learning model, the proposed approach seeks to uncover more interpretable and generalizable representations of the hidden interaction structures within complex systems. This should not only increase the transparency of the learning process but also facilitate a deeper understanding of the underlying dynamics governing the system.
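
One way to read this proposal in code, under purely illustrative assumptions: a sigmoid-gated interaction matrix is learned jointly with a shared per-node prediction model on toy time series. The small MLP below is only a stand-in for the symbolic regression layer the thesis would put in its place (e.g., via a library such as PySR or a differentiable symbolic layer).

```python
# Joint inference of an interaction structure and a prediction model on toy
# time-series data. All modelling choices (sigmoid gating, a shared MLP as
# the predictor) are illustrative; in the thesis the MLP would be replaced
# by a symbolic regression layer.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_nodes, T = 5, 200

# Ground-truth hidden interaction structure and simple toy dynamics
A_true = (torch.rand(n_nodes, n_nodes) < 0.3).float()
A_true.fill_diagonal_(0)
x = [torch.randn(n_nodes)]
for t in range(T - 1):
    x.append(0.9 * x[-1] + 0.1 * A_true @ torch.tanh(x[-1]) + 0.01 * torch.randn(n_nodes))
x = torch.stack(x)                                          # shape (T, n_nodes)

# Learnable interaction logits and a shared predictor (stand-in for the symbolic layer)
logits = nn.Parameter(torch.zeros(n_nodes, n_nodes))
predictor = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))
opt = torch.optim.Adam([logits, *predictor.parameters()], lr=1e-2)

for epoch in range(300):
    A = torch.sigmoid(logits) * (1 - torch.eye(n_nodes))    # gated adjacency, no self-loops
    msg = A @ torch.tanh(x[:-1]).T                          # aggregated neighbour signal, (n_nodes, T-1)
    inp = torch.stack([x[:-1].T, msg], dim=-1)              # per node: own state and neighbour signal
    pred = predictor(inp).squeeze(-1)                       # predicted next state, (n_nodes, T-1)
    loss = ((pred - x[1:].T) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("recovered interactions:\n", (torch.sigmoid(logits) > 0.5).int())
print("ground truth:\n", A_true.int())
```

With this little data and no sparsity penalty on the gates, recovery of the true structure is not guaranteed; the sketch is only meant to show the moving parts that a symbolic regression layer would slot into.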

Challenges: exploring suitable ways of integrating symbolic regression into network inference methods; efficient implementation; evaluation

Prerequisites: background in probability theory and deep learning

Co-Supervisor: Gerrit Großmann

Related Work: See Unsupervised Relational Inference Using Masked Reconstruction; Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws.


Enhanced Techniques for Sub-Graph Detection to Boost Generation and Explainability

Motivation: Numerous graph types, such as molecular graphs, consist of recurring functional units. This thesis explores graph mining strategies to deduce either a sub-graph vocabulary or a graph transformation, aiming to enhance tasks related to molecular machine learning. This includes not only the interpretability and explainability of graph regression tasks but also the data efficiency of generative tasks.
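
As a tiny illustration of the sub-graph-vocabulary idea, the sketch below decomposes a handful of molecules into fragments using RDKit’s BRICS rules and counts how often each fragment recurs. The molecules are arbitrary examples, and BRICS is only one of many possible decomposition schemes.

```python
# Build a small fragment vocabulary by BRICS-decomposing a few molecules and
# counting recurring fragments. The molecules are arbitrary examples; BRICS is
# only one of many possible decomposition / sub-graph mining schemes.
from collections import Counter
from rdkit import Chem
from rdkit.Chem import BRICS

smiles = [
    "CC(=O)Oc1ccccc1C(=O)O",   # aspirin
    "CC(=O)Nc1ccc(O)cc1",      # paracetamol
    "c1ccccc1O",               # phenol
    "CCOC(=O)c1ccccc1",        # ethyl benzoate
]

vocab = Counter()
for smi in smiles:
    mol = Chem.MolFromSmiles(smi)
    fragments = BRICS.BRICSDecompose(mol)   # fragment SMILES with attachment points
    vocab.update(fragments)

for frag, count in vocab.most_common():
    print(f"{count:2d}  {frag}")
```

Such a vocabulary can then serve as building blocks for fragment-based generation or as candidate substructures when explaining a graph regression model.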

Challenges: The main challenge is developing efficient techniques (likely based on differentiable graph representations) for graph transformations and sub-graph mining, and integrating them into existing graph ML methodologies.

Prerequisites: background in GNNs and interest in molecular ML