Advanced Topics in Diffusion Modeling - From Theory to Implementation


Dr. Gerrit Großmann
Prof. Dr. Verena Wolf

For any issues regarding the seminar, please e-mail Gerrit Großmann and have [DeepDiffusion2023] (including the brackets) in the subject line.

In case you are interested in writing a thesis on this topic, please also contact Gerrit Großmann.


  • Please use the seminar assignment system to register.
  • Please register only if you are available at the time slot of the seminar.
  • If you want to take the seminar but were not selected by the assignment system, please apply for the waiting list by emailing us. (Currently, there is one open spot.)
  • The seminar takes place Fridays from 14:15 to 16:00 in room 1.06 (E1.1) (in-person).
  • The kick-off meeting takes place on Friday, October 27.
  • The seminar language is English.
  • The seminar earns you 7 ECTS.
  • The seminar is open to bachelor, master, and graduate students of computer science and related programs.
  • Depending on your study regulations, you may need to register in HISPOS/LSF.


Schedule

  • Oct 27: Kick-off
  • Oct 30: Milestone I - Send three topic preferences to Gerrit
  • Nov 3: No seminar
  • Nov 10: No seminar
  • Nov 17: Q&A session topics 1-6
  • Nov 24: Q&A session topics 7-12
  • Dec 1: No seminar
  • Dec 8: Milestone II - Submit slides + Q&A session topics 1-6
  • Dec 15: Presentation slot 1 (topics 1,2,4)
  • Dec 22: Q&A session topics 7-12 (can be moved)
  • Jan 5: Presentation slot 2 (topics 5,6,7)
  • Jan 12: Presentation slot 3 (topics 8,9)
  • Jan 19: No seminar
  • Jan 26: Q&A session topics 1-6
  • Feb 2: Q&A session topics 7-12
  • Feb 9: Milestone III - Submit and present tutorial notebook
  • Feb 23: Milestone IV - Submit reviews
  • Mar 23: Milestone V - Submit final tutorial notebook

Topic Overview

Diffusion models have recently transformed the field of generative deep learning. This seminar will explore this vibrant area of research.

This seminar is targeted at students who already have a background in deep learning (theoretical and practical) and are keen to dive deeper into probabilistic diffusion and related concepts such as stochastic processes and normalizing flows. Experience with diffusion models will be beneficial, but not required. While we do not aim to provide comprehensive coverage of the topic, we have selected certain papers that we find exceptionally engaging, fun, and thought-provoking.

The seminar is equally divided into two segments: a classical part, where students present a concept based on a research paper, and a practical part, where students develop a tutorial notebook inspired by the ideas from that paper.


Prerequisites

General background knowledge and practical experience in deep learning are strongly recommended. Experience with diffusion models will be helpful but is not necessary.


Requirements

To pass the seminar, you have to attend all sessions and:

  • give a presentation;
  • write a tutorial notebook;
  • review three tutorials;
  • update your tutorial notebook based on the reviews you received;
  • participate in discussions.

… with a passing grade. The final grade is based on your presentation (40%), tutorial notebook (50%), and reviews (10%). You will fail the whole seminar if you receive a failing grade in any of these three graded parts. If we are undecided, we will also take discussion participation into account.

Presentation (40%)

Identify the key ideas and concepts and give a self-consistent presentation explaining these concepts to your fellow students.

The presentation should be 15 to 20 minutes long.

Here are some suggestions for a good presentation (we will use this as a basis for grading the presentations):

  • The goal is to tell a (self-consistent and entertaining) story - not to convince us that you understand the paper.
  • Put time and effort into creating visualizations and preparing (running) examples.
  • Prioritize concreteness, simplicity, and clarity.
  • Focus on intuition, high-level understanding, and contextualization, not on technical details.
  • Don’t overcrowd your slides. Try to avoid full sentences and be cautious with bullet points.
  • Explore and include supplementary material where it seems useful (literature, YouTube, GitHub, Medium articles, OpenReview, etc.).
  • Use equations only when necessary; use color-coded equations to improve their readability (example).
  • Be critical of the authors’ claims; don’t fall for overselling.
  • Use slide numbers.
  • Submit your slides as .pdf.

Tutorial Notebook (50%)

Your task is to create a self-consistent tutorial within a Jupyter Notebook that explains the key concepts of the selected paper through code examples. The scope of this project may vary depending on the complexity of the paper: a complete re-implementation may be achievable, or it may be more appropriate to concentrate on a small toy problem focusing on a single concept. The notebook should be well-structured and contain text and figures that clarify the concepts, as well as well-documented code. For guidance and inspiration, you may refer to projects such as the Annotated Diffusion Model and the TeachOpenCADD platform (for instance, E(3)-invariant Graph Neural Networks). In addition, you can check out the Stanford CS224W Graph ML Tutorials.


  • The tutorial notebook should run on Google Colab and use PyTorch.
  • We will invite each student to explain their code in an individual meeting. You should be able to explain the code in detail and justify design choices.
  • Submit your notebook by uploading it to MS Teams using the naming scheme Topic03_Grossmann_GeometricLatentDiffusionModelsFor3DMoleculeGeneration.ipynb.
  • You can download external data (weights, training data) from within your notebook.
  • Plagiarism will get you expelled from the seminar and potentially exmatriculated. However, you can discuss solutions with your fellow students and take inspiration from open-source implementations.
  • You can also copy and paste small code snippets if they are clearly marked as such.
  • Comment your code extensively and give your functions, variables, and classes descriptive names.
  • Set seeds to make your code deterministic, and save the final weights for reproducibility.
  • Include the URL of the original paper, your name, and any external resources you use.
  • The notebooks will be made publicly available.
  • The presentation of your notebook should take about 5 minutes.
  • The deadline for submissions is 23:59 local time.
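Seeding and weight saving (two of the points above) can be sketched as follows in PyTorch. This is a minimal illustration, not a required convention: the function name `set_seed` and the file name `final_weights.pt` are placeholders you may rename freely.

```python
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Seed all RNGs that typical PyTorch notebooks touch."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds CPU and all CUDA devices
    # Prefer deterministic kernels; warn (rather than fail) if one is unavailable.
    torch.use_deterministic_algorithms(True, warn_only=True)


set_seed(42)
model = torch.nn.Linear(4, 2)  # stand-in for your actual model

# ... training loop ...

# Save the final weights so reviewers can reproduce your results
# without retraining, and show that they load back cleanly.
torch.save(model.state_dict(), "final_weights.pt")
model.load_state_dict(torch.load("final_weights.pt"))
```

Calling `set_seed` once at the top of the notebook is usually enough; re-seed immediately before sampling if you want individual cells to be re-runnable in isolation.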

Reviews (10%)

We will assign three tutorial notebooks to each student for review. The primary objective of the review is to offer suggestions for improvement and to identify potential mistakes or ambiguities, whether technical, conceptual, or grammatical. Each review should be approximately one to two pages long and must be sent to the author and Gerrit via email.


Topic Assignment

  Topic  Student                   Paper
  1      Salaheldin Y. A. Mohamed  Structured Denoising Diffusion Models in Discrete State-Spaces
  2      -                         Efficient and Degree-Guided Graph Generation via Discrete Diffusion Modeling
  3      -                         Geometric Latent Diffusion Models for 3D Molecule Generation
  4      Bartłomiej Pogodziński    Consistency Models
  5      Akansh Maurya             TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
  6      Davronbek Islamov         Training Diffusion Models with Reinforcement Learning
  7      Yasin Esfandiari          Graphically Structured Diffusion Models
  8      Soumava Paul              Generative Modelling with Inverse Heat Dissipation
  9      Monseej Purkayastha       Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
  10     -                         Equivariant flow matching
  11     -                         Generative Modeling with Optimal Transport Maps
  12     -                         Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport

The campus library will provide a semester reserve featuring the books Probabilistic Machine Learning: Advanced Topics and Deep Learning with PyTorch: Build, train, and tune neural networks using Python tools.

Otherwise, we recommend:

For the implementation, we also recommend: