UVA Deep Learning II Course

UvA master's programme in Artificial Intelligence.



Deep Learning II is taught in the MSc programme in Artificial Intelligence of the University of Amsterdam. In this course we study the theory of deep learning, namely of modern, multi-layered neural networks trained on big data. The course is coordinated by Efstratios Gavves, Erik Bekkers, Wilker Aziz Ferreira and Christos Athanasiadis.

The Teaching Assistants (TAs) are: Stefanos Achlatis, Alejandro Garcia, Metod Jazbec, Cong Liu, Yongtuo Liu, Philipp Tremuel, Haochen Wang, Andrii Zadaianchuk, Max Zhdanov



Erik Bekkers

This module covers the topic of geometric deep learning, touching upon all five of its G's (Grids, Groups, Graphs, Geodesics, and Gauges) but with a strong focus on group equivariant deep learning. The impact that CNNs made in fields such as computer vision, computational chemistry and physics can largely be attributed to the fact that convolutions allow for weight sharing, geometric stability, and a dramatic decrease in learnable parameters by leveraging symmetries in data and architecture design. These enabling properties arise from the equivariance property of convolutions. In this module you will learn how to equip neural networks with equivariance properties. The module is split into 4 lectures with accompanying tutorials:

  • Lecture 1: Regular group convolutional neural networks (G-CNNs). In this lecture we cover the basics of group convolutional NNs and show how to leverage symmetries in data and practical problems.
  • Lecture 2: Steerable G-CNNs. In this lecture we introduce a very general class of G-CNNs that can handle (rotational) symmetries in a flexible and powerful way. These methods are at the core of the most successful approaches to handling 3D data such as atomic point clouds, and they also underlie gauge equivariant methods that are applicable to arbitrary Riemannian manifolds.
  • Lecture 3: Equivariant graph NNs. Many problems in computational chemistry and computational physics are nowadays solved via graph NNs. The state of the art (SotA) in these domains derives its effectiveness from the geometric structure and symmetries presented by the data and underlying physics. In this lecture we cover tools for SE(3) equivariance in the context of state-of-the-art geometric graph NNs.
  • Lecture 4: Recap and/or, if time allows, further exploration of topics covered in this module (e.g. equivariant transformers, geometric latent spaces, …).

All the lectures can be found here: https://uvagedl.github.io/
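
The equivariance property of convolutions mentioned in the module description can be checked numerically. The following is a minimal sketch, not course material: it assumes a 1D signal with circular (wrap-around) boundaries, and the helper names `circular_correlate` and `shift` as well as the example values are made up for illustration. It shows that shifting the input and then applying the filter gives the same result as applying the filter and then shifting the output.

```python
def circular_correlate(signal, kernel):
    """Circular cross-correlation of a 1D signal with a kernel."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, t):
    """Cyclically shift a signal by t positions to the right."""
    n = len(signal)
    return [signal[(i - t) % n] for i in range(n)]


# Illustrative values only.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]
kernel = [0.5, -1.0, 0.25]

# Translation equivariance: filter(shift(x)) == shift(filter(x)).
shift_then_filter = circular_correlate(shift(signal, 2), kernel)
filter_then_shift = shift(circular_correlate(signal, kernel), 2)

is_equivariant = all(
    abs(a - b) < 1e-12 for a, b in zip(shift_then_filter, filter_then_shift)
)
print(is_equivariant)
```

Group convolutions generalise this idea from translations to other transformation groups, such as rotations.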


No documents.

Lecture recordings:

No recordings.

Wilker Aziz Ferreira

Many (if not most) advanced DL models are probabilistic models (or at the very least key aspects of their design and training are given probabilistic treatment). The focus of this module is to learn to prescribe probability distributions over complex sample spaces (discrete, continuous, structured), parameterise these distributions using NNs, and estimate model parameters to maximise (bounds on) likelihood via gradient descent. The goal is to get students to expand their toolbox, to see modelling ideas and estimation algorithms as modules they can compose (i.e., VI is not exclusive to VAEs, VAEs are not necessarily built upon Gaussians, autoregressive models are not exclusive to one data type or another, reparameterisation is a general tool, MLE is a general tool, etc.). We cover two main classes of models, depending on whether a key function (the likelihood function) can be assessed tractably given a set of observations and a parameter vector.

TL;DR In this module you learn to view data as a byproduct of probabilistic experiments. You will parameterise joint probability distributions over observed random variables, however complex/structured they may be, and perform parameter estimation by regularised gradient-based maximum likelihood estimation.
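
As a rough illustration of gradient-based maximum likelihood estimation, here is a minimal sketch under strong simplifying assumptions: the model is a single Bernoulli distribution, its one parameter is a logit that stands in for the output of a neural network, and no regulariser is used. The data and the function names (`mean_nll`, `fit_bernoulli`) are hypothetical and chosen only for this example.

```python
import math


def sigmoid(z):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def mean_nll(logit, data):
    """Average negative log-likelihood of binary data under Bernoulli(sigmoid(logit))."""
    p = sigmoid(logit)
    return -sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in data) / len(data)


def fit_bernoulli(data, learning_rate=0.5, num_steps=2000):
    """Minimise the mean NLL with plain gradient descent on the logit."""
    logit = 0.0
    mean_x = sum(data) / len(data)
    for _ in range(num_steps):
        # Gradient of the mean NLL with respect to the logit.
        grad = sigmoid(logit) - mean_x
        logit -= learning_rate * grad
    return logit


# Illustrative binary observations.
data = [1, 1, 0, 1, 0, 1, 1, 1]
logit_hat = fit_bernoulli(data)
p_hat = sigmoid(logit_hat)
print(round(p_hat, 4))
```

For this toy model the maximum likelihood estimate has a closed form (the sample mean), so the fitted probability can be compared against it. With richer models the same recipe applies, but the gradient is typically obtained by automatic differentiation.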

Relationship to other modules:

  • Advanced generative models are (rather special) instances of probabilistic models. This module gives you some background knowledge that can ease your way into advanced generative models such as normalising flows, energy-based models and diffusion processes.
  • Certain advanced probabilistic models (e.g., latent variable models) require techniques to approximate intractable computations in a principled manner. Those techniques are discussed in the amortised variational inference module. Because amortised VI concerns probabilistic models, this module can be thought of as background to it.
  • Bayesian models are also probabilistic, but you don't necessarily need the content of this module to understand Bayesian deep learning (it does help, but you can live without it).


Lecture recordings:

No recordings.

Efstratios Gavves

In this module we will study the interface and overlap between neural networks, dynamical systems, ordinary/partial/stochastic differential equations, and physics-based neural networks. We will study how and where dynamical systems can be found in neural networks with implicit functions and neural ODEs. We will also see how neural networks can be used to model dynamical systems like Navier-Stokes with physics-informed neural networks, as well as with Fourier-inspired architectures and autoregressive neural networks.
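
One way to see the link between neural networks and dynamical systems is that an explicit Euler step for an ODE has the same form as a residual block. The sketch below is only an illustration under stated assumptions: `vector_field` is a hand-written stand-in for a learned network (here simply dh/dt = -h, so the exact solution is known), and `euler_integrate` is a hypothetical helper, not an API from any library.

```python
import math


def vector_field(h, t):
    """Stand-in for a learned network f_theta(h, t); here dh/dt = -h."""
    return -h


def euler_integrate(f, h0, t0, t1, num_steps):
    """Integrate dh/dt = f(h, t) from t0 to t1 with explicit Euler steps."""
    h, t = h0, t0
    dt = (t1 - t0) / num_steps
    for _ in range(num_steps):
        # Same form as a residual block: h_next = h + dt * f(h, t).
        h = h + dt * f(h, t)
        t += dt
    return h


approx = euler_integrate(vector_field, h0=1.0, t0=0.0, t1=1.0, num_steps=1000)
exact = math.exp(-1.0)  # closed-form solution of dh/dt = -h at t = 1
print(abs(approx - exact))
```

In a neural ODE the vector field would be a trainable network and the fixed-step Euler loop would usually be replaced by an adaptive solver. The printed error shrinks as `num_steps` grows.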


No documents.

Lecture recordings:

No recordings.








If you are interested in older versions of the lectures, you can find them below.


Contact us!

If you have any questions or recommendations for the website or the course, you can always drop us a line! Knowledge should be free, so feel free to use any of the material provided here (but please be so kind as to cite us). If you are a course instructor and you want the solutions, please send us an email.