Mila's Deep Learning Theory Group is a discussion group aimed at keeping up with the latest research and at collaborating and brainstorming to push the boundaries of the theoretical aspects of deep learning. We meet every other Monday at 1:00 PM EST. The guidelines and schedule are provided on this website, while polls and internal discussions are conducted in Mila's Slack channel.
Structure: Each meeting is managed by one or a few leaders. Based on the topic, the leader gives a ~10-20 minute presentation on the background, at either a high level or in detail.
The leader can present the background however they like: slides, notes, walking through a paper, or simply speaking. After the background presentation, there is a ~40 minute discussion. The leader should lead the discussion and encourage participants to engage.
The last ~5 minutes of every meeting are spent selecting the topic of the next meeting, based on the votes and interests of the participants (available in the Excel sheet or on the website).
Each session will also have a facilitator.
Note: The goal is for meetings to be stand-alone, so that someone who misses one meeting can still follow the next one without difficulty.
An archive of discussions from previous iterations of the reading group is provided below:
Date | Leader | Topic | Resources |
Jan 25 2017 | Jason Jo | Understanding deep learning requires rethinking generalization | Link |
Feb 8 2017 | Jason Jo | Train faster, generalize better: Stability of stochastic gradient descent | Link |
Mar 8 2017 | Jason Jo | Entropy-SGD | Link |
Apr 19 2017 | Joseph Cohen | Early Stopping Without a Validation Set | Link |
Nov 15 2017 | Brady Neal | Generalization in Deep Learning | Link |
Nov 22 2017 | Anirudh Goyal | Information Bottleneck | Link |
Nov 29 2017 | Sherjil Ozair | Generalization in GANs Slides | Link |
Dec 13 2017 | Aristide Baratin | PAC-Bayes Generalization Slides | Link |
Jan 15 2018 | Mike Pieper | Landscape of the Empirical Risk in Deep Learning | Link |
Jan 22 2018 | Ahmed Touati | A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds | Link |
Jan 29 2018 | Ahmed Touati | Exploring Generalization in Deep Learning | Link |
Feb 5 2018 | Jean Michel Sellier | Why does deep and cheap learning work so well? | Link |
Feb 12 2018 | Brady Neal | Deep Learning without Poor Local Minima | Link |
Feb 19 2018 | Rémi Le Priol | Concentration Inequalities Tutorial | Link |
Feb 26 2018 | Salem Lahlou | PAC-Bayes Tutorial | Link |
Mar 12 2018 | Gabriel Huang | No Free Lunch Theorem Tutorial | Link |
Mar 19 2018 | Vidhi Jain | SGD Learns Networks that Provably Generalize on Linearly Separable Data | Link |
Mar 26 2018 | Matthew Scicluna | Revisit Understanding deep learning requires rethinking generalization | Link |
Apr 2 2018 | Ishmael Belghazi | MINE: Mutual Information Neural Estimation | Link |
Apr 23 2018 | Vincent Gripon | Matching Convolutional Neural Networks without Priors about Data | Link |
May 7 2018 | Nicolas Gagné | Opening the Black Box of Deep Neural Networks via Information | Link |
May 14 2018 | Gaetan Marceau Caron | Do Deep Learning Models Have Too Many Parameters? | Link |
Aug 13 2018 | Ari Benjamin | Measuring and regularizing networks in function space | Link |
Aug 20 2018 | Jennifer She | Implicit Acceleration by Overparameterization | Link |
Aug 27 2018 | Mohammad Pezeshki | Dynamics of Learning and Inference in Neural Networks | Link |
Sep 10 2018 | Brady Neal | What is deep learning theory and why do we care? | Link |
Sep 17 2018 | Rémi Le Priol | Empirical Analysis of the Hessian of Over-Parametrized Neural Networks | Link |
Sep 24 2018 | Vikram Voleti | Visualizing the Loss Landscape of Neural Nets | Link |
Oct 15 2018 | Brady Neal | Measuring the Intrinsic Dimension of Objective Landscapes | Link |
Oct 22 2018 | Xavier Bouthillier | Understanding the Role of Over-Parametrization in Generalization | Link |
Oct 29 2018 | César Laurent | Natural Gradient Tutorial | Link |
Nov 5 2018 | Isabela Albuquerque | Data-Dependent Stability of Stochastic Gradient Descent | Link |
Nov 12 2018 | Rémi Le Priol | The Mechanics of n-Player Differentiable Games | Link |
Nov 19 2018 | Reyhane Askari | A Lyapunov Analysis of Momentum Methods in Optimization | Link |
Nov 26 2018 | Levent Sagun | Over-parametrization in neural networks: observations and a definition | Link |
Jan 29 2019 | Pablo Piantanida | Introduction to Information Theory - Part 1 | Link |
Feb 5 2019 | Pablo Piantanida | Introduction to Information Theory - Part 2 | Link |
Feb 12 2019 | Brady Neal | Discussion on bias-variance trade-off | Link |
Feb 19 2019 | Sharan Vaswani | Train faster, generalize better: Stability of stochastic gradient descent | Link |
Feb 26 2019 | Gauthier Gidel | Implicit Regularization of Gradient Dynamics in Linear Neural Networks | Link |
Please enter your name and the topics you are interested in into the spreadsheet provided in the Slack channel. The topics of upcoming meetings will be selected from the list below. The list is kept in sync with the spreadsheet.
Suggested by | Topics you know about | Topics you would like to learn about |
Please join #deep-learning-theory on Mila's Slack for suggestions or to discuss other matters. The previous website and materials are accessible on Mila's internal site. This website and the group are currently organized by Adam Ibrahim, Reyhane Askari, and Mohammad Pezeshki.