Deep Learning (Spring 2023)


Project (Latest Instructions Update: Feb. 17, 2023)

The deadline for submission has been changed to May 4th.

Please send your

  • pdf

  • colab link with experiments/code

to (if you use a different email alias, your assignment may risk ending up “lost” in my inbox).

Choose one topic from those presented below. If you wish to explore a different direction, send me a proposal by email.

Score-Based Generative Modeling

We have introduced (or plan to introduce) generative models in the lectures, notably GANs. In this project, you will study a recent, related, yet different approach to synthesizing new datapoints. This approach does not rely on a low-dimensional set of latent vectors, but instead generates new points in a given space starting from noise sampled in that same space. The method was described in a recent ICLR’21 paper and summarized in the following blog post. In this report, I ask you to summarize these findings and apply the method to generate points from a different dataset than those considered in the paper. While you can reuse parts of their code, I expect you to write original functions to test these methods on the dataset of your choice, and to explain how you tuned the hyperparameters.
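To fix ideas, the sampling side of this approach boils down to Langevin dynamics driven by the score function ∇ₓ log p(x). The sketch below is not the paper's method (which learns the score with a noise-conditional network and anneals the noise level); it substitutes an analytic Gaussian score so the sampler can be checked in isolation. The distribution parameters and step sizes are illustrative choices.

```python
import numpy as np

def langevin_sample(score_fn, x0, step=0.001, n_steps=2000, seed=None):
    """Unadjusted Langevin dynamics:
    x <- x + step * score(x) + sqrt(2 * step) * gaussian noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * score_fn(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x

# Analytic score of N(mu, sigma^2 I): grad_x log p(x) = -(x - mu) / sigma^2.
# In the actual method this would be a trained score network.
mu, sigma = np.array([2.0, -1.0]), 0.5
score = lambda x: -(x - mu) / sigma**2

init_rng = np.random.default_rng(0)
samples = np.stack([
    langevin_sample(score, init_rng.standard_normal(2), seed=i)
    for i in range(200)
])
print(samples.mean(axis=0))  # should land close to mu
```

Starting from standard-normal noise, the chains drift toward the target density; the empirical mean of the samples approaches mu. Replacing the analytic score with a learned one (and annealing the noise scale) recovers the setting of the paper.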

Automatic differentiation of optimization solutions

A common criticism formulated against deep learning is the lack of interpretability of “black-box” networks. The alternative often proposed is that of more classical statistical procedures, which typically rely on specifying a simpler model (e.g. linear) and training it with convex solvers (e.g. least-squares, lasso).

However, an important advantage of deep learning over training simple models with convex losses is that deep networks can be used end-to-end: all intermediate representations of the data are learned in a single data-processing pass, as opposed to fragmenting the ML pipeline into a sequence of well-posed but disconnected steps (e.g. run k-means, then PCA on those clusters, then linear regression).

There has been an important research effort in recent years to differentiate through the output of these “well-posed” convex solvers: by applying the implicit function theorem around solutions of convex programs, one can obtain Jacobians for the lasso, quadratic programs, conic programs and disciplined convex problems. For discrete problems, SAT solvers, ranking problems and more generic mixed-integer programs (also here) have been handled using various flavors of relaxations and smoothing.
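The implicit-differentiation idea can be illustrated on the simplest case, ridge regression, where everything is available in closed form. Below is a minimal sketch (not taken from any of the cited papers; the matrix sizes and regularization value are illustrative): the optimality condition F(x, θ) = (AᵀA + θI)x − Aᵀb = 0 is differentiated implicitly to get dx*/dθ, then checked against finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

def solve_ridge(theta):
    """argmin_x 0.5*||Ax - b||^2 + 0.5*theta*||x||^2  (closed-form convex program)."""
    H = A.T @ A + theta * np.eye(A.shape[1])
    return np.linalg.solve(H, A.T @ b)

def ridge_solution_jacobian(theta):
    """Implicit function theorem on F(x, theta) = (A^T A + theta*I) x - A^T b = 0:
       dx/dtheta = -(dF/dx)^{-1} (dF/dtheta) = -(A^T A + theta*I)^{-1} x*."""
    x_star = solve_ridge(theta)
    H = A.T @ A + theta * np.eye(A.shape[1])
    return -np.linalg.solve(H, x_star)

theta, eps = 0.3, 1e-6
analytic = ridge_solution_jacobian(theta)
numeric = (solve_ridge(theta + eps) - solve_ridge(theta - eps)) / (2 * eps)
print(np.max(np.abs(analytic - numeric)))  # small discrepancy
```

The same recipe — write down the optimality conditions of the solver, then differentiate them implicitly at the solution — is what the papers above extend to the lasso, conic programs and beyond, where the conditions are no longer a single linear system.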

In this assignment, I ask you to consider two of the papers described above, summarize them, and propose experimental results on a different dataset.