[Post in progress]
This semester I’m taking a new topics course in Deep Learning given by Joan Bruna. He has a public webiste for the course on github that includes course slides. Joan adds a lot of intuition and details on top of what is written in the slides, so I plan on blogging/taking notes about the course as a way to internalize the material.
An overview of what he covers (also found in lecture one lecture slides): 1. Mathematical models of deep convolutional networks. - Supervised and unsupervised learning using deep models. - Applications to computer vision, speech and time series. - Relationships between Deep Learning and “classic” models. - Open mathematical/statistical questions.
Deep Learning Models: “A class of parametrized non-linear representations encoding appropriate domain knowledge (invariance and stationarity) that can be (massively) optimized efficiently using stochastic gradient descent.”
Classification, Kernals and Metrics
High-dim Recognition Setup
Our input data \(x\) for the neural network lives in a high-dimensional space, sometimes even infinite dimensional space.
In the last case think of images in 1,2 or 3 dimensions as functions of colour on these spaces. The \(L^2\) assumption is reasonable given our limited colour range.
Our observations are in the form \((x_i, y_i), i = 1, …, n\) where the \(y_i\)’s are “response” variables.
##Review: Separable Scattering Operators