Date | Topic | Reading | Notes |
---|---|---|---|
Jan 17 | Introduction; linesearch | Ch. 1-2 of Givens+Hoeting | See Sun and Yuan (2006) for further details on convergence analysis. Berland's notes on automatic differentiation. Mahsereci and Hennig (2016) on Bayesian linesearch. HW: derive the convergence rate of the secant method. |
Jan 24 | Choosing search directions: Newton, generalized linear models, inexact Newton, quasi-Newton, Fisher scoring, BFGS. Exploiting special structure to solve Newton linear equations more efficiently: banded, sparse, low-rank, block-structured (etc.) matrices | See Vandenberghe's notes for some further background | |
Jan 31, Feb 7 | Conjugate gradients. Preconditioning. Toeplitz, circulant, and Kronecker matrices. Application: Gaussian processes | Shewchuk (1994); see Chan and Ng (1996) on PCG for Toeplitz systems. | See Rasmussen and Williams (2006) for more background on GP regression. Also notes by John Cunningham. Rahimi+Recht '07 and Fastfood on random features; Drineas + Mahoney '16 on randomized linear algebra. HW: code up a GP regression. |
Feb 14 | Constrained and non-smooth optimization: convex functions; interior point methods. Convex duality and KKT conditions. Linear, quadratic, and semidefinite programs. | Boyd and Vandenberghe, ch. 3-5 | |
Feb 21 | No class | ||
Feb 28 | LASSO methods. Some advanced topics: proximal methods, dual decomposition, convex relaxation | Efron et al (2004), Zou et al (2007), Friedman et al (2010), Bradley et al (2011), Tibshirani et al (2012) | More reading: Bach et al (2011), Boyd et al (2011) |
Mar 7, 21 | Graphical models; dynamic programming; message passing; LP relaxations | Rabiner tutorial, Wainwright lecture notes | Background: Wainwright and Jordan (2008), MP and AMP notes by A. Maleki, LP relaxation notes by Y.-W. Teh |
Mar 14 | No class: spring break | Send me project ideas for discussion by Mar 21. | |
Mar 21, 28, Apr 4 | Monte Carlo. Rejection sampling; importance sampling; control variates. Metropolis-Hastings; Gibbs sampling. Hamiltonian Monte Carlo; Bouncy particle sampler. MCMC diagnostics. Rao-Blackwellization. Adaptive simulated tempering. | Ch. 1-7 of Robert and Casella; Neal (2010) | Background: Devroye (1986); Hoffman and Gelman (2012), Pakman and Paninski (2013), Bouchard-Cote et al (2015+), Park and Casella (2008), Neal (2003), Doucet (2010), Dellaportas and Kontoyiannis (2012), Carlson et al (2016), D. Wilkinson's blog on "exact approximate" MCMC. Also, see this nice video. |
Apr 11 | Sequential Monte Carlo | Doucet and Johansen (2011), Pitt and Shephard (1999) | Further reading collected by A. Doucet; Kantas et al (2014) |
Apr 11 | Expectation maximization and variational inference | Dempster et al (1977), Neal and Hinton (1999), Blei et al (2016) | |
Apr 18 | Stochastic gradient methods | Duchi et al (2011), Lacoste-Julien et al (2012), Ranganath et al (2014), Kingma and Welling (2014), Dubey et al (2016), Emtiyaz Khan and Lin (2017) | |
Apr 25 | Project presentations | Send me your report as a .pdf by May 9. |