Student Seminar Series – Spring 2015

Columbia University Statistics Student Seminar

Schedule for Spring 2015

Seminars are on Wednesdays
Time: 12:00pm – 1:00pm
Location: Room 903 SSW, 1255 Amsterdam Avenue

If you are one of our great speakers, please click here for speaker information.

Our aim is to give students, through the speakers' talks, opportunities to explore potential research directions in a relatively casual environment.

This year (2014-2015) the seminars are organized by Yunxiao Chen and Yuanjun Gao.

Previous schedules: Fall 2014

Dr. David Pettey (SIG)

“The Wholesaler Marketplace: Handling Retail Order Flow”

SIG recently (2014) entered the Wholesaler equity business. This presentation will show in part how we tried to independently piece together what this business was about, and how we could get rough independent estimates of the potential profitability of competitors. One approach we used was to restrict ourselves to publicly available data, see how much we could learn, and see what bounds we could reasonably put on the profitability. The focus of the talk will be how you can independently use publicly available data to understand certain lesser-known segments of the marketplace.


Prof. Guang Cheng (Purdue University)

“Nonparametric Bernstein-Von Mises Phenomenon: A Tuning Prior Perspective”

In this talk, we investigate statistical inference on infinite-dimensional parameters in a Bayesian framework. The main contribution is to demonstrate that a nonparametric Bernstein-von Mises theorem can be established in a very general class of nonparametric regression models under a novel tuning prior (indexed by a non-random hyperparameter). Surprisingly, this type of prior connects two important classes of statistical methods: nonparametric Bayes and smoothing splines. The intrinsic connection with smoothing splines greatly facilitates both theoretical analysis and practical implementation for nonparametric Bayesian inference. For example, we can employ generalized cross validation to select a proper tuning prior, under which the constructed credible regions/intervals are frequentist valid.

2/04/2015 Prof. Tony Jebara (Columbia)
2/11/2015 Heng Yang (CUNY)
2/18/2015 Prof. Christopher Rothe (Columbia)
2/25/2015 Dr. Daniel Soudry (Columbia)

Prof. David Blei (Columbia)

“Scaling and Generalizing Variational Inference”

Latent variable models have become a key tool for the modern statistician, letting us express complex assumptions about the hidden structures that underlie our data. Latent variable models have been successfully applied in numerous fields including natural language processing, computer vision, population genetics, and many others.

The central computational problem in latent variable modeling is posterior inference, the problem of approximating the conditional distribution of the latent variables given the observations. Inference is essential to both exploratory and predictive tasks. Modern inference algorithms have revolutionized Bayesian statistics, revealing its potential as a usable and general-purpose language for data analysis.

Bayesian statistics, however, has not yet reached this potential. First, statisticians and scientists regularly encounter massive data sets, but existing algorithms do not scale well. Second, most approximate inference algorithms are not generic; each must be adapted to the specific model at hand. This requires significant model-specific analysis, which precludes us from easily exploring a variety of models.

In this talk I will discuss our recent research on addressing these two limitations. First I will describe stochastic variational inference, an approximate inference algorithm for handling massive data sets. Stochastic inference is easily applied to a large class of Bayesian models, including topic models, time-series models, factor models, and Bayesian nonparametric models. Then I will discuss black box variational inference, a generic algorithm for approximating the posterior. We can use black box inference on many models with little model-specific derivation. Together, these algorithms make Bayesian statistics a flexible and practical tool for modern data analysis.

This is joint work based on these two papers:

M. Hoffman, D. Blei, J. Paisley, and C. Wang. Stochastic variational inference. Journal of Machine Learning Research, 14:1303-1347, 2013.

R. Ranganath, S. Gerrish, and D. Blei. Black box variational inference. Artificial Intelligence and Statistics, 2014.

3/11/2015 Ju Sun (Columbia)
4/01/2015 Prof. Peter F. Halpin (NYU)
4/08/2015 Dr. Alekh Agarwal (Microsoft Research)
4/15/2015 Dr. Lars Buesing (Columbia)