Schedule for Fall 2022
Seminars are on Wednesdays
Time: 12:00 – 1:00pm
Location: Room 903, 1255 Amsterdam Avenue
Zoom Link: columbiauniversity.zoom.us/j/
Meeting ID: 991 5989 3951
Passcode: 284676
Contacts: Jaesung Son, Luhuan Wu
Information for speakers: For information about the schedule, directions, equipment, reimbursement, and hotels, please click here.
9/14/22
Director of Graduate Studies, Professor John Cunningham
Our new DGS, Prof. John Cunningham, will share a few words with us and then we can introduce ourselves.
9/21/22
Fellow students Nick Galbraith, Zhen Huang, and Jialin Ouyang will share their summer internship experiences.
9/28/22
Nikolaos Ignatiadis (Post-Doc, Columbia Stats)
Title: Covariate-Powered Empirical Bayes Estimation
10/5/22
Professor Wayne Lee (Columbia Stats)
“Thoughts on the intersection between applied statistics and humanities”
10/12/22
Julia (Two Sigma)
Two Sigma is a financial sciences company. Our community of scientists, technologists, and academics looks beyond traditional finance to understand the bigger picture and develop creative solutions to some of the world’s most challenging economic problems. We rely on the scientific method, rooted in hypothesis, analysis, and experimentation, to drive data-driven decisions, manage risk, and expand into new areas of focus. In this way, we create systematic tools and technologies to forecast the future of global markets.
Our presenting Quant Researchers:
Ding completed her PhD in Statistics at Columbia University in 2021 and currently works as a quantitative researcher at Two Sigma. Her day-to-day work includes building technical alpha models for equities. She loves statistics, finance, Two Sigma, New York City, and life.
Yuting graduated from Columbia in 2016 with a PhD in Statistics and currently works as a modeler at Two Sigma. Her day-to-day work includes predictive modeling and machine learning research.
10/19/22
Long Zhao, Ari Blau, Jitong Qi (Columbia Stats)
“Summer Intern Workshop”
10/26/22
Professor Marco Avella (Columbia Stats)
Noisy convex optimization for differentially private inference with M-estimators
We propose a general optimization-based framework for computing differentially private M-estimators and a new method for the construction of differentially private confidence regions. First, we show that bounded-influence M-estimators can naturally be used in conjunction with noisy gradient descent and noisy Newton methods in order to obtain optimal private estimators with global linear or quadratic convergence, respectively. We establish finite sample global convergence guarantees, under both local strong convexity and self-concordance, showing that our private estimators converge with high probability to an optimal neighborhood of the non-private M-estimators. We then tackle the problem of parametric inference by constructing differentially private estimators of the asymptotic variance of our private M-estimators. Finally, we discuss ongoing work that explores the potential practical and theoretical benefits of a noisy sketched Newton algorithm.
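To make the noisy gradient descent primitive concrete, here is a minimal Python sketch under stated assumptions: a least-squares gradient with per-sample clipping as a bounded-influence surrogate, and naive (eps, delta) composition across iterations. The function name and all parameters are illustrative, not the paper's method, and the paper's privacy accounting is tighter.

```python
import numpy as np

def dp_m_estimator(X, y, T=200, eta=0.5, eps=1.0, delta=1e-6, c=1.0, seed=0):
    """Noisy gradient descent for a differentially private M-estimator.

    Illustrative sketch only: least-squares gradients with per-sample
    clipping (a bounded-influence surrogate), plus Gaussian noise
    calibrated by naive composition over T steps.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    # Clipped per-sample gradients give the averaged gradient sensitivity
    # 2c/n; naive composition splits the (eps, delta) budget over T steps.
    sigma = (2 * c / n) * np.sqrt(2 * np.log(1.25 * T / delta)) * T / eps
    for _ in range(T):
        resid = y - X @ theta
        grads = -resid[:, None] * X                    # per-sample gradients
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads *= np.minimum(1.0, c / np.maximum(norms, 1e-12))  # clip to norm c
        theta -= eta * (grads.mean(axis=0) + sigma * rng.normal(size=d))
    return theta

# Placeholder usage with synthetic data:
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 5))
y = X @ np.ones(5) + rng.normal(size=2000)
print(dp_m_estimator(X, y))
```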
11/2/22
Xuming He (University of Michigan)
11/9/22
Charles Margossian (Flatiron Institute)
Title: Markov chain Monte Carlo using modern hardware
Abstract: The current trend in hardware development is to support processors that can run a large number of operations in parallel. How well is MCMC positioned to take advantage of massive parallelization? Over the past two years, several algorithms have been developed to run many chains in parallel on GPUs. But our MCMC workflow (i.e., the length of the burn-in/warmup and sampling phases, convergence diagnostics, tuning parameters) is still rooted in the tradition of running one, or maybe a few, long Markov chains. The same can be said of our theory, where asymptotic analyses often focus on infinitely long, and therefore stationary, chains. The many-chains regime suggests taking limits in another direction: an infinite number of finite, non-stationary chains. This perspective paves the way to developing a principled MCMC workflow and raises persistent questions: how many chains should we run? How long should the warmup and sampling phases be? How should we initialize the chains?
This work is partly based on a preprint: https://arxiv.org/
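As a toy illustration of the many-chains regime described above, the Python sketch below advances thousands of short random-walk Metropolis chains in lock-step via vectorized array operations, the same pattern GPU implementations exploit. The target, step size, chain count, and initializations are placeholders, and the function name is hypothetical.

```python
import numpy as np

def rw_metropolis_many(logpdf, x0, n_steps=100, step=0.5, seed=0):
    """Run random-walk Metropolis on many chains simultaneously.

    Every chain is advanced together through vectorized array
    operations; on a GPU, all chains would move in parallel.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()                                   # shape (n_chains, dim)
    for _ in range(n_steps):
        prop = x + step * rng.normal(size=x.shape)  # propose for all chains
        log_ratio = logpdf(prop) - logpdf(x)        # one value per chain
        accept = np.log(rng.uniform(size=len(x))) < log_ratio
        x[accept] = prop[accept]
    return x

# Placeholder target and overdispersed initializations:
logpdf = lambda x: -0.5 * (x ** 2).sum(axis=1)      # standard normal
x0 = 5.0 * np.random.default_rng(1).normal(size=(4096, 2))
draws = rw_metropolis_many(logpdf, x0)
# Averaging across chains at a fixed iteration estimates expectations
# once the chains have forgotten their initializations.
print(draws.mean(axis=0))
```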
11/16/22
Arnab Auddy (Columbia Stats)
Title: Statistical Benefits and Computational Challenges of Tensor Spectral Learning
Abstract: As we observe progressively more complex data, it becomes necessary to model higher-order interactions among the observed variables. Orthogonally decomposable tensors provide a unified framework for many such problems, whereby tensor spectral estimators become a natural choice to learn the latent factors of the model. While this is a natural extension of matrix SVD, tensor-based estimators automatically provide much better identifiability and estimability properties. In addition to these attractive statistical properties, the methods present us with intriguing computational considerations. In the second part of the talk, I will illustrate these phenomena in the particular application of Independent Component Analysis (ICA). Interestingly, there is a gap between the information-theoretic and computationally tractable limits of the problem. Additionally, we provide noise-robust algorithms based on spectral truncation, which yield rate-optimal estimators for the mixing matrix of ICA. Our estimators are also asymptotically normal, thus allowing confidence interval construction.
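To make "tensor spectral estimators" concrete, here is a minimal Python sketch of the classical power iteration for an exactly orthogonally decomposable symmetric third-order tensor; deflation, random restarts, and the noise-robust spectral truncation discussed in the talk are not implemented here.

```python
import numpy as np

def tensor_power_iteration(T, n_iter=100, seed=0):
    """Estimate one component of an orthogonally decomposable
    symmetric order-3 tensor by power iteration (sketch only)."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        v = np.einsum('ijk,j,k->i', T, v, v)        # multilinear map T(I, v, v)
        v /= np.linalg.norm(v)
    lam = np.einsum('ijk,i,j,k->', T, v, v, v)      # associated eigenvalue
    return lam, v

# Placeholder example: T = sum_k a_k (x) a_k (x) a_k with orthonormal a_k.
A = np.linalg.qr(np.random.default_rng(1).normal(size=(5, 5)))[0][:, :3]
T = np.einsum('ik,jk,lk->ijl', A, A, A)
lam, v = tensor_power_iteration(T)
print(lam, np.abs(A.T @ v).max())   # lam ~ 1 and v aligns with some a_k
```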
11/23/22
No Seminar
11/30/22
Zoraida Rico (Post-Doc, Columbia Stats)
Title: On optimal covariance matrix estimation.
Abstract: We present an estimator of the covariance matrix of a random d-dimensional vector from an i.i.d. finite sample. Our sole assumption is that this vector satisfies a bounded Lp-L2 moment assumption over its one-dimensional marginals, for p greater than or equal to 4. Given this, we show that the covariance can be estimated from the sample with the same high-probability error rates that the sample covariance matrix achieves in the case of Gaussian data. This holds even though we allow for very general distributions that may not have moments of order greater than p. Moreover, our estimator is optimally robust to adversarial contamination. This result improves upon recent works by Mendelson and Zhivotovskiy and by Catoni and Giulini, and matches parallel work by Abdalla and Zhivotovskiy. This talk is based on joint work with Roberto I. Oliveira (IMPA).
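For reference, one standard way to write the bounded Lp-L2 moment condition on one-dimensional marginals is sketched below; the centering at the mean and the constant kappa are assumptions about the formulation, not necessarily the talk's exact statement.

```latex
% For some constant \kappa \ge 1 and p \ge 4, uniformly over unit vectors:
\left( \mathbb{E}\,\lvert \langle X - \mu, u \rangle \rvert^{p} \right)^{1/p}
\;\le\; \kappa \left( \mathbb{E}\,\langle X - \mu, u \rangle^{2} \right)^{1/2},
\qquad \forall\, u \in S^{d-1}.
```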
12/7/22
Professor Sumit Mukherjee and Professor Arian Maleki will both talk about their current research interests and what it’s like to do research with them.
Sumit’s Talk:
Variational inference/mean-field approximation is a powerful computational technique used to approximate complex distributions arising in machine learning and statistical physics. Typically, such applications focus on fast computation and are not always accompanied by rigorous guarantees of how good the approximations are (or whether they are any good at all). In this talk, I will touch upon (at a high level) some of the recent techniques used to provide rigorous guarantees for such approximations, and explain one example (the high-dimensional Bayesian linear regression model/Ising model).
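As a toy companion to Sumit’s abstract, the Python sketch below runs the naive mean-field fixed-point iteration for an Ising model; the couplings, field, and damping are arbitrary placeholders, and the talk concerns rigorous guarantees for such approximations rather than any particular implementation.

```python
import numpy as np

def naive_mean_field_ising(J, h, n_iter=500, damping=0.5):
    """Naive mean-field approximation for an Ising model.

    Iterates the fixed-point equations m_i = tanh(h_i + sum_j J_ij m_j)
    for the magnetizations m_i = E[x_i]. Toy sketch only.
    """
    m = np.zeros(len(h))
    for _ in range(n_iter):
        m = damping * m + (1 - damping) * np.tanh(h + J @ m)  # damped update
    return m

# Placeholder example: small random symmetric couplings, weak field.
rng = np.random.default_rng(0)
J = rng.normal(size=(10, 10)) / 10
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
print(naive_mean_field_ising(J, h=0.1 * np.ones(10)))
```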
Arian’s Talk:
I will mainly talk about the problems I would like to study in the next 4-5 years.
12/14/22
Professor Hongseok Namkoong (Columbia Business School)