Statistics Seminar Series

Choose which semester to display:

Schedule for Fall 2022

Seminars are on Mondays
Time: 4:00pm - 5:00pm

Location: Room 903 SSW, 1255 Amsterdam Avenue



Eren Kizildag (Columbia)

Title. Algorithmic Barriers in Random Optimization Problems via Overlap Gap Property

Abstract. Random combinatorial optimization problems often exhibit a so-called statistical-to-computational gap (SCG): the best efficient algorithmic guarantee is often worse than the existential guarantee. In this talk, we focus on the SCGs in two models and explore the limits of efficient algorithms:

Our first focus is on the symmetric binary perceptron model (SBP), a toy one-layer neural network model and a random constraint satisfaction problem. The SBP exhibits a dramatic SCG as well as a striking conundrum: the existence of polynomial-time algorithms coincides with an extreme form of clustering, where almost all of its solutions are isolated singletons at all subcritical densities. This conundrum challenges the view that clustering implies algorithmic hardness. In order to explain the true algorithmic tractability of this model, we conduct a different landscape analysis. Guided by insights from statistical physics, we show that the SBP exhibits the multi Overlap Gap Property (m-OGP), an intricate geometrical barrier for large classes of algorithms. Our analysis shows the m-OGP threshold (a) is well below the satisfiability threshold; and (b) is nearly tight: it asymptotically matches the best algorithmic threshold up to logarithmic factors. Next, we establish that m-OGP is a barrier to stable algorithms and we conjecture that the onset of m-OGP marks in fact the onset of true algorithmic hardness. The proof of this barrier uses Ramsey Theory from extremal combinatorics. We furthermore show that a known algorithm by Kim-Roche for the perceptron model is stable.

Our second focus is on the random number partitioning problem (NPP), a problem that has many applications in statistics, including the design of randomized controlled trials. The NPP exhibits yet another SCG. In order to address this gap, we establish the presence of OGP and subsequently leverage it to show that (a) stable algorithms fail to find near-optimal solutions and that (b) a very natural Monte Carlo Markov Chain dynamics mixes slowly. When m is constant, we show the presence of m-OGP for linear energy exponents as well as its absence for sub-linear exponents. Interestingly, however, by considering overlaps with growing values of m, we show the presence of OGP for a much broader range of energy exponents.

Based on joint works with David Gamarnik, Will Perkins and Changji Xu.


Jose Zubizarreta (Harvard)

Title: Bridging Matching, Regression, and Weighting as Mathematical Programs for Causal Inference

Abstract: A fundamental principle in the design of observational studies is to approximate the randomized experiment that would have been conducted under controlled circumstances. Across the health and social sciences, statistical methods for covariate adjustment are used in pursuit of this principle. Basic methods are matching, regression, and weighting. In this talk, we will examine the connections between these methods through their underlying mathematical programs. We will study their strengths and weaknesses in terms of study design, computational tractability, and statistical efficiency. Overall, we will discuss the role of mathematical optimization for the design and analysis of studies of causal effects.


Jianqing Fan (Princeton)

Title: How do noise tails impact on deep ReLU networks?

Abstract: This talk is on the stability of deep ReLU neural networks for nonparametric regression under the assumption that the noise has only a finite p-th moment. We unveil how the optimal rate of convergence depends on p, the degree of smoothness and the intrinsic dimension in a class of nonparametric regression functions with hierarchical composition structure when both the adaptive Huber loss and deep ReLU neural networks are used.  This optimal rate of convergence cannot be obtained by the ordinary least squares but can be achieved by the Huber loss with a properly chosen parameter that adapts to the sample size, smoothness, and moment parameters.  A concentration inequality for the adaptive Huber ReLU neural network estimators with allowable optimization errors is also derived.  To establish a matching lower bound within the class of neural network estimators using the Huber loss, we employ a different strategy from the traditional route:  constructing a deep ReLU network estimator that has a better empirical loss than the true function and the difference between these two functions furnishes a low bound.  This step is related to the Huberization bias, yet more critically to the approximability of  deep ReLU networks.   As a result, we also contribute some new results on the approximation theory of deep ReLU neural networks.

 (Joint work with Yihong Gu and Wenxin Zhou)


Genevera Allen (Rice)

Title: Graph Learning for Functional Neuronal Connectivity


Understanding how large populations of neurons communicate and jointly fire in the brain is a fundamental open question in neuroscience.  Many approach this by estimating the intrinsic functional neuronal connectivity using probabilistic graphical models.  But there remain major statistical and computational hurdles to estimating graphical models from new large-scale calcium imaging technologies and from huge projects which image up to one hundred thousand neurons in the active brain.   In this talk, I will highlight a number of new graph learning strategies my group has developed to address many critical unsolved challenges arising with large-scale neuroscience data.  Specifically, we will focus on Graph Quilting, in which we derive a method and theoretical guarantees for graph learning from non-simultaneously recorded and pairwise missing variables.  We will also highlight theory and methods for graph learning with latent variables via thresholding, graph learning for spikey data via extreme graphical models, and computational approaches for graph learning with huge data via minipatch learning.  Finally, we will demonstrate the utility of all approaches on synthetic data as well as real calcium imaging data for the task of estimating functional neuronal connectivity.


Gautam Kamath (U Waterloo)

Title: CoinPress: Practical Private Mean and Covariance Estimation

We consider point estimation and generation of confidence intervals under the constraint of differential privacy. We provide a simple and practical framework for these tasks in relatively general settings. Our investigation addresses a novel challenge that arises in the differentially private setting, which involves the cost of weak a priori bounds on the parameters of interest. This framework is applied to the problems of Gaussian mean and covariance estimation. Despite the simplicity of our method, we are able to achieve minimax near-optimal rates for these problems. Empirical evaluations, on the problems of mean estimation, covariance estimation, and principal component analysis, demonstrate significant improvements in comparison to previous work.

No knowledge of differential privacy will be assumed. Based on joint works with Sourav Biswas, Yihe Dong, Jerry Li, Vikrant Singhal, and Jonathan Ullman.



Xiaoyue Niu (Penn State)

Title: Learning Network Properties without Network Data -- A Correlated Network Scale-up Model


The network scale-up method based on ``how many X's do you know?'' questions has gained popularity in estimating the sizes of hard-to-reach populations. The success of the method relies primarily on the easy nature of the data collection and the flexibility of the procedure, especially since the model does not require a sample from the target population, a major limitation of traditional size estimation models. In this talk, we propose a new network scale-up model which incorporates respondent and subpopulation covariates in a regression framework, includes a bias term that are correlated between subpopulations, and introduce a new scaling procedure utilizing the correlation structure. In addition to estimating the unknown population sizes, our proposed model depicts people's social network patterns in an aggregated level without using the network data.


Zhigang Yao (NUS & Harvard)

Title: Principal flow, submanifold and boundary

Abstract: While classical statistics has dealt with observations which are real numbers or elements of a real vector space, nowadays many statistical problems of high interest in  the sciences deal with the analysis of data which consist of more complex objects, taking values in spaces which are naturally not (Euclidean) vector spaces but which  still feature some geometric structure. I will discuss the problem of finding principal components to the multivariate datasets, that lie on an embedded nonlinear Riemannian manifold within the higher-dimensional space. The aim is to extend the geometric interpretation of PCA, while being able to capture the non-geodesic form  of variation in the data. I will introduce the concept of a principal sub-manifold, a manifold passing through the center of the data, and at any point on the manifold  extending in the direction of highest variation in the space spanned by the eigenvectors of the local tangent space PCA. We show the principal sub-manifold yields the  usual principal components in Euclidean space. We illustrate how to find, use and interpret the principal sub-manifold, by which a principal boundary can be further defined  for data sets on manifolds.


Xuming He (U Michigan)

Title: Covariate-adjusted Expected Shortfall: Some Recent Developments

Abstract: Expected shortfall, measuring the average outcome (e.g., portfolio loss) above a given quantile
of its probability distribution, is a common financial risk measure. The same measure can be
used to characterize treatment effects in the tail of an outcome distribution, with applications
ranging from policy evaluation in economics and public health to biomedical investigations.
Expected shortfall regression is a natural approach of modeling covariate-adjusted expected
shortfalls. Because the expected shortfall cannot be written as a solution of an expected loss
function at the population level, computational as well as statistical challenges around
expected shortfall regression have led to stimulating research. We discuss some recent
developments in this area, with a focus on a new optimization-based semiparametric approach
to estimation of conditional expected shortfall that adapts well to data heterogeneity with
minimal model assumptions.


Academic Holiday


Sumanta Basu (Cornell)

Title: Frequency-domain graphical models for multivariate time series

Abstract: Graphical models offer a powerful framework to capture intertemporal and contemporaneous relationships among the components of a multivariate time series. For stationary time series, these relationships are encoded in the multivariate spectral density matrix and its inverse. In the first part of this talk, we will present adaptive thresholding and penalization methods for estimation of these objects under suitable sparsity assumptions. We will discuss new optimization algorithms for complex Lasso and Graphical Lasso, and investigate consistency of estimation under a double-asymptotic regime where the dimension of the time series increases with sample size. In the second part of the talk, we will introduce a frequency-domain graphical modeling framework for multivariate nonstationary time series that captures a new notion of conditional nonstationarity. We will provide numerical evidence that for locally stationary time series, conditional nonstationarity can be captured by nodewise regression of discrete Fourier transforms of different frequencies using a complex group Lasso.



Belinda Tzen (Columbia)

Title:  Theoretical guarantees for sampling and inference in generative models with latent diffusions

Abstract:  ``Neural differential equations'', as they are popularly known, have substantially influenced the recent discourse on deep learning theory, as well as had practical consequences:  some of the most compelling text-to-image models (e.g., Google's Imagen) are underpinned by diffusion processes.  In this talk, we discuss our theoretical contributions to this line of work.  In particular, we study a class of probabilistic generative models, where the latent object is a finite-dimensional diffusion process -- e.g., a ``neural SDE'' -- on a finite time interval, and the observed variable is drawn conditionally on the terminal point of the diffusion.  We provide a unified viewpoint on both sampling and variational inference in such generative models through the lens of stochastic control.  We quantify the expressiveness of diffusion-based generative models. Specifically, we show that one can efficiently sample from a wide class of terminal target distributions by choosing the drift of the latent diffusion from the class of multilayer feedforward neural nets, with the accuracy of sampling measuredby the Kullback-Leibler divergence to the target distribution.  Time permitting, we will also present and analyze a scheme for unbiased simulation of generative models with latent diffusions and provide bounds on the variance of the resulting estimators. 

This is joint work with Maxim Raginsky.



Promit Ghosal (MIT)

Title: Limit theorems for online Stochastic Gradient Descent in High dimensional regression.

Abstract: We consider the online stochastic gradient descent (SGD) iterates in high dimensional linear regression model. We show that linear statistics of the SGD iterates shows a central limit theorem when dimension grows exponentially in sample size. Furthermore, we show a mean-field limit of the trajectory of the online SGD iterates and retrieve a stochastic partial differential equation as the scaling limit of its fluctuation. We will discuss some of the applications of these results.

The talk will be based on ongoing joint works with Bhavya Agrawalla, Krishnakumar Balasubramanian and Ye He. 


Linjun Zhang (Rutgers)


Oscar Madrid Padilla (UCLA)