Schedule for Fall 2022
Seminars are on Mondays
Time: 4:00pm – 5:00pm
Location: Room 903 SSW, 1255 Amsterdam Avenue
9/12/22 |
Eren Kizildag (Columbia) Title: Algorithmic Barriers in Random Optimization Problems via Overlap Gap Property Abstract: Random combinatorial optimization problems often exhibit a so-called statistical-to-computational gap (SCG): the best efficient algorithmic guarantee is often worse than the existential guarantee. In this talk, we focus on the SCGs in two models and explore the limits of efficient algorithms. Our first focus is on the symmetric binary perceptron model (SBP), a toy one-layer neural network model and a random constraint satisfaction problem. The SBP exhibits a dramatic SCG as well as a striking conundrum: the existence of polynomial-time algorithms coincides with an extreme form of clustering, where almost all of its solutions are isolated singletons at all subcritical densities. This conundrum challenges the view that clustering implies algorithmic hardness. In order to explain the true algorithmic tractability of this model, we conduct a different landscape analysis. Guided by insights from statistical physics, we show that the SBP exhibits the multi-Overlap Gap Property (m-OGP), an intricate geometrical barrier for large classes of algorithms. Our analysis shows the m-OGP threshold (a) is well below the satisfiability threshold; and (b) is nearly tight: it asymptotically matches the best algorithmic threshold up to logarithmic factors. Next, we establish that m-OGP is a barrier to stable algorithms and we conjecture that the onset of m-OGP in fact marks the onset of true algorithmic hardness. The proof of this barrier uses Ramsey theory from extremal combinatorics. We furthermore show that a known algorithm by Kim-Roche for the perceptron model is stable. Our second focus is on the random number partitioning problem (NPP), a problem that has many applications in statistics, including the design of randomized controlled trials. The NPP exhibits yet another SCG. In order to address this gap, we establish the presence of OGP and subsequently leverage it to show that (a) stable algorithms fail to find near-optimal solutions and that (b) a very natural Markov chain Monte Carlo dynamics mixes slowly. When m is constant, we show the presence of m-OGP for linear energy exponents as well as its absence for sub-linear exponents. Interestingly, however, by considering overlaps with growing values of m, we show the presence of OGP for a much broader range of energy exponents. Based on joint works with David Gamarnik, Will Perkins, and Changji Xu. |
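As a toy illustration of the two random optimization problems in the talk above (the problem sizes, the constraint density, and the naive random search below are arbitrary choices for exposition, not part of the talk's analysis):

import numpy as np

rng = np.random.default_rng(0)

# Symmetric binary perceptron: a vector x in {-1,+1}^n is a solution if
# |<g_i, x>| <= kappa * sqrt(n) for every row g_i of a random Gaussian matrix.
n, alpha, kappa = 30, 0.5, 1.0
G = rng.standard_normal((int(alpha * n), n))

def sbp_satisfies(x):
    return bool(np.all(np.abs(G @ x) <= kappa * np.sqrt(n)))

x = rng.choice([-1, 1], size=n)
print("random x satisfies the SBP constraints:", sbp_satisfies(x))

# Number partitioning: split n numbers into two groups with small discrepancy.
a = rng.standard_normal(n)

def npp_discrepancy(signs):
    return abs(signs @ a)

# Cheap random search typically lands far above the exponentially small optimum,
# which is one way the statistical-to-computational gap manifests itself.
best = min(npp_discrepancy(rng.choice([-1, 1], size=n)) for _ in range(10_000))
print("best discrepancy found by random search:", best)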
9/19/22 |
Jose Zubizarreta (Harvard) Title: Bridging Matching, Regression, and Weighting as Mathematical Programs for Causal Inference Abstract: A fundamental principle in the design of observational studies is to approximate the randomized experiment that would have been conducted under controlled circumstances. Across the health and social sciences, statistical methods for covariate adjustment are used in pursuit of this principle. Basic methods are matching, regression, and weighting. In this talk, we will examine the connections between these methods through their underlying mathematical programs. We will study their strengths and weaknesses in terms of study design, computational tractability, and statistical efficiency. Overall, we will discuss the role of mathematical optimization for the design and analysis of studies of causal effects. |
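A small, self-contained example of covariate adjustment posed as a mathematical program (a generic toy formulation using scipy, not any particular estimator from the talk): find nonnegative weights on control units that exactly balance covariate means with the treated group while keeping the weights as even as possible, here by minimizing the largest weight, which can be written as a linear program.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)

n_c, n_t, p = 200, 50, 3
Xc = rng.standard_normal((n_c, p))            # control covariates
Xt = rng.standard_normal((n_t, p)) + 0.3      # treated covariates (shifted)
target = Xt.mean(axis=0)

# Variables: (w_1, ..., w_nc, t). Minimize t subject to w_i <= t, sum(w) = 1,
# Xc' w = treated covariate means, w >= 0.
c = np.r_[np.zeros(n_c), 1.0]
A_ub = np.c_[np.eye(n_c), -np.ones((n_c, 1))]          # w_i - t <= 0
b_ub = np.zeros(n_c)
A_eq = np.vstack([np.r_[np.ones(n_c), 0.0],
                  np.c_[Xc.T, np.zeros((p, 1))]])
b_eq = np.r_[1.0, target]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * n_c + [(None, None)])
w = res.x[:n_c]
print("worst-case mean imbalance:", np.abs(Xc.T @ w - target).max())
print("largest weight:", w.max())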
9/26/22 |
Jianqing Fan (Princeton) Title: How do noise tails impact on deep ReLU networks? Abstract: This talk is on the stability of deep ReLU neural networks for nonparametric regression under the assumption that the noise has only a finite p-th moment. We unveil how the optimal rate of convergence depends on p, the degree of smoothness, and the intrinsic dimension in a class of nonparametric regression functions with hierarchical composition structure when both the adaptive Huber loss and deep ReLU neural networks are used. This optimal rate of convergence cannot be obtained by ordinary least squares but can be achieved by the Huber loss with a properly chosen parameter that adapts to the sample size, smoothness, and moment parameters. A concentration inequality for the adaptive Huber ReLU neural network estimators with allowable optimization errors is also derived. To establish a matching lower bound within the class of neural network estimators using the Huber loss, we employ a different strategy from the traditional route: constructing a deep ReLU network estimator that has a better empirical loss than the true function; the difference between these two functions furnishes a lower bound. This step is related to the Huberization bias, yet more critically to the approximability of deep ReLU networks. As a result, we also contribute some new results on the approximation theory of deep ReLU neural networks. (Joint work with Yihong Gu and Wenxin Zhou) |
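A minimal sketch of the kind of estimator discussed in the talk above, assuming PyTorch; the network architecture, the Student-t noise, and the growth rate of the robustification parameter are placeholder choices for illustration, not the paper's specifications.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy nonparametric regression with heavy-tailed noise (finite low-order moments only).
n = 2000
X = torch.rand(n, 1)
noise = torch.distributions.StudentT(df=2.5).sample((n, 1))
y = torch.sin(2 * torch.pi * X) + 0.5 * noise

net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(),
                    nn.Linear(32, 32), nn.ReLU(),
                    nn.Linear(32, 1))

# Adaptive Huber loss: the robustification parameter grows with the sample size,
# trading bias against robustness to heavy tails (illustrative rate only).
delta = n ** 0.25
loss_fn = nn.HuberLoss(delta=delta)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss_fn(net(X), y).backward()
    opt.step()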
10/3/22 |
Genevera Allen (Rice) Title: Graph Learning for Functional Neuronal Connectivity Abstract: Understanding how large populations of neurons communicate and jointly fire in the brain is a fundamental open question in neuroscience. Many approach this by estimating the intrinsic functional neuronal connectivity using probabilistic graphical models. But there remain major statistical and computational hurdles to estimating graphical models from new large-scale calcium imaging technologies and from huge projects which image up to one hundred thousand neurons in the active brain. In this talk, I will highlight a number of new graph learning strategies my group has developed to address many critical unsolved challenges arising with large-scale neuroscience data. Specifically, we will focus on Graph Quilting, in which we derive a method and theoretical guarantees for graph learning from non-simultaneously recorded and pairwise missing variables. We will also highlight theory and methods for graph learning with latent variables via thresholding, graph learning for spikey data via extreme graphical models, and computational approaches for graph learning with huge data via minipatch learning. Finally, we will demonstrate the utility of all approaches on synthetic data as well as real calcium imaging data for the task of estimating functional neuronal connectivity. |
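For orientation, a baseline sparse graphical-model fit on simulated activity traces (a generic graphical lasso via scikit-learn, shown only as a point of reference; the talk's methods address settings this baseline cannot handle, such as non-simultaneously recorded neurons).

import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(2)

n_timepoints, n_neurons = 1000, 20
activity = rng.standard_normal((n_timepoints, n_neurons))
activity[:, 1] += 0.8 * activity[:, 0]     # induce one strong dependency

model = GraphicalLassoCV().fit(activity)
precision = model.precision_               # zero entries = estimated conditional independencies
print("estimated precision entry for the coupled pair:", round(precision[0, 1], 2))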
10/10/22 |
Gautam Kamath (U Waterloo) Title: CoinPress: Practical Private Mean and Covariance Estimation Abstract: No knowledge of differential privacy will be assumed. Based on joint works with Sourav Biswas, Yihe Dong, Jerry Li, Vikrant Singhal, and Jonathan Ullman. |
10/17/22 |
Xiaoyue Niu (Penn State) Title: Learning Network Properties without Network Data — A Correlated Network Scale-up Model Abstract: The network scale-up method based on “how many X’s do you know?” questions has gained popularity in estimating the sizes of hard-to-reach populations. The success of the method relies primarily on the easy nature of the data collection and the flexibility of the procedure, especially since the model does not require a sample from the target population, a major limitation of traditional size estimation models. In this talk, we propose a new network scale-up model which incorporates respondent and subpopulation covariates in a regression framework, includes bias terms that are correlated across subpopulations, and introduces a new scaling procedure utilizing the correlation structure. In addition to estimating the unknown population sizes, our proposed model depicts people’s social network patterns at an aggregate level without using network data. |
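For context, the classical ratio estimator that the scale-up literature builds on (a basic Killworth-style estimator on simulated data; the model proposed in the talk replaces this with a regression formulation with covariates and correlated bias terms).

import numpy as np

# y[i] = number of members of the hidden population respondent i reports knowing,
# d[i] = respondent i's estimated personal network size, N = total population size.
def basic_scale_up(y, d, N):
    return N * y.sum() / d.sum()

rng = np.random.default_rng(3)
N = 1_000_000
d = rng.poisson(300, size=500)               # simulated degrees
true_size = 5_000
y = rng.binomial(d, true_size / N)           # simulated "how many X's do you know?" answers
print("estimated hidden population size:", basic_scale_up(y, d, N))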
10/24/22 |
Zhigang Yao (NUS & Harvard) Title: Principal flow, submanifold and boundary Abstract: While classical statistics has dealt with observations which are real numbers or elements of a real vector space, nowadays many statistical problems of high interest in the sciences deal with the analysis of data which consist of more complex objects, taking values in spaces which are naturally not (Euclidean) vector spaces but which still feature some geometric structure. I will discuss the problem of finding principal components for multivariate datasets that lie on an embedded nonlinear Riemannian manifold within a higher-dimensional space. The aim is to extend the geometric interpretation of PCA, while being able to capture the non-geodesic form of variation in the data. I will introduce the concept of a principal sub-manifold: a manifold passing through the center of the data that, at any point, extends in the direction of highest variation in the space spanned by the eigenvectors of the local tangent-space PCA. We show that the principal sub-manifold yields the usual principal components in Euclidean space. We illustrate how to find, use, and interpret the principal sub-manifold, by which a principal boundary can be further defined for data sets on manifolds. |
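A toy sketch of the local tangent-space PCA ingredient mentioned above, for points on the 2-sphere (illustrative data and a single base point; the principal sub-manifold itself is obtained by following such directions, which this sketch does not attempt).

import numpy as np

rng = np.random.default_rng(4)

def tangent_pca_direction(points, base):
    # Project points onto the tangent plane of the unit sphere at `base`, then take the
    # leading eigenvector of the local covariance as the direction of highest variation.
    proj = points - np.outer(points @ base, base)    # remove the component along `base`
    vals, vecs = np.linalg.eigh(np.cov(proj.T))
    return vecs[:, -1]

# Points scattered around a great circle through the north pole (mostly varying in x).
theta = 0.3 * rng.standard_normal(200)
phi = 0.05 * rng.standard_normal(200)
pts = np.c_[np.sin(theta) * np.cos(phi), np.sin(phi), np.cos(theta) * np.cos(phi)]

base = np.array([0.0, 0.0, 1.0])                     # north pole
print("leading tangent direction at the pole:", np.round(tangent_pca_direction(pts, base), 2))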
10/31/22 |
Xuming He (U Michigan) Title: Covariate-adjusted Expected Shortfall: Some Recent Developments Abstract: Expected shortfall, measuring the average outcome (e.g., portfolio loss) above a given quantile |
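For background on the quantity named in the talk above (a standard textbook definition, not material from the abstract): the expected shortfall of an outcome $Y$ at level $\alpha$ is $\mathrm{ES}_{\alpha}(Y) = \mathbb{E}[\,Y \mid Y \ge q_{\alpha}(Y)\,]$, where $q_{\alpha}(Y) = \inf\{y : \Pr(Y \le y) \ge \alpha\}$ is the $\alpha$-quantile of $Y$.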
11/7/22 |
Academic Holiday |
11/14/22 |
Sumanta Basu (Cornell) Title: Frequency-domain graphical models for multivariate time series Abstract: Graphical models offer a powerful framework to capture intertemporal and contemporaneous relationships among the components of a multivariate time series. For stationary time series, these relationships are encoded in the multivariate spectral density matrix and its inverse. In the first part of this talk, we will present adaptive thresholding and penalization methods for estimation of these objects under suitable sparsity assumptions. We will discuss new optimization algorithms for complex Lasso and Graphical Lasso, and investigate consistency of estimation under a double-asymptotic regime where the dimension of the time series increases with sample size. In the second part of the talk, we will introduce a frequency-domain graphical modeling framework for multivariate nonstationary time series that captures a new notion of conditional nonstationarity. We will provide numerical evidence that for locally stationary time series, conditional nonstationarity can be captured by nodewise regression of discrete Fourier transforms of different frequencies using a complex group Lasso. |
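As a rough sketch of the frequency-domain objects involved in the first part of the talk above (a smoothed-periodogram estimate of the spectral density matrix at one frequency, followed by naive hard thresholding of its inverse; the talk's estimators use adaptive thresholding and penalization with supporting theory, which this sketch does not reproduce).

import numpy as np

rng = np.random.default_rng(5)

# Simulate a small stationary VAR(1) series as illustrative data.
p, T = 5, 2000
A = 0.4 * np.eye(p) + 0.2 * np.diag(np.ones(p - 1), 1)
X = np.zeros((T, p))
for t in range(1, T):
    X[t] = X[t - 1] @ A.T + rng.standard_normal(p)

# Smoothed periodogram at one Fourier frequency, then invert and hard-threshold.
dft = np.fft.fft(X - X.mean(0), axis=0) / np.sqrt(T)
j0, h = 100, 10                               # target frequency index and bandwidth
f_hat = np.zeros((p, p), dtype=complex)
for j in range(j0 - h, j0 + h + 1):
    f_hat += np.outer(dft[j], np.conj(dft[j]))
f_hat /= 2 * h + 1

theta_hat = np.linalg.inv(f_hat)              # inverse spectral density (encodes the graph)
theta_thr = np.where(np.abs(theta_hat) > 0.1, theta_hat, 0)
print(np.round(np.abs(theta_thr), 2))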
11/21/22 |
Belinda Tzen (Columbia) Title: Theoretical guarantees for sampling and inference in generative models with latent diffusions Abstract: “Neural differential equations”, as they are popularly known, have substantially influenced the recent discourse on deep learning theory, and have had practical consequences: some of the most compelling text-to-image models (e.g., Google’s Imagen) are underpinned by diffusion processes. In this talk, we discuss our theoretical contributions to this line of work. In particular, we study a class of probabilistic generative models, where the latent object is a finite-dimensional diffusion process — e.g., a “neural SDE” — on a finite time interval, and the observed variable is drawn conditionally on the terminal point of the diffusion. We provide a unified viewpoint on both sampling and variational inference in such generative models through the lens of stochastic control. We quantify the expressiveness of diffusion-based generative models. Specifically, we show that one can efficiently sample from a wide class of terminal target distributions by choosing the drift of the latent diffusion from the class of multilayer feedforward neural nets, with the accuracy of sampling measured by the Kullback-Leibler divergence to the target distribution. Time permitting, we will also present and analyze a scheme for unbiased simulation of generative models with latent diffusions and provide bounds on the variance of the resulting estimators. This is joint work with Maxim Raginsky. |
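A generic Euler-Maruyama sketch of a latent diffusion whose drift is a small feedforward network, assuming PyTorch (arbitrary architecture, horizon, and step size; not the specific sampling or inference scheme analyzed in the talk).

import torch
import torch.nn as nn

torch.manual_seed(0)

drift = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 2))   # input: (state, time)

@torch.no_grad()
def sample_terminal(n_paths=1000, n_steps=100, horizon=1.0):
    dt = horizon / n_steps
    z = torch.zeros(n_paths, 2)
    for k in range(n_steps):
        t = torch.full((n_paths, 1), k * dt)
        z = z + drift(torch.cat([z, t], dim=1)) * dt \
              + (dt ** 0.5) * torch.randn(n_paths, 2)
    return z   # terminal points; the observed variable would be drawn conditionally on these

z_T = sample_terminal()
print(z_T.mean(0), z_T.std(0))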
11/28/22 |
Promit Ghosal (MIT) Title: Limit theorems for online stochastic gradient descent in high-dimensional regression. Abstract: We consider the online stochastic gradient descent (SGD) iterates in the high-dimensional linear regression model. We show that linear statistics of the SGD iterates satisfy a central limit theorem when the dimension grows exponentially in the sample size. Furthermore, we show a mean-field limit of the trajectory of the online SGD iterates and retrieve a stochastic partial differential equation as the scaling limit of its fluctuations. We will discuss some of the applications of these results. The talk will be based on ongoing joint works with Bhavya Agrawalla, Krishnakumar Balasubramanian, and Ye He. |
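A minimal sketch of the online SGD iterates and a linear statistic of them (arbitrary dimension, step size, and data-generating model, chosen only for illustration; the talk concerns limit theorems for such statistics as the dimension grows with the sample size).

import numpy as np

rng = np.random.default_rng(6)

d, n_steps = 200, 50_000
theta_star = rng.standard_normal(d) / np.sqrt(d)
theta = np.zeros(d)
eta = 0.01                                   # constant step size (illustrative choice)

for _ in range(n_steps):
    x = rng.standard_normal(d)               # one fresh sample per step (online, single pass)
    y = x @ theta_star + rng.standard_normal()
    theta -= eta * (x @ theta - y) * x       # stochastic gradient of 0.5 * (x'theta - y)^2

v = rng.standard_normal(d) / np.sqrt(d)      # a fixed direction defining a linear statistic
print("linear statistic <v, theta>:", v @ theta, " target <v, theta*>:", v @ theta_star)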
12/5/22 |
Linjun Zhang (Rutgers) Title: The Cost of Privacy in Statistical Estimation
Abstract: Privacy-preserving techniques have become a central focus in modern data analysis. In this talk, we propose a general technique, the score attack, for lower bounding the differential-privacy-
Bio: Linjun Zhang is an Assistant Professor in the Department of Statistics at Rutgers University. He obtained his Ph.D. in Statistics at the Wharton School of the University of Pennsylvania in 2019, and received the J. Parker Bursk Memorial Prize and the Donald S. Murray Prize for excellence in research and teaching, respectively, upon graduation. His current research interests include privacy-preserving data analysis, algorithmic fairness, deep learning theory, and high-dimensional statistics.
|
12/12/22 |
Oscar Madrid Padilla (UCLA) Title: High Dimensional Latent Panel Quantile Regression with an Application to Asset Pricing
Abstract: We propose a generalization of the linear panel quantile regression model to accommodate both sparse and dense parts: sparse means that while the number of covariates available is large, potentially only a much smaller number of them have a nonzero impact on each conditional quantile of the response variable; the dense part is represented by a low-rank matrix that can be approximated by latent factors and their loadings. Such a structure poses problems for traditional sparse estimators, such as the $\ell_1$-penalised Quantile Regression, and for traditional latent factor estimators such as PCA. We propose a new estimation procedure, based on the ADMM algorithm, that consists of combining the quantile loss function with $\ell_1$ and nuclear norm regularization. We show, under general conditions, that our estimator can consistently estimate both the nonzero coefficients of the covariates and the latent low-rank matrix. This is done in a challenging setting that allows for temporal dependence, heavy-tailed distributions, and the presence of latent factors. Our proposed model has a “Characteristics + Latent Factors” Quantile Asset Pricing Model interpretation: we apply our model and estimator to a large-dimensional panel of financial data and find that (i) characteristics have sparser predictive power once latent factors are controlled for, and (ii) the factors and coefficients at upper and lower quantiles are different from the median.
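A sketch of the building blocks of the penalized objective described in the abstract above: the quantile (check) loss, the $\ell_1$ proximal step (soft-thresholding), and the nuclear-norm proximal step (singular-value soft-thresholding). These are the standard ingredients an ADMM solver would combine; the sketch is illustrative and is not the paper's algorithm.

import numpy as np

def check_loss(u, tau):
    # Quantile loss rho_tau(u) = u * (tau - 1{u < 0}), averaged over the panel.
    return np.mean(u * (tau - (u < 0)))

def objective(Y, X, beta, L, tau, lam1, lam2):
    # Y, L: T x N panel; X: T x N x p covariates; beta: p sparse coefficients.
    resid = Y - np.einsum('tnp,p->tn', X, beta) - L
    return check_loss(resid, tau) + lam1 * np.abs(beta).sum() + lam2 * np.linalg.norm(L, 'nuc')

def soft_threshold(v, lam):
    # Proximal operator of the l1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0)

def svd_threshold(M, lam):
    # Proximal operator of the nuclear norm: soft-threshold the singular values.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0)) @ Vt

rng = np.random.default_rng(7)
print(soft_threshold(rng.standard_normal(5), 0.5))
print("rank after thresholding:", np.linalg.matrix_rank(svd_threshold(rng.standard_normal((8, 6)), 2.0)))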
Bio: Oscar Hernan Madrid Padilla is a Tenure-track Assistant Professor in the Department of Statistics at the University of California, Los Angeles. Previously, from July 2017 to June 2019, he was a Neyman Visiting Assistant Professor in the Department of Statistics at the University of California, Berkeley. Before that, he earned a Ph.D. in statistics at The University of Texas at Austin in May 2017 under the supervision of Prof. James Scott. His undergraduate degree was a B.S. in Mathematics completed at CIMAT (in Mexico) in April 2013, advised by Prof. Daniel Hernandez-Hernandez.
|