Schedule for Fall 2024
Seminars are on Mondays
Time: 4:10 pm - 5:00 pm
Location: Room 903 SSW, 1255 Amsterdam Avenue
9/9/2024
Caleb Miles (Columbia University)
Title: Optimal Tests of the Composite Null Hypothesis Arising in Mediation Analysis
Abstract: The indirect effect of an exposure on an outcome through an intermediate variable can be identified by a product of regression coefficients under certain causal and regression modeling assumptions. In this context, the null hypothesis of no indirect effect is a composite null hypothesis, as the null holds if either regression coefficient is zero. A consequence is that traditional hypothesis tests are severely underpowered near the origin (i.e., when both coefficients are small relative to their standard errors). We propose hypothesis tests that (i) preserve level-alpha type I error, (ii) meaningfully improve power when both true underlying effects are small relative to sample size, and (iii) preserve power when at least one is not. One approach gives a closed-form test that is minimax optimal with respect to local power over the alternative parameter space. Another uses sparse linear programming to produce an approximately optimal test for a Bayes risk criterion. We also propose adaptations to large-scale hypothesis testing as well as modifications that yield improved interpretability. We provide an R package that implements the minimax optimal test.
Bio: Dr. Caleb Miles works on developing semiparametric methods for causal inference and applying them to problems in public health. His applied work is largely in HIV/AIDS, psychiatry, anesthesiology, and drug abuse. His methodological research interests include causal inference, its intersection with machine learning, mediation analysis, transportability/generalizability, and measurement error.
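For orientation, the textbook product-of-coefficients setup the abstract alludes to is (standard linear structural equations, not necessarily the speaker's exact formulation):
\[
M = \alpha X + \varepsilon_M, \qquad Y = \beta M + \gamma X + \varepsilon_Y,
\]
\[
\text{indirect effect} = \alpha\beta, \qquad H_0 : \alpha\beta = 0 \iff (\alpha = 0) \text{ or } (\beta = 0).
\]
The composite structure is what degrades power near $\alpha = \beta = 0$: a valid test must control type I error over the whole null set $\{\alpha = 0\} \cup \{\beta = 0\}$, and classical tests become conservative at its intersection point.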

9/16/2024
Alexander Aue (University of California, Davis)
Title: Testing general linear hypotheses in a high-dimensional regression model with spiked covariance
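Since no abstract was provided, here are the standard definitions behind the title, included only for orientation: a general linear hypothesis in the regression model $y = X\beta + \varepsilon$ takes the form
\[
H_0 : C\beta = c_0
\]
for a known constraint matrix $C$, and a spiked covariance model posits a population covariance
\[
\Sigma = \sigma^2 I_p + \sum_{k=1}^{K} \lambda_k v_k v_k^\top
\]
with a few large eigenvalues $\lambda_k$ rising above a noise bulk.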

9/23/2024
Pragya Sur (Harvard University)
Title: Spectrum-Aware Debiasing: A Modern Inference Framework with Applications to Principal Components Regression
Abstract: Debiasing methodologies have emerged as a powerful tool for statistical inference in high dimensions. Since its original introduction, the methodology has witnessed a major advancement with the introduction of degrees-of-freedom debiasing in Bellec and Zhang (2019). While overcoming limitations of initial debiasing approaches, this updated method suffered a limitation of its own: it relied on sub-Gaussian tails and independent, identically distributed samples. In this talk, we propose a novel debiasing formula that breaks this barrier by exploiting the spectrum of the sample covariance matrix. Our formula applies to a broader class of designs known as right rotationally invariant designs, which include some heavy-tailed distributions as well as certain dependent data settings. Our correction term differs significantly from prior work but recovers the Gaussian-based formula as a special case. Notably, our approach does not require estimating the high-dimensional population covariance matrix, yet can account for dependence among features and samples. We demonstrate the utility of our method for several statistical inference problems. As a by-product, our work also introduces the first debiased principal component regression estimator with formal guarantees in high dimensions. This is based on joint work with Yufan Li.
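For readers new to debiasing, the generic one-step correction has the schematic form (a textbook version under Gaussian designs, not the spectrum-aware formula of the talk):
\[
\hat\beta^{\mathrm{deb}} = \hat\beta + \frac{1}{n}\, \hat\Theta X^\top (y - X\hat\beta),
\]
where $\hat\Theta$ approximates the inverse population covariance. Degrees-of-freedom debiasing replaces the factor $1/n$ by $1/(n - \widehat{\mathrm{df}})$, with $\widehat{\mathrm{df}}$ the effective degrees of freedom of the initial estimator.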

9/30/2024
Yihong Wu (Yale University)
Title: Recent advances in Empirical Bayes: statistical and optimization perspective
Abstract: Introduced by Robbins in the 1950s, Empirical Bayes is a powerful and popular framework for large-scale inference that aims at learning and adapting to latent structure in data by finding data-driven estimators that compete with the Bayesian oracle that knows the true prior. This talk surveys some recent theoretical, methodological, and algorithmic advances in empirical Bayes, in both classical sequence models and extensions where latent variables and data interact through more complex designs. A central theme of this talk is the nonparametric maximum likelihood estimator of Kiefer and Wolfowitz. Along the way, I will introduce various open problems in the theory and practice of empirical Bayes. Time permitting, I will discuss applications of empirical Bayes techniques to score matching and diffusion models.
Bio: Yihong Wu is the James A. Attwood Professor and Department Chair of Statistics and Data Science at Yale University. He received his B.E. degree from Tsinghua University in 2006 and his Ph.D. degree from Princeton University in 2011. He was a postdoctoral fellow in the Statistics Department of the Wharton School at the University of Pennsylvania from 2011 to 2012 and an assistant professor in the Department of ECE at the University of Illinois at Urbana-Champaign from 2013 to 2015. His research interests are in the theoretical and algorithmic aspects of high-dimensional statistics, information theory, and optimization. He was elected an IMS Fellow in 2023 and was a recipient of the NSF CAREER award in 2017 and the Sloan Research Fellowship in Mathematics in 2018.
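As background on the Kiefer-Wolfowitz estimator mentioned above (standard formulation, stated here for orientation): in the Gaussian sequence model $X_i = \theta_i + Z_i$ with $\theta_i \sim G$ unknown, the NPMLE solves
\[
\hat G \in \arg\max_{G} \sum_{i=1}^{n} \log \int \varphi(X_i - \theta)\, dG(\theta),
\]
a convex (infinite-dimensional) optimization over all priors $G$; the fitted $\hat G$ is then plugged into the Bayes rule, e.g. via Tweedie's formula for the posterior mean.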

10/7/2024
Nicolas Garcia-Trillos (UW Madison)
Title: Minimax rates for the learning of spectra of differential operators from data
Abstract: The field of graph-based learning is closely connected to manifold learning. It explores the following question: given a collection of data points $x_1, \dots, x_n$ and a similarity graph over them, how can one use this graph to learn relevant geometric features of the dataset and, in turn, learn about the distribution that generated it? The question becomes a geometric or analytical problem when one assumes that the sampling distribution is supported on an unknown low-dimensional manifold, as postulated by the manifold hypothesis. In this talk, I will argue that, despite the many questions and answers explored in the area of graph-based learning, several fundamental questions in statistical theory remain largely unexplored, all of which are essential for manifold learning. Examples of these questions include: 1) What is the best possible estimator (potentially not graph-based), from a sample-efficiency perspective, for learning features of unknown manifolds from observed data? 2) What is the asymptotic efficiency of popular graph-based estimators used in unsupervised learning? I will focus on the first type of question in the context of learning the spectra of elliptic differential operators from data, and will present new results that can be interpreted as a first step in bridging the gap between the mathematical analysis of graph-based learning and the analysis of fundamental statistical questions like the ones mentioned above. Throughout the talk, I will highlight the connection between the spectrum estimation and density estimation problems, and through this connection I will motivate a series of open mathematical questions related to operator learning and generative models using contemporary machine learning tools. This talk is based on very recent work with my PhD student Chenghui Li (UW-Madison) and with Raghavendra Venkatraman (Utah).
Bio: Nicolás García Trillos is an associate professor in the Department of Statistics at the University of Wisconsin-Madison. He works at the intersection of applied analysis, applied probability, machine learning, and statistics. Some of his current interests include adversarial machine learning, operator learning, and optimal transport in statistics.
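To make the graph-based pipeline concrete, here is a minimal sketch (an editorial illustration, not the speakers' method) that samples points from a circle, builds a Gaussian similarity graph, and inspects the low end of the graph-Laplacian spectrum:

    import numpy as np
    from scipy.spatial.distance import cdist

    # Sample n points from the unit circle, a simple 1-D manifold in R^2.
    rng = np.random.default_rng(0)
    n = 500
    angles = rng.uniform(0.0, 2.0 * np.pi, size=n)
    X = np.column_stack([np.cos(angles), np.sin(angles)])

    # Gaussian similarity graph with bandwidth eps (a tuning parameter).
    eps = 0.3
    W = np.exp(-cdist(X, X) ** 2 / (2.0 * eps**2))
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W. With a suitable rescaling
    # (omitted; constants depend on eps, n, and the manifold dimension),
    # its low eigenvalues approximate those of the Laplace-Beltrami
    # operator -- for the circle, 0, 1, 1, 4, 4, ... up to a common scale.
    L = np.diag(W.sum(axis=1)) - W
    print(np.linalg.eigvalsh(L)[:5])

The statistical question in the talk is how accurately such spectra can be estimated from $n$ samples, and whether graph-based estimators attain the minimax rate.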

10/14/2024
Yao Xie (Georgia Institute of Technology)
Title: Generative model for statistical inference: Iterative algorithm in probability space
Abstract: Generative AI (GenAI), with its ability to synthesize and generate data, offers great potential for addressing high-dimensional data challenges. While generative models have advanced considerably, their full impact on high-dimensional statistical inference remains underexplored. We view generative models as neural network-based representations of continuous probability density functions learned from data, and propose a framework that adapts flow-based generative models for statistical inference and decision-making under uncertainty. Our approach moves beyond merely generating data resembling observations, emphasizing instead the mapping of a reference measure to a target distribution, whether defined by samples, an optimization solution, or a parametric form up to an unknown constant. We demonstrate the utility of this framework for density estimation, distributionally robust optimization (DRO), posterior sampling, and importance sampling.
Bio: Yao Xie is the Coca-Cola Foundation Chair and Professor at the Georgia Institute of Technology in the H. Milton Stewart School of Industrial and Systems Engineering, and Associate Director of the Machine Learning Center. From September 2017 until May 2023, she was the Harold R. and Mary Anne Nash Early Career Professor. She received her Ph.D. in Electrical Engineering (minor in Mathematics) from Stanford University in 2012 and was a Research Scientist at Duke University. Her research lies at the intersection of statistics, machine learning, and optimization, developing computationally efficient and statistically powerful methods for problems motivated by real-world applications. She received the National Science Foundation (NSF) CAREER Award in 2017, the INFORMS Gaver Early Career Award for Excellence in Operations Research in 2022, and the CWS Woodroofe Award in 2024. She is currently an Associate Editor for IEEE Transactions on Information Theory, the Journal of the American Statistical Association (Theory and Methods), The American Statistician, Operations Research, Sequential Analysis: Design Methods and Applications, and the INFORMS Journal on Data Science, and an Area Chair of NeurIPS, ICML, and ICLR.
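For background, the flow-based models the abstract refers to rest on the change-of-variables identity (standard material, included for orientation): if $T$ is an invertible map pushing a reference density $p_0$ forward to the model density $p$, then
\[
p(x) = p_0\!\left(T^{-1}(x)\right)\, \bigl|\det \nabla T^{-1}(x)\bigr|,
\]
so learning $T$ simultaneously yields samples ($x = T(z)$ with $z \sim p_0$) and exact density evaluations.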

10/17/2024
Peng Ding (Berkeley)
Title: Factorial Difference-in-Differences
Abstract: In many social science applications, researchers use the difference-in-differences (DID) estimator to establish causal relationships, exploiting cross-sectional variation in a baseline factor and temporal variation in exposure to an event that may affect all units. This approach, which we term factorial DID (FDID), differs from canonical DID in that it lacks a clean control group unexposed to the event after the event occurs. In this paper, we clarify FDID as a research design in terms of its data structure, feasible estimands, and the identifying assumptions that allow the DID estimator to recover these estimands. We frame FDID as a factorial design with two factors: the baseline factor, denoted by G, and the exposure level to the event, denoted by Z, and we define effect modification and causal interaction as the associative and causal effects of G on the effect of Z, respectively. We show that under the canonical no-anticipation and parallel-trends assumptions, the DID estimator identifies only the effect modification of G in FDID, and we propose an additional factorial parallel-trends assumption to identify the causal interaction. Moreover, we show that the canonical DID research design can be reframed as a special case of the FDID research design with an additional exclusion-restriction assumption, thereby reconciling the two approaches. We extend this framework to allow conditionally valid parallel-trends assumptions and multiple time periods, and we clarify the assumptions required to justify regression analysis under FDID. We illustrate these findings with empirical examples from economics and political science, and provide recommendations for improving practice and interpretation under FDID.
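For reference, the canonical two-period DID estimator that the abstract builds on compares pre/post changes across groups (standard notation, assumed here):
\[
\hat\tau_{\mathrm{DID}} = \left( \bar Y_{G=1,\,\mathrm{post}} - \bar Y_{G=1,\,\mathrm{pre}} \right) - \left( \bar Y_{G=0,\,\mathrm{post}} - \bar Y_{G=0,\,\mathrm{pre}} \right).
\]
In FDID both groups are exposed to the event in the post-period, so this contrast must be interpreted through the factors $G$ and $Z$ rather than against a clean control group.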

10/21/2024
Johannes Schmidt-Hieber (University of Twente)
Title: A statistical analysis of image classification
Abstract: The availability of massive image databases has resulted in the development of scalable machine learning methods, such as convolutional neural networks (CNNs), for filtering and processing these data. While recent theoretical work on CNNs focuses on standard nonparametric denoising problems, the variability in image classification datasets originates not from additive noise but from variation in the shape and other characteristics of the same object across different images. To address this problem, we consider a simple supervised classification problem for object detection on grayscale images. From the function estimation point of view, every pixel is a variable, and large images lead to high-dimensional function recovery tasks suffering from the curse of dimensionality; in our image deformation model, however, increasing the number of pixels enhances the image resolution and makes the object classification problem easier. We propose and theoretically analyze two different procedures. The first method estimates the image deformation by support alignment. Under a minimal separation condition, it is shown that perfect classification is possible. The second method fits a CNN to the data. We derive a rate for the misclassification error depending on the sample size and the number of pixels. Both classifiers are empirically compared on images generated from the MNIST handwritten digit database. The obtained results corroborate the theoretical findings. This is joint work with Sophie Langer and Juntong Chen.
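As a concrete point of reference for the second procedure, here is a minimal generic CNN for 28x28 grayscale inputs of the kind used with MNIST; an illustrative baseline, not the architecture analyzed in the talk:

    import torch
    import torch.nn as nn

    class SmallCNN(nn.Module):
        """Generic two-block CNN for 28x28 grayscale images (e.g., MNIST)."""
        def __init__(self, num_classes: int = 10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),  # 28x28 -> 14x14
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),  # 14x14 -> 7x7
            )
            self.classifier = nn.Linear(32 * 7 * 7, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.classifier(self.features(x).flatten(1))

    # Smoke test on a random batch of 8 images.
    logits = SmallCNN()(torch.randn(8, 1, 28, 28))
    print(logits.shape)  # torch.Size([8, 10])

The theoretical question in the talk is how the misclassification error of such a fitted classifier scales jointly in the sample size and the number of pixels.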

10/28/2024
Jeremias Knoblauch (University College London)
Title: Post-Bayesian machine learning
Abstract: In this talk, I provide my perspective on the machine learning community's efforts to develop inference procedures with Bayesian characteristics that go beyond Bayes' rule as an epistemological principle. I will explain why these efforts are needed, as well as the forms they take. Focusing on some of my own contributions to the field, I will trace out some of the community's most important milestones, as well as the challenges that lie ahead. I will provide success stories from the field, and emphasise the new opportunities that open up to us once we dare to go beyond orthodox Bayesian procedures.
Keywords: generalised Bayes, robustness, Bayesian machine learning
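For orientation, generalised-Bayes posteriors in this line of work replace the likelihood in Bayes' rule with a loss-based update (a standard formulation, not necessarily the one used in the talk):
\[
\pi_\lambda(\theta \mid x_{1:n}) \propto \pi(\theta)\, \exp\!\Big( -\lambda \sum_{i=1}^{n} \ell(\theta, x_i) \Big),
\]
which recovers the ordinary posterior when $\ell$ is the negative log-likelihood and $\lambda = 1$, and gains robustness for other choices of loss and learning rate.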

11/4/2024
HOLIDAY

11/11/2024
Adel Javanmard (University of Southern California)
Title: Learning from aggregated responses: Improving model utility under privacy constraints
Abstract: In many real-world scenarios, training data are aggregated before being shared with the learner in order to protect users' privacy. This talk explores recent advancements in aggregate learning, where datasets are grouped into bags of samples with only a summary response available for each bag. I will discuss novel loss constructions and bagging schemes that enhance model accuracy while maintaining privacy. Key topics include using priors to inform bag construction and an iterative boosting algorithm that refines priors through sample splitting. Additionally, I will discuss learning rules in this framework that achieve PAC learning guarantees for classification. While Empirical Proportional Risk Minimization (EPRM) achieves fast rates under realizability, it can falter in agnostic settings. To address this, I will introduce a debiased proportional square loss that achieves "optimistic rates" in both realizable and agnostic scenarios.
Bio: Adel Javanmard is the Dean's Associate Professor in the Department of Data Sciences and Operations at the University of Southern California (USC), where he also holds a courtesy appointment with the computer science department. His research interests are broadly in the areas of high-dimensional statistics, machine learning, optimization, and personalized decision-making. He is the recipient of several awards and fellowships, including the Alfred P. Sloan Research Fellowship in Mathematics, the IMS Tweedie New Researcher Award, the NSF CAREER award, the Thomas Cover Dissertation Award from the IEEE Information Theory Society, and several faculty awards from Google Research, Adobe Research, and Amazon.
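To fix ideas, one common squared-loss formulation of learning from bag-level responses is sketched below; this is an editorial paraphrase of the setting, and the talk's exact losses may differ. Given bags $B_1, \dots, B_m$ with bag-average responses $\bar y_b$, a proportional-risk estimator solves
\[
\hat f \in \arg\min_{f \in \mathcal F} \; \sum_{b=1}^{m} \Big( \bar y_b - \frac{1}{|B_b|} \sum_{i \in B_b} f(x_i) \Big)^{2},
\]
and the debiasing mentioned in the abstract corrects the bias that this aggregation introduces relative to the individual-level square loss.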

11/18/2024
Boris Hanin (Princeton University)
Title: Scaling Limits of Neural Networks
Abstract: Neural networks are often studied analytically through scaling limits: regimes in which structural network parameters, such as depth, width, and the number of training datapoints, are taken to infinity, yielding simplified models of learning. I will survey several such approaches with the goal of illustrating the rich and still not fully understood space of possible behaviors when some or all of the network's structural parameters are large.

11/25/2024
Omiros Papaspiliopoulos (Bocconi University)
Title: Variational inference for bi-linear mixed models: A study in ideal points
Abstract: Generalized bi-linear mixed models are the workhorse of applied statistics, used for varied tasks such as small-area estimation, item response theory, recommendation, and the analysis of networks. In modern applications it is common that both the size of the data and the number of random effects are large.
Bio: Omiros Papaspiliopoulos is a Full Professor at Bocconi University; prior to this he held faculty positions at UPF in Barcelona and Warwick University and postdoctoral positions at Lancaster and Oxford, and he has also worked in Berlin, Osaka, Paris, Madrid, and Lima. He has been a co-editor of Biometrika since 2018 and has served as an Associate Editor for JRSS-B. He received the Guy Medal in 2010 from the Royal Statistical Society, and the DeGroot Prize in 2021 for his book "An Introduction to Sequential Monte Carlo" with Nicolas Chopin. He has extensive experience directing postgraduate and undergraduate programs and in executive education, including teaching in the SDA Bocconi MBA program.
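For orientation, a canonical bi-linear specification in the ideal-point setting is the two-parameter item response model (one standard choice; the talk's exact parameterization may differ):
\[
\operatorname{logit} \Pr(y_{ij} = 1) = \alpha_j + \beta_j \theta_i,
\]
where $\theta_i$ is the ideal point of respondent $i$ and $(\alpha_j, \beta_j)$ are item parameters. The product $\beta_j \theta_i$ of two sets of random effects is what makes the model bi-linear and complicates variational approximations at scale.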

12/2/2024
Chao Gao (University of Chicago)
Title: TBA

12/9/2024
TBA