Schedule for Spring 2024
Seminars are on Wednesdays
Time: 12:00 - 1:00pm
Location: Room 903, 1255 Amsterdam Avenue
Contacts: Wribhu Banik, Seunghyun Lee, Anirban Nath
1/24/2024 |
Speakers: Samory Kpotufe & Bodhi Sen (Columbia Stats) Title: TBA Abstract: Samory and Bodhi will both be talking about their current research interests and what it's like to do research with them. |
1/31, 2/7 |
NO SEMINAR |
2/14/2024 |
Speaker: Dan Lacker (Columbia IEOR) Title: The (projected) Langevin dynamics: sampling, optimal transport, and variational inference Abstract: This is a talk in two parts. The first part will survey a classical picture of the Langevin diffusion, with a focus on its applications to sampling and optimization. The second part will discuss my recent work on one or two (as time permits) analogous diffusion dynamics, which are designed to sample from probability measures arising in (1) entropic optimal transport and (2) mean field variational inference.
|
2/21/2024 |
Speaker: Genevera Allen (Rice) Title: Statistical Machine Learning for Scientific Discovery Abstract: In this talk, I will give an overview of my research program which develops new statistical machine learning techniques to help scientists make reproducible and reliable data-driven-discoveries from large and complex data. The first part will focus on an example of my research motivated by neuroscience: Understanding how large populations of neurons communicate in the brain at rest, in response to stimuli, or to produce behavior are fundamental open questions in neuroscience. Many approach this by estimating the intrinsic functional neuronal connectivity using probabilistic graphical models, but there remain major statistical and computational hurdles to graph learning from new large-scale calcium imaging technologies. I will highlight a new graph learning strategy my group has developed to address a critical unsolved neuroscience challenge that we call Graph Quilting, or graph learning from partial covariances resulting from non-simultaneously recorded neurons. The second part will focus on an example of my research in interpretable machine learning: Feature importance inference has been a long-standing statistical problem that helps promote scientific discoveries. Instead of testing for parameters that are only interpretable for specific models, there has been increasing interest in model-agnostic methods that can be applied to any statistical or machine learning model. I will highlight a new approach to feature occlusion or leave-one-covariate-out (LOCO) inference that leverages minipatch ensemble learning to increase statistical power and improve computational efficiency without making any limiting assumptions on the model or data distribution. Finally, I will conclude by highlighting current and future research directions in my group related to modern multivariate analysis, graphical models, ensemble learning, machine learning interpretability and fairness, and applications in neuroscience and genomics.
|
2/28/2024 |
Speaker: Dr. Simon Tavare (Columbia Stats and Bioscience)
Title: Cancer by the Numbers
Abstract: After a brief overview of the history of cancer, I will illustrate how the mathematical sciences can contribute to cancer research through a number of examples. Cancer development is characterized by occurrences of genomic alterations ranging in extent and impact, and the complex interdependence between these genomic events shapes the selection landscape. Stochastic modeling can help evaluate the role of each mutational process during tumor progression, but existing frameworks only capture certain aspects of tumorigenesis. I will outline CINner, a stochastic framework for modeling genomic diversity and selection during tumor evolution. The main advantage of CINner is its flexibility to incorporate many genomic events that directly impact cellular fitness, from driver gene mutations to copy number alterations (CNAs) including focal amplifications and deletions, mis-segregations, and whole-genome duplication. CINner raises a number of difficult statistical inference problems due to the lack of a feasible way to compute likelihoods. I will outline a new approach to approximate Bayesian computation – ABC – that exploits distributional random forests. I will give some examples of how this ABC-DRF methodology works in practice, and try to convince you that ABC has really come of age.
|
3/6/2024 |
Speaker: Zhongyuan Lyu (Columbia DSI) Title: Optimal Clustering of Multi-layer Networks Abstract: We study the fundamental limit of clustering networks when a multi-layer network is present. Under the mixture multi-layer stochastic block model (MMSBM), we show that the minimax optimal network clustering error rate, which takes an exponential form and is characterized by the Rényi-1/2 divergence between the edge probability distributions of the component networks. We propose a novel two-stage network clustering method including a tensor-based initialization and a one-step refinement procedure by likelihood-based Lloyd’s algorithm. Our proposed algorithm achieves the minimax optimal network clustering error rate and allows extreme network sparsity under MMSBM. We also extend our methodology and analysis framework to study the minimax optimal clustering error rate for mixture of discrete distributions including Binomial, Poisson, and multi-layer Poisson networks.
|
3/13/2024 |
NO SEMINAR
|
3/20/2024 |
Speaker: David Blei (Columbia Stats & CS) Title: Hierarchical Causal Models Abstract: Analyzing nested data with hierarchical models is a staple of Bayesian statistics, but causal modeling remains largely focused on "flat" models. In this talk, we will explore how to think about nested data in causal models, and we will consider the advantages of nested data over aggregate data (such as data means) for causal inference. We show that disaggregating your data---replacing a flat causal model with a hierarchical causal model---can provide new opportunities for identification and estimation. As examples, we will study how to identify and estimate causal effects under unmeasured confounders, interference, and instruments.
|
3/27/2024 |
Speaker: Alberto Gonzalez Sanz (Columbia Stats) Title: Abstract:
|
4/3/2024 |
Speaker: Yixin Wang (Michigan Stats) Title: Abstract:
|
4/10/2024 |
Speaker: Cindy Rush (Columbia Stats) Title: Abstract: |
4/17/2024 |
Speaker: Title: Abstract: |
4/24/2024 |
Speaker: Title: Abstract:
|
5/1/2024 |
Speaker: Title: Abstract:
|
5/8/2024 |
Speaker: Title: Abstract: |
5/15/2024 |
Speaker: Title: Abstract: |