Statistics Seminar Series
Semester Schedule: Statistics – Spring 2014
Seminars are on Mondays |
|
Feb 17
|
Rebecca C. Steorts (CMU)
Title: Will the Real Steve Fienberg Please Stand Up: Getting to Know a
Abstract: We propose a novel unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation is to represent the pattern of links between records as a bipartite graph, in which records are directly linked to latent true individuals, and only indirectly linked to other records. This flexible new representation of the linkage structure naturally allows us to estimate the attributes of the unique observable people in the population, calculate k-way posterior probabilities of matches across records, and propagate the uncertainty of record linkage into later analyses.
|
Feb 24 |
Alexandra Chronopoulou, CUNY
Title: Statistical Inference for fractional SDEs and applications
Abstract: Stochastic differential equations driven by fractional Brownian motion have an increasing presence in a wide range of applications, as they can successfully model phenomena that are characterized by long memory and/or self-similarity. In this talk, we will review their basic theoretical properties, focus on the statistical inference of their parameters, and discuss particular applications in mathematical finance. |
March 3 | Noureddine El Karoui, UC Berkeley
|
March 10 | Yi Yu, University of Cambridge
Title: Fused community detection
Abstract: Community detection is one of the most widely studied problems in
This is joint work with Dr. Yang Feng (Columbia University) and Prof. Richard J. Samworth
|
March 17 | Spring Recess
|
March 24 | Zongming Ma, UPENN
“Estimating High-dimensional Matrices: Convex Geometry and Computational Barriers”
In this talk, we introduce a unified approach for studying estimation of high-dimensional matrices, which yields tight non-asymptotic minimax rates for a large collection of loss functions in a variety of problems. Based on the convex geometry of finite-dimensional Banach spaces, the minimax rate of the oracle (unconstrained) matrix denoising problem is determined for all unitarily invariant norms. This result is then extended to denoising with submatrix sparsity, where the excess risk depends on the sparsity constraints in a completely different manner. The approach is also applicable to matrix completion under a low-rank constraint and extends beyond the normal mean model. In
|
March 31 | Jiashun Jin, CMU
Fast Network Community Detection by SCORE
Consider a network where the nodes split into K different communities. The |
April 7 | Yee Whye Teh
|
April 14 | Grant Weller, CMU
Title: Inference for Hidden Regular Variation in Multivariate Extremes
Abstract: A fundamental deficiency of classical multivariate extreme value theory is the inability to distinguish between asymptotic independence and exact independence. In this work, we examine multivariate threshold exceedance modeling in the framework of regular variation. Under this framework, dependence in the tail of a distribution is described by a limiting measure, which in some cases is degenerate on joint tail regions despite possible dependence in such regions at finite levels. Hidden regular variation, a higher-order tail decay on these regions, offers a refinement of the classical theory. We develop a representation of random vectors possessing hidden regular variation as the sum of independent regularly varying components. The representation is shown to be asymptotically valid via a multivariate tail equivalence result. We develop a likelihood-based estimation procedure from this representation via a Monte Carlo expectation-maximization algorithm which has been modified for tail estimation. The methodology is demonstrated on simulated data and applied to a bivariate series of air pollution measurements.
|
April 21 |
Mary Meyer, Colorado State University
“Variable and Shape Selection in the Generalized Additive Model”
The partial linear generalized additive model is considered, where the goal is to choose a subset of predictor variables and describe the component relationships with the response, in the |
April 28 | Liza Levina, University of Michigan
Title: Fast Community Detection in Large Sparse Networks
Abstract: Community detection is one of the fundamental problems in network analysis, with many diverse applications, and a lot of work has been done on models and algorithms that find communities. Perhaps the most commonly used probabilistic model for a network with communities is the stochastic block model, and many algorithms for fitting it have been proposed. Since finding communities involves optimizing over all possible assignments of discrete labels, most existing algorithms do not scale well to large networks, and many fail on sparse networks. In this talk, we propose a pseudo-likelihood approach for fitting the stochastic block model to address these shortcomings. Pseudo-likelihood is a general statistical principle that involves trading off some of the model complexity against computational efficiency. We also derive a variant that allows for arbitrary degree distributions in the network, making it suitable for fitting the more flexible degree-corrected stochastic block model. The pseudo-likelihood algorithm scales easily to networks with millions of nodes, performs well empirically under a range of settings, including on very sparse networks, and is asymptotically consistent under reasonable conditions. If time allows, I will also discuss spectral clustering with perturbations, a new method of independent interest that we use to initialize pseudo-likelihood, which works well on sparse networks where regular spectral clustering fails. |
May 5 | Juerg Huesler, University of Bern
On high exceedances and excursions |