Student Seminar – Fall 2020

Schedule for Fall 2020

The Student Seminar has migrated to Zoom for the Fall 2020 semester.

Seminars are on Wednesdays
Time: 12:00 – 1:00pm

Contacts: Diane Lu, Leon Fernandes

Information for speakers: For information about the schedule, directions, equipment, reimbursement, and hotels, please click here.


Welcome to the New Academic Year & Campuswire Workshop

11:30am – 12:00pm: Welcome to the New Academic Year.
12:00pm – 1:00pm: Campuswire Workshop.


Elliott Rodriguez, Ding Zhou, Zhi Wang, Yuanzhe Xu and others (Columbia)

“Sharing Summer Internship Experiences”


George Hripcsak (Columbia)

Title: Drawing reproducible conclusions from observational medical data with OHDSI
Observational Health Data Sciences and Informatics (OHDSI) is a multi-stakeholder, interdisciplinary, international collaborative with a coordinating center at Columbia University. Its mission is to improve health by empowering a community to collaboratively generate the evidence that promotes better health decisions and better care. With 300 researchers from 30 countries and 600 million unique patients, OHDSI carries out federated studies at sufficient scale to answer questions about diagnosis and treatment. Initial studies provided insights on treatment pathways for chronic diseases around the world. Current work addresses the bias inherent in the medical literature by carrying out research at large scale, automating the analysis, correcting for confounding, and calibrating on residual confounding. OHDSI has produced evidence to inform the US and the European hypertension guidelines, running over half a million hypotheses related to hypertension treatment. Its large-scale propensity score (LSPS) algorithm has been demonstrated to handle not only measured confounders but also important confounders that normally go unmeasured. A possible mechanism for its success will be discussed.



Two Sigma

Title: Two Sigma Quant Talk

At Two Sigma, our community of scientists, technologists and academics collaborates to solve some of the most challenging economic problems.

We rely on the scientific method, rooted in hypothesis, analysis, and experimentation, to drive data-driven decisions, to manage risk, and to expand into new areas of focus. In this way, we create systematic tools and technologies to forecast the future of global markets.

If you’re interested in hearing more about applying the scientific method to modeling, please join our Quant Talk. We hope to see you there!

Our Quant Researchers Include:

  • Yuting graduated from Columbia in 2016 with a PhD in Statistics and currently works as a modeler at Two Sigma. Her day-to-day work includes predictive modeling and machine learning research.
  • Richard graduated in 2017 with a PhD in Statistics from Columbia University’s Statistics Department, where he worked on credit risk modeling and, to his own surprise, on quantum materials science. He is now a quantitative researcher at Two Sigma, developing predictive models for time series/panel data. Please join him in this week’s Student Seminar to learn more about statistical work in the world of systematic trading, and how to successfully transition from academia to industry research.


James Roger (Metrum Research Group)

Title: Pharmacometrics is Like This
Scientists working in biomedical research often have some sense of what to expect from a proper “biostatistician”, but relatively few know what to make of a statistician who calls himself or herself a “pharmacometrician”. Thus freed from the shackles of other people’s expectations, the pharmacometric statistician encounters problems and opportunities that are different from those encountered by the more conventionally branded biostatistician. Generally speaking, “pharmacometric analyses” put greater emphasis on understanding data generating mechanisms and evaluating associated causal narratives. In this talk I will try to convey the spirit and the value of pharmacometric approaches by way of three real examples. 
  • The first example will be from Alzheimer’s Disease, where a joint (multiple endpoint) longitudinal model was used to help select a dose for a phase 3 trial. 
  • The second example will be based on a pharmacokinetic / pharmacodynamic model for blood pressure, used to determine the maximum tolerated dose that could be studied in subsequent trials. 
  • The third example will be from Multiple Sclerosis, where a joint (multiple endpoint) sort-of-causal longitudinal model will be used as the basis for extrapolation into a previously unstudied “prodromal” population. 

I won’t have time to discuss any single application in depth, but I will try to convey the broad contours of each model and give a sense of the value proposition associated with each analysis.


Prof. Victor H. de la Pena (Columbia)

Title: Some Open Problems in Probability and Statistics

Abstract: In this talk I will discuss a few open problems. The main references for the talk are:


  1. de la Pena, V. H. and Gine, E. (1999). Decoupling: From Dependence to Independence. Springer.
  2. de la Pena, V. H. (2019). From Decoupling and Self-Normalization to Machine Learning. Notices of the American Mathematical Society, November issue.
  3. de la Pena, V. H., Lai, T. and Shao, Q. (2009). Self-Normalized Processes: Limit Theorems and Statistical Applications. Springer.



Sumit Mukherjee (Columbia) 

Title: Viewing a permutation as a copula

Abstract: The idea of viewing a permutation as a copula (i.e., a probability measure on the unit square with uniform marginals) originated in Combinatorics. Using this representation, we can compute limiting properties of various statistics under non-uniform probability models on the space of permutations. Examples include the number of fixed points, the number of cycles of a given length, and the number of inversions. Focusing on Statistics, we analyze a class of non-uniform probability measures on permutations, which includes the celebrated Mallows models. We compute the limiting log normalizing constant for such models, and give an iterative algorithm for computing this limit. We also show consistency of the MLE and the pseudo-likelihood estimator in these models.
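
For readers new to the objects in this abstract, here is a minimal illustrative sketch in Python. It is not taken from the talk. It assumes the usual construction in which a permutation of size n places mass 1/n uniformly on each of n small squares of the unit square, and it uses the Mallows model with the inversion count (Kendall's tau) as one example of a non-uniform model. All function names are hypothetical.

```python
from itertools import permutations
from math import isclose


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def fixed_points(perm):
    """Number of positions i with perm[i] == i (0-based permutation)."""
    return sum(1 for i, p in enumerate(perm) if i == p)


def copula_mass(perm, a, b):
    """Mass that the permutation's copula assigns to [0, a] x [0, b].

    Assumed construction: cell i is the square
    [i/n, (i+1)/n] x [perm[i]/n, (perm[i]+1)/n], carrying mass 1/n
    spread uniformly (density n on the cell).
    """
    n = len(perm)

    def overlap(lo, hi, t):
        # Length of the intersection of [lo, hi] with [0, t], for lo >= 0.
        return max(0.0, min(hi, t) - lo)

    return sum(
        n * overlap(i / n, (i + 1) / n, a) * overlap(p / n, (p + 1) / n, b)
        for i, p in enumerate(perm)
    )


def mallows_normalizer(n, q):
    """Brute-force normalizing constant: sum of q**inversions over S_n."""
    return sum(q ** inversions(p) for p in permutations(range(n)))


def mallows_normalizer_product(n, q):
    """Product form prod_{i=1..n} (1 + q + ... + q**(i-1)) for the same model."""
    z = 1.0
    for i in range(1, n + 1):
        z *= sum(q ** k for k in range(i))
    return z


perm = (2, 0, 1)
# Uniform marginals: the strip [0, a] x [0, 1] has mass a.
assert isclose(copula_mass(perm, 0.5, 1.0), 0.5)
# The brute-force and product forms of the normalizer agree for small n.
assert isclose(mallows_normalizer(4, 0.5), mallows_normalizer_product(4, 0.5))
```

The brute-force normalizer is only feasible for very small n, which is one reason limiting results for the log normalizing constant are of interest.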


Kobi Abayomi (Seton Hall University) 
Title: What is Data
Abstract: An intentional singularization to illustrate some examples from business use cases where pseudo-experimental is as good as it gets.