News & Announcements

The Editors and Editorial Board of Epidemiology are pleased to announce the selection of José Zubizarreta as the winner of this year’s Rothman Epidemiology Prize.…

Professor Victoria Stodden of Columbia University, who led a roundtable on the topic of reproducibility in 2009 and an ICERM workshop on Reproducibility in Computational and Experimental Mathematics in 2012, gave the keynote address at the XSEDE14 workshop. She raised the issue of a credibility crisis…

“Reproducibility has hit the …

Congratulations to Department of Statistics Faculty member Professor Tian Zheng on being honored as an ASA Fellow “For creating novel statistical methodology in statistical genetics, bioinformatics and computational biology, and social network theory, especially as related to the measuring homophily and to surveying hard to reach populations, and for being …

- Junyi Zhang
  Date: October 1, 2014
  Starts: 12:00 pm
  Ends: 1:00 pm
  Location: 903 SSW
  Description: Student Seminar
- Eitan Bachmat (Department of Computer Science, Ben Gurion University) "Airplane boarding, express lines, modular symmetries, lenses, soap bubbles and more."
  Date: October 2, 2014
  Starts: 1:10 pm
  Ends: 2:10 pm
  Location: 303 Mudd
  Description: Applied Probability and Risk Seminar

Title: Airplane boarding, express lines, modular symmetries, lenses, soap bubbles and more.

Abstract: Airplane boarding is an operations research problem with high customer visibility. OR departments and management in airlines constantly experiment with boarding policies in an attempt to optimize the procedure. Recently, a couple of airlines experimented with a boarding policy that gives priority boarding to passengers with no carry-on luggage to be placed in the overhead bins. We will try to analyze this policy. The policy is somewhat reminiscent of express line queues in supermarkets, which also attempt to separate "fast" customers from "slow" ones. We note, however, that the analysis of express lines shares nothing with the analysis of airplane boarding, which is based on Lorentzian (space-time) geometry. In this superficial analogy, the service time distribution at the supermarket counter is analogous to the aisle clearing time of a passenger in airplane boarding. We show that the effect of the distribution on express line queues is approximately the square of the (local) effect on airplane boarding. This means that we can transport the large body of work on express line queues to obtain approximate results on airplane boarding. In particular, there is a modular symmetry on distributions which preserves the effect on express line queues and, roughly, on airplane boarding as well. Regarding the question of who should go first, the slow or fast passengers, it turns out not to matter much. The optimal solution is related to the construction of thin lenses centered around soap bubbles in space-time geometry.

- Tim Leung (Columbia University)
  Date: October 2, 2014
  Starts: 4:10 pm
  Ends: 5:25 pm
  Location: 903 SSW
- Yuri Bakhtin (NYU) "Burgers equation with random forcing."
  Date: October 3, 2014
  Starts: 12:00 pm
  Ends: 1:00 pm
  Location: 520 Math
  Description: Probability Seminar

Yuri Bakhtin (NYU)

"Burgers equation with random forcing"

Abstract:

The Burgers equation is one of the basic nonlinear evolutionary PDEs. The study of ergodic properties of the Burgers equation with random forcing began in the 1990s. The natural approach is based on the analysis of optimal paths in the random landscape generated by the random force potential. For a long time, only compact cases of the Burgers dynamics on a circle or bounded interval were understood well. In this talk I will discuss the Burgers dynamics on the entire real line with no compactness or periodicity assumption on the random forcing. The main result is the description of the ergodic components and the existence of a global attracting random solution in each component. The proof is based on ideas from the theory of first or last passage percolation. The kicked forcing case is an extension of the Poissonian forcing case considered in joint work with Eric Cator and Kostya Khanin.

- Ping Li (Rutgers) “BigData: Hashing Algorithms for Large-Scale Search and Learning.”
  Date: October 6, 2014
  Starts: 4:10 pm
  Ends: 5:25 pm
  Location: 903 SSW
  Description: Statistics Seminar

Ping Li (Rutgers)

“BigData: Hashing Algorithms for Large-Scale Search and Learning.”

Abstract: The talk will begin with an interesting story about the Cauchy distribution. Consider two data vectors, u and v, and a vector R of i.i.d. Cauchy variables. Pr{sgn(<u,R>) = sgn(<v,R>)} is essentially a monotonic function of the chi-square similarity (a nonlinear kernel) between u and v. This observation leads to useful big data (LINEAR) algorithms for building large-scale statistical models and searching for near neighbors in terms of the chi-square similarity (kernel). Chi-square similarity is known for its superb performance on data generated from histograms (e.g., in computer vision and NLP).
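As a rough illustration of the collision probability described above, the following Python sketch estimates Pr{sgn(<u,R>) = sgn(<v,R>)} by simulation and prints it next to the chi-square similarity. It is not taken from the talk: the function names, the example vectors, and the convention that u and v are nonnegative and normalized to sum to 1 (so that the chi-square similarity is taken as sum_i 2 u_i v_i / (u_i + v_i)) are illustrative assumptions.

```python
import math
import random


def normalize(x):
    """Scale a nonnegative vector so that its entries sum to 1 (assumed convention)."""
    total = sum(x)
    return [a / total for a in x]


def chi_square_similarity(u, v):
    """Chi-square similarity sum_i 2*u_i*v_i / (u_i + v_i) for normalized histograms."""
    return sum(2.0 * a * b / (a + b) for a, b in zip(u, v) if a + b > 0)


def standard_cauchy(rng):
    """Draw one standard Cauchy variate by inverting the Cauchy CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def sign_collision_rate(u, v, num_projections=5000, seed=0):
    """Monte Carlo estimate of Pr{sgn(<u,R>) = sgn(<v,R>)} with R i.i.d. Cauchy."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_projections):
        r = [standard_cauchy(rng) for _ in u]
        proj_u = sum(a * x for a, x in zip(u, r))
        proj_v = sum(b * x for b, x in zip(v, r))
        # Count a collision when both projections fall on the same side of zero.
        if (proj_u >= 0) == (proj_v >= 0):
            hits += 1
    return hits / num_projections


if __name__ == "__main__":
    u = normalize([4, 3, 2, 1])
    for name, v in [("near", [4, 3, 1, 2]), ("far", [1, 2, 3, 4])]:
        v = normalize(v)
        print(name, chi_square_similarity(u, v), sign_collision_rate(u, v))
```

In this toy setup the pair with the larger chi-square similarity should also show the higher sign collision rate, which is the monotone relationship the abstract refers to. Only the sign of each projection is kept, so each projection costs one bit of storage per vector.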

Modern applications of internet search and machine learning routinely encounter datasets with (hundreds of) billions of examples in a billion or even a billion squared dimensions (e.g., documents represented by high-order n-grams). Developing novel algorithms for efficient search and machine learning has become an active area of research. Hashing can be very useful in many scenarios, for example:

(1) Some devices have only limited computing/storage/power resources.

(2) To achieve higher accuracy, we may want to explicitly consider pairwise or 3-way interactions (or high-order n-grams) in linear models.

(3) We hope to reduce the complexity of learning models (e.g., deep nets) by hashing the inputs or hashing the outputs.

(4) Hashing is an effective way of indexing (and space partitioning), which allows efficient sub-linear time near neighbor search.

(5) Perhaps surprisingly, our newest research shows that, if designed carefully, hashing (which naturally leads to linear algorithms) can also model nonlinear effects (e.g., nonlinear kernels). Examples of such kernels include resemblance, chi-square, and CoRE kernels.

This talk will cover a variety of hashing algorithms, including sign Cauchy projections, b-bit minwise hashing, one permutation hashing, and densified one permutation hashing.
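To make one of the listed techniques concrete, here is a minimal Python sketch of minwise hashing and a b-bit variant for estimating the resemblance (Jaccard similarity) of two sets. It is an illustration under stated assumptions, not the speaker's implementation: the helper names are hypothetical, salted BLAKE2b digests stand in for random permutations, and the b-bit correction assumes that two different minima agree on their lowest b bits with probability about 2^-b.

```python
import hashlib


def hash64(k, item):
    """64-bit hash of `item` under the k-th hash function (a salted BLAKE2b digest)."""
    data = f"{k}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes):
    """For each hash function, keep the minimum hash value over the set."""
    return [min(hash64(k, x) for x in items) for k in range(num_hashes)]


def resemblance_full(sig_a, sig_b):
    """Fraction of positions where the full minima match; estimates resemblance."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def resemblance_b_bit(sig_a, sig_b, b):
    """Estimate resemblance from only the lowest b bits of each minimum.

    Assumption: non-matching minima collide on b bits with probability 2**-b,
    so the observed match rate is about R + (1 - R) * 2**-b.
    """
    mask = (1 << b) - 1
    matches = sum((x & mask) == (y & mask) for x, y in zip(sig_a, sig_b))
    p_hat = matches / len(sig_a)
    chance = 1.0 / (1 << b)
    return (p_hat - chance) / (1.0 - chance)


if __name__ == "__main__":
    a = set(range(0, 200))
    b = set(range(100, 300))  # true resemblance is 100 / 300
    sig_a = minhash_signature(a, 500)
    sig_b = minhash_signature(b, 500)
    print(resemblance_full(sig_a, sig_b), resemblance_b_bit(sig_a, sig_b, 2))
```

Keeping only b bits per hash shrinks the stored signature at the cost of a noisier estimate, so more hash functions are typically needed for the same accuracy.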

Bio-Sketch: Ping Li is an Associate Professor at Rutgers University, in the Department of Statistics and the Department of Computer Science. He graduated from Stanford University with a Ph.D. in Statistics (and Master’s degrees in both CS and EE). Ping Li’s research interests include probabilistic hashing algorithms for big data, machine learning, information retrieval, boosting, data streams, and compressed sensing. His research has been funded by the Department of Defense (DoD), Microsoft, Google, and the National Science Foundation (NSF). In particular, he was one of the PIs of the recent NSF-Bigdata program. Ping Li received the Young Investigator Award (YIP) from the Air Force Office of Scientific Research (AFOSR) and the YIP from the Office of Naval Research (ONR). He also won a prize in the 2010 Yahoo! Learning to Rank Grand Challenge using “robust logitboost” (Li, UAI 2010).

Columbia University

In the City of New York

DEPARTMENT OF STATISTICS

Columbia University

Room 1005 SSW, MC 4690

1255 Amsterdam Avenue

New York, NY 10027

Phone: 212.851.2132

Fax: 212.851.2164
