Adji Bousso Dieng

I am a Ph.D. student in the Department of Statistics at Columbia University, where I am jointly advised by David Blei and John Paisley. My work at Columbia combines probabilistic graphical modeling and deep learning to design better sequence models. I develop these models within the framework of variational inference, which enables efficient and scalable learning. My hope is that my research can be applied to many real-world problems, particularly in natural language understanding.

Prior to joining Columbia, I worked as a Junior Professional Associate at the World Bank. I did my undergraduate training in France, where I attended Lycée Henri IV and Télécom ParisTech, part of France's Grandes Écoles system. I hold a Diplôme d'Ingénieur from Télécom ParisTech and spent the third year of its curriculum at Cornell University, where I earned a Master's in Statistics.



LinkedIn          GitHub          Curriculum Vitae          Google Scholar           Twitter

News

Feb 2018: I will be part of the Women Techmakers 2018 Summit panel at Google, New York.

Feb 2018: I will be giving a spotlight talk at the NYAS ML Symposium.

Oct 2017: I thank the NIPS Foundation and the NIPS organizers for the travel grant!

Sep 2017: Our paper "Variational Inference via Chi-Upper Bound Minimization" has been accepted at NIPS.




Publications



TopicRNN: A Recurrent Neural Network With Long-Range Semantic Dependency

Adji B. Dieng, Chong Wang, Jianfeng Gao, and John Paisley

International Conference on Learning Representations, 2017

PDF         POSTER


Neural network-based language models have achieved state-of-the-art results on many NLP tasks. One difficult problem, as motivated in the introduction of this paper, is capturing long-range semantic dependencies. We propose to solve this by integrating latent topics as context and jointly training these contextual features with the parameters of an RNN language model. We provide a natural way of doing this integration by modeling stop words, which are excluded by topic models but needed by sequential language models. This is done via binary classification, where the probability of a word being a stop word is dictated by the hidden layer of the RNN. Under this approach, the contextual features provided by the topics are passed directly to the softmax output layer of the RNN as an additional bias. We report results comparable to the state of the art on the Penn Treebank and IMDB datasets.
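To make the modeling approach concrete, below is a minimal PyTorch sketch of a TopicRNN-style output layer. The class and layer names are hypothetical, the GRU cell is one possible choice, and the stop-word probability is used as a soft gate rather than sampling the latent binary indicator; this is an illustration of the idea, not the paper's implementation.

# Minimal sketch of a TopicRNN-style output layer (PyTorch).
# Hypothetical names; the soft gate stands in for the paper's
# latent binary stop-word indicator.
import torch
import torch.nn as nn

class TopicRNNHead(nn.Module):
    def __init__(self, vocab_size, embed_size, hidden_size, num_topics):
        super().__init__()
        self.rnn = nn.GRU(embed_size, hidden_size, batch_first=True)
        self.word_logits = nn.Linear(hidden_size, vocab_size)            # standard RNN softmax weights
        self.topic_bias = nn.Linear(num_topics, vocab_size, bias=False)  # topic-to-vocabulary bias
        self.stop_gate = nn.Linear(hidden_size, 1)                       # stop-word classifier on the hidden state

    def forward(self, embedded, theta):
        # embedded: (batch, seq_len, embed_size); theta: (batch, num_topics) topic proportions
        h, _ = self.rnn(embedded)
        p_stop = torch.sigmoid(self.stop_gate(h))  # P(stop word | h_t)
        # Topic features enter the softmax as an additional bias,
        # gated off where the model predicts a stop word.
        logits = self.word_logits(h) + (1.0 - p_stop) * self.topic_bias(theta).unsqueeze(1)
        return logits, p_stop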


Variational Inference via Chi-Upper Bound Minimization

Adji B. Dieng, Dustin Tran, Rajesh Ranganath, John Paisley, and David M. Blei

Neural Information Processing Systems, 2017 (To Appear)

PDF          SLIDES           POSTER


Variational inference with the traditional KL(q || p) divergence can run into pathologies; for example, it typically underestimates posterior uncertainty. We propose CHIVI, a complementary algorithm to traditional variational inference. CHIVI is a black-box algorithm that minimizes the χ-divergence from the posterior to the family of approximating distributions and provides an upper bound on the model evidence. CHIVI performs well on a range of probabilistic models. On Bayesian probit regression and Gaussian process classification, it yielded better classification error rates than expectation propagation (EP) and classical variational inference (VI). When modeling basketball data with a Cox process, it gave better estimates of posterior uncertainty. Finally, the CHIVI upper bound (CUBO) can be used alongside the classical VI lower bound (ELBO) to sandwich-estimate the model evidence.
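To illustrate the sandwich estimate, here is a small Python sketch that computes Monte Carlo estimates of the ELBO and of CUBO_2 = (1/2) log E_q[(p(x,z)/q(z))^2] on a toy conjugate Gaussian model; the toy model and the fixed variational parameters are assumptions for illustration, not the paper's experiments.

# Hedged sketch: Monte Carlo estimates of CUBO_2 (upper bound) and the
# ELBO (lower bound) that sandwich the log model evidence log p(x).
# The Gaussian toy model below is an illustrative assumption only.
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)

# Toy model: p(z) = N(0, 1), p(x | z) = N(z, 1), one observation x.
x = 1.5

def log_joint(z):
    return -0.5 * z**2 - 0.5 * (x - z)**2 - np.log(2 * np.pi)

# Variational family: q(z) = N(mu, sigma^2), with fixed toy parameters.
mu, sigma = 0.7, 0.8

def log_q(z):
    return -0.5 * ((z - mu) / sigma)**2 - np.log(sigma * np.sqrt(2 * np.pi))

z = rng.normal(mu, sigma, size=10_000)   # samples z ~ q
log_w = log_joint(z) - log_q(z)          # log importance weights

elbo = log_w.mean()                                        # E_q[log w]
cubo2 = 0.5 * (logsumexp(2.0 * log_w) - np.log(z.size))    # 0.5 * log E_q[w^2]

# Exact evidence for this conjugate toy model: x ~ N(0, 2).
log_evidence = -0.5 * x**2 / 2 - 0.5 * np.log(2 * np.pi * 2)
print(f"ELBO {elbo:.3f} <= log p(x) {log_evidence:.3f} <= CUBO_2 {cubo2:.3f}")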

Talks


Tufts University CS Department, Medford, MA, April 2018
Harvard University NLP Group Meeting, Cambridge, MA, April 2018
Stanford University NLP Seminar, Stanford, CA, April 2018
New York Academy of Sciences ML Symposium, NY, March 2018
Machine Learning and Friends Seminar, UMass, Amherst, MA, February 2018
Black in AI Workshop, Long Beach, CA, December 2017
MSR AI, Microsoft Research, Redmond, WA, August 2017
SSLI Lab, University of Washington, Seattle, WA, August 2017
DeepLoria, Loria Laboratory, Nancy, France, April 2017
AI With The Best, Online, April 2017
OpenAI, San Francisco, CA, January 2017
IBM TJ Watson Research, Yorktown Heights, NY, December 2016
Microsoft Research, Redmond, WA, August 2016