Friday, March 24, 2017 - 10:30am
Location: 2315 Doherty Hall
Speaker: Jason Eisner, Johns Hopkins University
Event Website: https://www.lti.cs.cmu.edu/lti-colloquium-11700

Tractable Deep Models of Sequential Structure
ABSTRACT
Recurrent neural networks such as LSTMs have become an indispensable tool for building probabilistic sequence models. After discussing the statistical motivations, I'll present some not-so-obvious ways that expressive LSTMs can be harnessed to help model sequential data:
1. To score chunks of candidate latent structures in their fully observed context. The chunks can be assembled by dynamic programming, which preserves tractable marginal inference; a toy sketch of this appears after the list. (Applications: string transduction, parsing, ...)
2. To predict sequences of events in real time. This resembles neural language modeling, but the real-time setting means that each event is predicted jointly with the entire preceding interval of non-events; see the second sketch below. (Applications: social media, patient histories, consumer actions, ...)
3. To classify latent syntactic properties of a language from its observed surface ordering. This essentially converts a hard, misspecified unsupervised learning problem into a simpler supervised one. To deal with the shortage of supervised languages to train on, we manufacture new synthetic languages; see the third sketch below. (Applications: grammar induction, etc.)
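To make the first idea concrete, here is a minimal Python sketch (an illustration, not the speaker's implementation) of chunk scoring combined with dynamic programming: summing over every segmentation of a string into chunks takes only O(n * max_len) score lookups, so the marginal stays tractable even if each chunk's score comes from an LSTM. The score_chunk function below is a hypothetical stand-in for that LSTM.

    import math

    def score_chunk(seq, i, j):
        # Hypothetical stand-in: in practice an LSTM would read the whole
        # sequence and return a log-potential for the chunk seq[i:j] in
        # its fully observed context.
        return -0.5 * (j - i)  # toy score that mildly prefers short chunks

    def logaddexp(a, b):
        # Numerically stable log(exp(a) + exp(b)).
        if a == float("-inf"): return b
        if b == float("-inf"): return a
        m = max(a, b)
        return m + math.log(math.exp(a - m) + math.exp(b - m))

    def log_partition(seq, max_len=4):
        # alpha[j] = log of the total weight of all segmentations of
        # seq[:j] into chunks of length <= max_len (a standard DP).
        n = len(seq)
        alpha = [float("-inf")] * (n + 1)
        alpha[0] = 0.0  # the empty prefix has exactly one segmentation
        for j in range(1, n + 1):
            for i in range(max(0, j - max_len), j):
                alpha[j] = logaddexp(alpha[j],
                                     alpha[i] + score_chunk(seq, i, j))
        return alpha[n]

    print(log_partition("transduce"))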
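The second idea has the likelihood structure of a temporal point process: each event contributes the log intensity at its time, and each preceding interval of non-events contributes the negative integral of the intensity over that interval. The sketch below computes this for a classical Hawkes intensity with a closed form; this is only an illustrative simplification, since the talk's neural model would instead drive the intensity from an LSTM state.

    import math

    def log_likelihood(times, mu=0.2, alpha=0.8, beta=1.0):
        # Classical Hawkes intensity:
        #   lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
        # Event term: log lambda at each observed event time.
        ll = 0.0
        for k, t in enumerate(times):
            lam = mu + sum(alpha * math.exp(-beta * (t - s))
                           for s in times[:k])
            ll += math.log(lam)
        # Non-event term: minus the integral of lambda over [0, T], which
        # is what "predicting each event jointly with the preceding
        # interval of non-events" contributes to the likelihood.
        T = times[-1]
        compensator = mu * T + sum(
            (alpha / beta) * (1.0 - math.exp(-beta * (T - s)))
            for s in times)
        return ll - compensator

    print(log_likelihood([0.5, 1.1, 1.4, 3.0]))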
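For the third idea, here is a minimal illustration of the recipe (again, not the speaker's system): manufacture synthetic languages whose latent property is known, featurize each language's observed surface order, and train an ordinary supervised classifier. The toy generator, the single verb-object-order statistic, and the nearest-class-mean classifier are all hypothetical simplifications.

    import random

    def synth_language(verb_first, n=300):
        # Emit n toy clauses; 1 means the verb preceded its object.
        p = 0.85 if verb_first else 0.15  # dominant order plus noise
        return [1 if random.random() < p else 0 for _ in range(n)]

    def featurize(clauses):
        # Surface statistic: fraction of clauses with verb before object.
        return sum(clauses) / len(clauses)

    # Training set: many synthetic languages with known latent labels.
    train = [(featurize(synth_language(lbl)), lbl)
             for _ in range(50) for lbl in (True, False)]

    # Nearest-class-mean classifier, the simplest supervised stand-in.
    mean_vo = (sum(x for x, y in train if y)
               / sum(1 for _, y in train if y))
    mean_ov = (sum(x for x, y in train if not y)
               / sum(1 for _, y in train if not y))

    def predict(x):
        return abs(x - mean_vo) < abs(x - mean_ov)  # True = verb-object

    test_lang = synth_language(True)  # a "new" language to classify
    print("predicted verb-object order:", predict(featurize(test_lang)))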
BIO
Jason Eisner is Professor of Computer Science at Johns Hopkins University, where he is also affiliated with the Center for Language and Speech Processing, the Machine Learning Group, the Cognitive Science Department, and the national Center of Excellence in Human Language Technology. His goal is to develop the probabilistic modeling, inference, and learning techniques needed for a unified model of all kinds of linguistic structure. His 100+ papers have presented various algorithms for parsing, machine translation, and weighted finite-state machines; formalizations, algorithms, theorems, and empirical results in computational phonology; and unsupervised or semi-supervised learning methods for syntax, morphology, and word-sense disambiguation. He is also the lead designer of Dyna, a new declarative programming language that provides an infrastructure for AI research. He has received two school-wide awards for excellence in teaching.