Carnegie Mellon University

Multi-Field Hierarchical Discovery and Tracking

Information Retrieval, Text Mining and Analytics

By Yiming Yang

Modeling information dynamics at different levels of granularity is an open challenge. We are developing new Bayesian von Mises-Fisher topical clustering techniques, including hierarchical and dynamic models that outperform existing methods and scale to large data. Our approach consists of multi-field graphical models for correlated latent topics, semi-supervised topology learning, metric learning, transfer learning, and temporal trend modeling. We evaluate on large collections of scientific literature and news stories.
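
As a rough illustration of the directional-clustering idea behind von Mises-Fisher models, the sketch below implements spherical k-means, which can be viewed as the hard-assignment limit of a von Mises-Fisher mixture with a shared concentration parameter. It is not the project's Bayesian, hierarchical, or dynamic model. The function names, the farthest-first initialization, and the toy term-count matrix are all assumptions made for this example.

```python
import numpy as np


def normalize_rows(x):
    """Scale each row to unit L2 norm (rows with near-zero norm are left near zero)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def spherical_kmeans(x, k, n_iter=20):
    """Cluster rows of x by cosine similarity to unit-length mean directions.

    Illustrative stand-in only: hard assignments, no concentration
    parameter, no priors, no hierarchy.
    """
    x = normalize_rows(np.asarray(x, dtype=float))

    # Deterministic farthest-first initialization (an assumption of this sketch):
    # start from the first row, then repeatedly add the row that is least
    # similar to the centers chosen so far.
    centers = [x[0]]
    for _ in range(1, k):
        sims = np.max(x @ np.array(centers).T, axis=1)
        centers.append(x[np.argmin(sims)])
    mu = np.array(centers)

    for _ in range(n_iter):
        # Assign each document to the mean direction with the highest cosine similarity.
        labels = np.argmax(x @ mu.T, axis=1)
        # Re-estimate each mean direction as the normalized sum of its members.
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:
                mu[j] = members.sum(axis=0)
        mu = normalize_rows(mu)

    labels = np.argmax(x @ mu.T, axis=1)
    return labels, mu


# Toy term-count matrix (hypothetical): rows are documents, columns are terms.
docs = np.array([
    [5, 1, 0, 0],
    [4, 2, 0, 0],
    [6, 1, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 5, 5],
    [0, 1, 4, 6],
])
labels, centers = spherical_kmeans(docs, k=2)
print(labels)
```

Normalizing documents to unit length makes the clustering depend on the direction of each term vector rather than its length, which is the property that motivates directional distributions for text.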