Wednesday, June 24, 2020 - 8:00am to 10:00am


to take place via Zoom


Zhuyun Dai

Event Website:

For More Information, Contact:

Corey Bisbal,


Jamie Callan (Chair), Carnegie Mellon University 

Graham Neubig, Carnegie Mellon University 

Tie-Yan Liu, Carnegie Mellon University & Microsoft Research 

Yiqun Liu, Tsinghua University, Beijing


For 50-60 years, information retrieval (IR) systems have relied on bag-of-words approaches. Although bag-of-words retrieval has several long-standing limitations, attempts to address them have been mostly unsuccessful. Recently, neural networks have provided a new paradigm for modeling natural language. This dissertation aims to combine insights from IR with the key advantages of neural networks to bring deeper language understanding into IR.

The first part of this dissertation focuses on how queries and documents are matched. Previous state-of-the-art rankers relied on exact lexical match, which causes the well-known vocabulary mismatch problem. This dissertation develops neural models that bring soft match into relevance ranking. Using distributed text representations, our models can soft match every query word to every document word. As the soft match signals are noisy, this dissertation presents a novel kernel-pooling technique that groups soft matches based on their contribution to relevance. This dissertation also studies whether pre-trained model parameters can improve ranking in low-resource domains, and whether the model architectures are re-usable in a non-text retrieval task. Our approaches outperform previous state-of-the-art ranking systems by large margins.
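
The soft-match and kernel-pooling idea described above can be illustrated with a small sketch. This is not the dissertation's implementation: the abstract does not specify the kernel form, so the sketch assumes Gaussian (RBF) kernels over cosine similarities, and the function names, the kernel means `mus`, the width `sigma`, and the toy two-dimensional embeddings are all hypothetical choices made for illustration. Each kernel counts, softly, how many document words match a query word at roughly a given similarity level, so that matches of different strength end up in different features.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def kernel_pooling(query_vecs, doc_vecs, mus, sigma=0.1):
    """Return one feature per kernel (illustrative sketch, assumed RBF kernels).

    For every query word, each kernel sums a Gaussian of the similarity
    between that query word and every document word, giving a "soft count"
    of matches near the kernel mean. The log of each soft count is then
    summed over query words.
    """
    features = [0.0] * len(mus)
    for q in query_vecs:
        # Soft match: compare this query word with every document word.
        sims = [cosine(q, d) for d in doc_vecs]
        for k, mu in enumerate(mus):
            soft_count = sum(
                math.exp(-((s - mu) ** 2) / (2 * sigma ** 2)) for s in sims
            )
            # Clamp before the log so an empty bin does not produce -inf.
            features[k] += math.log(max(soft_count, 1e-10))
    return features


# Hypothetical toy embeddings: one query word, two document words.
query_vecs = [[1.0, 0.0]]
doc_vecs = [[1.0, 0.0], [0.0, 1.0]]
mus = [1.0, 0.5, 0.0]  # assumed kernel means: exact, partial, and no match
features = kernel_pooling(query_vecs, doc_vecs, mus)
```

In a full ranker, a learned layer would typically combine these per-kernel features into a relevance score; that layer is omitted here because the abstract gives no details about it.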

The second part of this dissertation focuses on how queries and documents are represented. A typical search engine uses frequency statistics to weight words, but frequent words are not necessarily essential to the meaning of the text. This dissertation develops neural networks to estimate word importance based on how a word interacts with its linguistic context. A weak-supervision approach is developed that allows training our models without any human annotations. Our models can run offline, significantly improving first-stage retrieval without hurting efficiency.

To summarize, this dissertation formulates a new neural retrieval paradigm that overcomes classic retrieval models' limitations in matching and importance weighting. It points out several promising paths in neural relevance ranking, deep retrieval models, and deep document understanding for IR. 

For a copy of the defense thesis please go to the following link:    


LTI PhD Thesis Defense