Thursday, May 7, 2020 - 1:00pm to 3:00pm




Khyathi Chandu

For More Information, Contact:

Stacey Young


Alan W Black (Co-chair)
Eric Nyberg (Co-chair)
Abhinav Gupta
Devi Parikh (Facebook AI Research, Georgia Tech)


Anthropomorphic narrative generation in natural language, in the form of stories, procedures, etc., has been a long-standing dream of artificial intelligence. Working towards this goal brings forth the need to adhere to the innate human characteristics of narratives. These include content (relevance), structure (coherence), and surface form realization (expression). How these narrative properties are anchored is governed not only by task-specific requirements but also by the availability of annotated data. Moreover, the steep acceleration in new content produced every day both increases the need for extensive annotations and makes them harder to obtain. This calls for a spectrum of anchoring between supervised and unsupervised, ranging from token-level to sparse narrative-level annotations. The main contribution of this thesis is a novel two-dimensional taxonomy of anchoring these three properties by locally and globally conditioned training objectives. This framework taps into techniques for anchoring to improve narrative intelligence.

To begin with, I investigate content relevance by (i) hierarchically attending to a central mind map of entities and their references while generating visual stories, and (ii) anchoring question words in query-oriented summarization. Following this, I demonstrate gains from the structural layout by (i) scaffolding structure representation in the generation of cooking recipes from images, and (ii) reordering content at the narrative level. Finally, I present surface form realization in generation by (i) multitasking with lexical-level language information, and (ii) adversarial training for realizing the form of each word. In the aforementioned work, (i) indicates local and (ii) indicates global anchoring.

In addition to leveraging external anchoring, I propose to work on narrative infilling to investigate denoising techniques leading to a tradeoff between coherence and novelty of content in visual stories and recipes. I also plan to improve persona-based visual storytelling by learning to transfer disentangled persona at the sentence level from genre-based corpora.

For a copy of the thesis proposal, please go to the following link.


LTI PhD Thesis Proposal