Friday, December 13, 2019 - 10:00am to 12:00pm


Michael Miller Yoder

Computational Models of Identity Presentation in Language


Carolyn P. Rosé (Chair)
Yulia Tsvetkov
Geoff Kaufman
David Jurgens (School of Information, University of Michigan)


Language use varies across demographic, social, and cultural distinctions among speakers. Consequently, many researchers in computational sociolinguistics have built models of how language use reflects latent, stable identities of language users. However, researchers in sociolinguistics, linguistic anthropology, and discourse analysis posit that language also constitutes these very identities. Identity, which we define as the positioning of self and others in interaction, is in part reproduced, challenged, and performed in language. This thesis attempts to apply this perspective in computational tools that investigate not only how the identity of language users affects language, but also how language positions the identity of the speaker and others. We explore identity positioning in language along several dimensions, from the effects of explicit self-labeling on social media to the implicit framing of gender and sexuality in narrative. We investigate each of these dimensions using machine learning and methods drawn from computational linguistics, applied to large datasets of linguistic and social interaction.

