Princeton University Library Catalog

Constructing and Deconstructing an Empowerment-Offensiveness Classifier: Intersections of Big Data, Polarization, and Bias on Twitter During the 2016 Presidential Election

Kirgios, Erika
Senior thesis
LaPaugh, Andrea S.
Princeton University. Department of Computer Science
Class year:
Summary note:
Twitter is becoming an increasingly common forum for discussion of politics and current events. Such conversation can often be contentious and polarizing. Using Twitter data from the week of the 2016 presidential election, this paper aims to improve upon previous classifiers trained to detect hate speech by using neural network models and psycholinguistic content. Furthermore, this paper adds complexity to the current literature on offensiveness detection by classifying tweets as empowering or neutral as well as offensive, while maintaining high accuracy. Through a sociolinguistic and technosocial lens, we discuss the process of algorithmic construction with online text corpora, merging technical and sociocultural literature to address ethical concerns and sources of algorithmic bias. We also offer a novel tool to detect spam tweets from tweet metadata and content alone rather than relying on user history, achieving an accuracy of 89.9% and a low false positive rate of 0.3%.