Visualizing Expression: A Visual Analysis of Literary Works and Nonliteral Copying in the Context of Copyright Infringement

Jullian, Marianne
Senior thesis
Kernighan, Brian
Princeton University. Department of Computer Science
Class year:
66 pages
Summary note:
In the area of copyright law that deals with fictional works, nonliteral copying has been contentious. The debate centers on how to protect the public domain against monopolies on the ideas that serve as fodder for creative writing, while still giving authors' expressions of those ideas enough protection to incentivize future work. Several judges have developed tests that can be applied to fictional works; however, these tests are rather abstract and rely on the discretion of those involved in individual court cases. With this in mind, I set out to develop an automated method for identifying unique expressions of ideas in literary works. Drawing on discussions of nonliteral copying in the context of copyright infringement, expressions are hereafter defined as patterns composed of the following literary components: writing style, character personalities, plot themes, plot developments, and relationships between characters. The method I propose as a tool for detecting nonliteral copying is a data visualization. It relies on computational linguistics and on the power of data visualization to uncover otherwise obscured patterns of expression through the use of color, layers, and small multiples. The efficacy of the linguistic analysis and data visualization is judged by its ability to accurately identify important characters, concepts, and plot developments in works considered in isolation. Additionally, the efficacy of the data visualization as a tool for identifying nonliteral copying is analyzed using works written by the same author, followed by a discussion of its application to works by distinct authors.