Thank you for your efforts! We've completed our project!

Click here for more details. To browse our other active projects that still need your classifications, check out our Citizen Readers page.

Research

The goal of this project is to generate knowledge about the behaviour of literary characters at large scale and to make this data openly available to the public. Characters are the scaffolding of great storytelling. This Zooniverse project will allow us to crowdsource the data needed to train AI models that better understand who characters are and what they do within diverse narrative worlds. Together, this will help answer one very big question: why do human beings tell stories?

In the nineteenth-century heyday of the novel, over 1.5 million literary characters were invented in English alone. Today, with the continued growth of literary markets around the world and the explosion of creative writing on the internet through fan communities, that number is orders of magnitude higher.

How on earth can we possibly understand all of this creativity?

This is where you, the reader, come in. We need your help to build better, more transparent AI models to understand human storytelling. To be clear: our goal is not to build AI to generate stories or create smarter chatbots. Our aim is fundamentally academic: we want to develop models that help us understand stories and thus learn more about this essential human activity. Most AI development is happening inside black boxes behind closed doors. Our models will be open to the public, as will all of the annotations made by readers like you. You are a key participant in how we will understand the future of stories.

How does it work?

To understand the "life" of a literary character, we need to know who they are and what they do.

Our first workflow, "Annotating Character Interactions," focuses on identifying the types of social interactions in which characters engage. We know from the real world that social networks tell us a great deal about human behaviour. So what can fictional networks tell us about storytelling?

In order to train AI models to detect and understand character interactions, we have to be able to predict the kinds of actions characters engage in with one another. Sometimes this is very straightforward; in other cases it is highly ambiguous. For example, consider this passage from the short story "Silence," by the Canadian Nobel Prize winner Alice Munro:

Here we see how an unnamed character ("a woman") recognizes Juliet ("her"), and how the two characters then begin to talk to each other as a result of this recognition. There are thus two kinds of interactions taking place between these characters: perceptual ("recognizing") and communicative ("talking").

In this task, you will annotate passages similar to this one for a range of interaction types (observing, communicating, associating, touching, etc.).
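
For readers curious what such an annotation might look like as data, here is a minimal sketch in Python. It is not the project's actual schema: the class name, field names, and helper function are hypothetical, the label set contains only the four types named above (the full list is not given here), and mapping "recognizing" to the "observing" label is our own assumption for illustration.

```python
from dataclasses import dataclass

# Assumption: only the four interaction types named in the task description.
# The project's real label set may be larger or named differently.
INTERACTION_TYPES = {"observing", "communicating", "associating", "touching"}


@dataclass(frozen=True)
class InteractionAnnotation:
    """One labelled interaction between two characters (hypothetical schema)."""
    character_a: str
    character_b: str
    interaction_type: str

    def __post_init__(self):
        # Reject labels outside the assumed label set.
        if self.interaction_type not in INTERACTION_TYPES:
            raise ValueError(f"unknown interaction type: {self.interaction_type!r}")


def interaction_types_between(annotations, name_1, name_2):
    """Return the sorted interaction types recorded for a pair, in either order."""
    pair = {name_1, name_2}
    return sorted(
        {a.interaction_type for a in annotations
         if {a.character_a, a.character_b} == pair}
    )


# The example discussed above: a woman recognizes Juliet, and then they talk.
# Treating "recognizing" as "observing" is an illustrative assumption.
annotations = [
    InteractionAnnotation("a woman", "Juliet", "observing"),
    InteractionAnnotation("a woman", "Juliet", "communicating"),
]

print(interaction_types_between(annotations, "Juliet", "a woman"))
```

Treating the pair as unordered is a deliberate simplification here, since an interaction such as talking involves both characters; a real scheme might instead record who initiates each interaction.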

For this task, all of our data is drawn from a collection of contemporary fiction spanning numerous genres, from bestsellers to prizewinners to mysteries to romances to sci-fi. You can read more about the underlying data we are using here. You'll only ever see small snippets of text, so there is no risk of recreating the original works in any meaningful sense. Whenever we use new data sources -- like fan fiction or classics or whatever you might suggest! -- we will make that explicit in the challenge and accompanying tutorial.

For our second workflow, "Annotating Character Identities," we will use plot summaries from Wikipedia and ask you to help identify different identity markers around characters. How old are they? What is their education level or gender or nationality? What role do they play? From these characteristics we should be able to better understand how writers from different time periods or cultures have focused on different types of characters.
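
As a rough illustration of the identity markers listed above, the sketch below models one character's annotation as a simple record in Python. The class, field names, and example values are all hypothetical and invented for illustration; they do not describe the project's real data format or any actual character.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class CharacterIdentity:
    """Identity markers for one character (hypothetical schema).

    Every marker is optional because a plot summary may simply not mention it.
    """
    name: str
    age: Optional[str] = None
    education: Optional[str] = None
    gender: Optional[str] = None
    nationality: Optional[str] = None
    role: Optional[str] = None


def annotated_markers(identity):
    """Return only the markers an annotator actually filled in."""
    return {
        key: value
        for key, value in asdict(identity).items()
        if key != "name" and value is not None
    }


# A made-up character, not taken from any real summary.
example = CharacterIdentity(name="Example Character", age="adult", role="protagonist")

print(annotated_markers(example))
```

Leaving unmentioned markers empty, rather than guessing them, keeps the record faithful to what the summary actually says.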

Why?

Imagine a map of all the different types of interactions across all the different kinds of fictional characters in the world. This would help reveal the nature and purpose of storytelling across an array of different cultural contexts. We are only beginning to imagine all of the ways we can study storytelling using new AI models. We welcome suggestions from the community -- what do you want to know about how stories work?

The data annotated by you along with all of the models we build will be publicly accessible in this repository. This is AI for and by the world of readers.