Bounding Box Bonanza

Finished! Looks like this project is out of data at the moment!

Thank you everyone for making such quick work of this project! We'll post back on our results.

FAQ

How is this task different from other Zooniverse transcription projects?
This project is just asking you to 'box' as many lines of text as you see on a given page of handwritten text. There is no transcription step.

Why am I being asked to do this?
We need you to help us improve the accuracy of the line detection models we are using. It turns out we need boxes to train the machines to draw lines! So we are asking you to draw boxes around each line of text on a page.

Why don't you just run a regular transcription project and have humans provide the transcriptions?
Our aim is to increase the quality of automated line detection so that volunteers can focus on transcribing rather than drawing lines. Our previous research into transcription methods showed that reducing the need for line aggregation increased the quality of results, and we know from the Talk boards that people much prefer transcribing to annotating! This work is the next step in helping us get a better sense of where to focus our efforts as we think about the text transcription tools we provide in the Project Builder.

Will training a machine learning model to predict lines in future projects remove the need for volunteer assistance entirely?
Absolutely not! Machine annotation and transcription alone still have a long way to go. Our main goal is to identify where machine learning can help alleviate some of the effort, particularly in more complex task types, so that volunteers can focus on more interesting efforts!

How will you use the results?
These images have already been transcribed as part of another project, but we still need these data to carry out our research. We will first train a machine learning model on the bounding boxes you provide, so that the model learns to predict both lines and boxes. We can then compare the machine-predicted lines to those from a previous version of the project where volunteers drew lines. With the new, improved model, we may then decide to run a 'correct-a-machine' version of the transcription project, with the eventual goal of comparing the resulting transcription data to gold-standard transcription data. This should allow us to measure whether the new machine model trained with your bounding boxes improves the overall quality of transcription data produced through this method.
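
For anyone curious what comparing machine predictions to volunteer annotations might look like in practice, here is a minimal sketch in Python. It is an illustration only, not the method the research team uses: the `Box` type, the `iou` and `match_rate` functions, and the 0.5 threshold are all assumptions made for this example. It uses intersection-over-union (IoU), a common way to score how well two boxes overlap.

```python
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    """Hypothetical axis-aligned box: top-left corner plus size, in pixels."""
    x: float
    y: float
    width: float
    height: float


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes (0.0 = disjoint, 1.0 = identical)."""
    # Overlap along each axis, clamped at zero when the boxes do not touch.
    overlap_x = max(0.0, min(a.x + a.width, b.x + b.width) - max(a.x, b.x))
    overlap_y = max(0.0, min(a.y + a.height, b.y + b.height) - max(a.y, b.y))
    intersection = overlap_x * overlap_y
    union = a.width * a.height + b.width * b.height - intersection
    return intersection / union if union > 0 else 0.0


def match_rate(predicted: Sequence[Box], reference: Sequence[Box],
               threshold: float = 0.5) -> float:
    """Fraction of reference boxes matched by at least one predicted box.

    A reference box counts as matched when some predicted box overlaps it
    with IoU >= threshold. The threshold is an assumed value, not one
    taken from the project.
    """
    if not reference:
        return 0.0
    matched = sum(
        1 for ref in reference
        if any(iou(pred, ref) >= threshold for pred in predicted)
    )
    return matched / len(reference)


# Example: one volunteer box is matched by the model, one is missed.
volunteer_boxes = [Box(0, 0, 100, 20), Box(0, 40, 100, 20)]
model_boxes = [Box(2, 1, 100, 20)]
rate = match_rate(model_boxes, volunteer_boxes)
```

A score like this only measures how well boxes line up; measuring the quality of the final transcriptions would need a separate comparison against the gold-standard text.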