Finished! Looks like this project is out of data at the moment!
In June 2020 I discussed the next steps with the Royal Netherlands Institute in Rome and the University of Groningen. You can read an update on the results page.
Why does this project require that I transcribe a folio at a time? What about the line-by-line or word-by-word transcription used in annoTATE or Shakespeare’s World?
Our consensus algorithm works best with fully transcribed folios. At this time, there isn’t an effective method for reconciling partially transcribed folios that honors the time and effort of our contributors. The projects that offer line-by-line transcription use a custom web interface.
Why not just use Optical Character Recognition (OCR) software to transcribe these documents?
At the moment, none of the OCR technologies available to us produce useful results from handwritten materials. A number of the typewritten documents in the archive also present difficulties for OCR because of their quality and the presence of handwritten annotations.
The above shows a page from the PAParchive. The document is a typewritten page, has only one handwritten annotation, and is well preserved. The a, c, e, and o (characters that look similar) are clearly legible and easy to discern. Furthermore, few letters have faded, and the spacing between letters is regular. The document is, in short, of comparatively high quality. The next paragraph shows what high-end OCR software makes of this folio. I think it is safe to say that, even in times of machine learning, for once the human beats the computer!
What will you do with the transcribed documents?
We are collecting your transcripts to create a rich data resource for art and social historians. Once transcribed, the documents will be used to further enrich the Linked Open Database that is being built by the Royal Netherlands Institute in Rome. The database will be published on the KNIR website and will be free to use and reuse.
Two advantages:
The Linked Open Database will connect with other art historical databases to enrich their data. This will make your work extra valuable.
The data will then be suitable for use in computational research methods.
Our use of your contributed transcripts is governed by the Zooniverse User Agreement and Privacy Policy.