Research

To find images through textual search, they need metadata. Labels are a basic type of metadata. For standardization and interoperability, it is convenient to draw labels from a controlled vocabulary. We selected 21 concepts from the Getty Art & Architecture Thesaurus (AAT) as our target vocabulary.
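As a rough illustration of why controlled-vocabulary labels make textual search straightforward, the sketch below builds a small label-to-image index and queries it. Everything in it is an assumption made for the example: the three labels are placeholders and are not claimed to be among the 21 selected concepts, and the image identifiers and the `build_index` and `search` helpers are hypothetical and not taken from the project's code.

```python
from collections import defaultdict

# Hypothetical stand-in for the controlled vocabulary (illustrative labels only,
# not the actual 21 target concepts).
VOCABULARY = {"photograph", "map", "drawing"}


def build_index(image_labels):
    """Map each vocabulary label to the set of image ids that carry it."""
    index = defaultdict(set)
    for image_id, labels in image_labels.items():
        for label in labels:
            normalized = label.strip().lower()
            # Reject anything outside the vocabulary so labels stay interoperable.
            if normalized not in VOCABULARY:
                raise ValueError(f"{label!r} is not in the controlled vocabulary")
            index[normalized].add(image_id)
    return index


def search(index, query):
    """Return the sorted ids of images labeled with the queried term."""
    return sorted(index.get(query.strip().lower(), set()))


# Hypothetical labeled images, as they might come out of a labeling campaign.
image_labels = {
    "img_001": ["photograph"],
    "img_002": ["map", "drawing"],
    "img_003": ["Drawing"],
}

index = build_index(image_labels)
print(search(index, "drawing"))  # ['img_002', 'img_003']
```

Because every label must come from the same fixed set, spelling variants and synonyms cannot fragment the index, which is the practical benefit of a controlled vocabulary over free-text tags.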

Once the images from this campaign are labeled, they can be queried through a search engine. However, our goal goes further: we would like to train a computer vision model that learns to generate labels from this vocabulary for all the images in our database. In this way, we could add labels to millions of images automatically.

You can read more about the pilot in this introductory blog post, the second blog post, and the third blog post, where we explain how we trained a classification model. And don't forget to check out the GitHub repository!