Research

Categorizing Computer Vision Training Images for Race and Ethnicity

Hundreds of companies and software platforms (including Google, Twitter, Facebook, and Microsoft) use computer vision software to highlight, tag, and categorize human beings by their faces. However, almost none of the benchmark image sets used to train these algorithms have considered the racial or ethnic make-up of the people pictured in them.

This lack of data is a problem. Why?

If some races or skin tones are over-represented, then people who fit those racial demographics may be preferred by these platforms (i.e., they are shown more, and are more visible). If some groups are under-represented, this can lead to a host of problems including [medical misdiagnosis](https://venturebeat.com/2020/10/21/researchers-find-evidence-of-racial-gender-and-socioeconomic-bias-in-chest-x-ray-classifiers/), [mistaken identities](https://www.nytimes.com/2020/06/24/technology/facial-recognition-arrest.html), and even people effectively disappearing from view on video-conferencing platforms.

Our team from the Technology, Race, and Prejudice (TRAP) lab is made up of multi-disciplinary scholars with expertise in consumer behavior, computer science, and social and cognitive psychology.

This project will determine whether the training data underlying the software that affects YOU and billions of other people every day is indeed biased.
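At its core, an audit like this asks a simple question of a dataset: what share of the images belongs to each demographic group, and does any group's share deviate from a chosen baseline? A minimal sketch of that computation is below. The group labels and the uniform baseline are illustrative assumptions, not the project's actual data or methodology.

```python
from collections import Counter

def representation(labels):
    """Return each group's share of the dataset (fractions summing to 1)."""
    counts = Counter(labels)
    total = len(labels)
    return {group: n / total for group, n in counts.items()}

# Hypothetical per-image demographic annotations; a real audit would draw
# these from benchmark metadata or trained annotators.
labels = ["A", "A", "A", "B", "A", "C", "A", "B"]

shares = representation(labels)

# One simple flagging rule: a group is over-represented relative to a
# uniform baseline when its share exceeds 1 / number_of_groups.
baseline = 1 / len(shares)
flagged = {group: share for group, share in shares.items() if share > baseline}
```

Real audits add considerable nuance (choice of baseline population, annotation reliability, intersectional categories), but the arithmetic of measuring representation starts from exactly this kind of tally.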

We look forward to your support!