The Nature SPAM Filter project is now complete! Thank you for all your help. Stay in touch to learn about the results.
What exactly am I looking for in these titles?
You're looking for mentions of wild animals in news article titles. The goal is to distinguish between titles that genuinely discuss wildlife (like "Bear spotted in national park") versus those that use animal names in different contexts (like "Jaguar launches new electric car model").
What counts as a "wild animal"?
Wild animals are those that live in their natural environment, not domesticated or pets. This includes animals like wolves, tigers, elephants, deer, bears, and wild birds. It does not include pets, livestock, or domesticated animals like cats, dogs, cows, or chickens.
Why do I get so many titles unrelated to nature?
Our dataset contains headlines gathered from news sources around the world, but fewer than 5% of them are about animals. We need your help to find them!
If I'm not sure about whether an animal mentioned is wild or not, what should I do?
If you're unsure whether the animal mentioned is wild or if you're uncertain about the context, please use the "Not Sure" option. It's better to acknowledge uncertainty than to make an incorrect classification.
What if the title mentions a brand name that uses an animal name (like Puma shoes)?
If the title is clearly referring to a brand or company (like Puma sportswear or Jaguar cars), you should select "No" as these aren't references to actual wild animals.
Why is this task important?
This project helps create better tools for monitoring human-wildlife conflict globally. The data you help generate will train AI systems to automatically identify relevant news articles about wildlife, supporting conservation efforts recognized by the Convention on Biological Diversity's 194 member countries.
What if the title uses metaphorical animal references (like "Lion-hearted hero saves child")?
These should be marked as "No" since they're not actually about wild animals, but rather using animal names metaphorically.
How should I handle titles about zoo animals?
Many zoo animals are wild species, even though they're in captivity. Please use the "Yes" option for animals that could be found in the wild, and "No" for farm or domestic animals.
What if a title mentions multiple animals?
If any one of the animals mentioned is a wild animal (not being used as a brand name or metaphor), then select "Yes."
What about extinct animals or fossil discoveries?
For titles about extinct animals (like dinosaurs and the dodo) or fossil discoveries, please use the "No" option.
How accurate do I need to be?
Your accuracy is key to the success of the project, but don't stress about getting it perfect every time. If you're uncertain, simply select the "Not Sure" option. Your input will be combined with reviews from other participants, ensuring that we achieve reliable and trustworthy results together.
Why can't we just correct the answers of ChatGPT?
If we put all those headlines into a Large Language Model (LLM) like ChatGPT and ask it to identify the headlines with animal names and wildlife context, it will likely give us a pretty good collection of related headlines, far smaller than the whole dataset. Fantastic! Now we look through this collection and count how many headlines were correctly identified (so-called True Positives) and how many have nothing to do with animals (False Positives). The share of True Positives among all the headlines the model selected gives us the Precision. Now let's compare some models based on the headlines we annotated: Model 1 achieved 94% Precision while Model 2 only got 87%. Model 1 is clearly better, right? It turns out Model 1 correctly identified 392 relevant headlines and picked up 25 unrelated ones (392 / 417 ≈ 94%), while Model 2 returned 963 True Positives and 144 False Positives (963 / 1107 ≈ 87%). What happened here? Model 1 has a lower share of False Positives, but it also found far fewer of the relevant headlines, and we forgot to look at how many relevant headlines each model missed (False Negatives)! The share of all relevant headlines that a model actually finds, True Positives divided by True Positives plus False Negatives, is the Recall of the model. But to measure that, we first need to know which headlines are actually relevant, and correcting the model's own output cannot tell us that.
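The comparison above can be reproduced with a few lines of Python. This is only an illustrative sketch: the True Positive and False Positive counts come from the answer above, but the total number of relevant headlines (`TOTAL_RELEVANT = 1000`) is a made-up figure, chosen to show how missed headlines change the picture. The function names are also just for illustration. Here "precision" means the share of selected headlines that are truly relevant, and "recall" means the share of relevant headlines that were found.

```python
def precision(tp, fp):
    """Share of the headlines a model selected that are truly relevant."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of all truly relevant headlines that the model found."""
    return tp / (tp + fn)


# (True Positives, False Positives) as quoted in the FAQ answer.
MODELS = {"Model 1": (392, 25), "Model 2": (963, 144)}

# HYPOTHETICAL: the real number of relevant headlines is not given in the FAQ.
# It can only be known once humans have annotated the headlines.
TOTAL_RELEVANT = 1000


def summarize(models, total_relevant):
    """Return {model name: (precision, recall)} for each model."""
    rows = {}
    for name, (tp, fp) in models.items():
        fn = total_relevant - tp  # relevant headlines the model missed
        rows[name] = (precision(tp, fp), recall(tp, fn))
    return rows


if __name__ == "__main__":
    for name, (p, r) in summarize(MODELS, TOTAL_RELEVANT).items():
        print(f"{name}: precision={p:.0%}, recall={r:.0%}")
```

Under that assumed total, Model 1 would come out at 94% precision but only 39% recall, while Model 2 would reach 87% precision and 96% recall. The real recall values depend entirely on the true number of relevant headlines, which is exactly what the human annotations provide.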