We're happy to announce that our preprint is out, describing our tree mapping research and the (now public) dataset that this project uses! You can find it on arXiv here and check out the models for yourself here. Thanks again for volunteering; your efforts are helping us improve our labels and, ultimately, the state of tree and restoration monitoring.
We've completed the classification stage for this project! To browse other active projects that still need your classifications, check out zooniverse.org/projects.
A huge thank you to the more than 3000 volunteers who helped tag our first batch of images in just a couple of months. We're still checking the stats, but we think we have over 1.5 million tags!
What happens next?
On the left is a full image that was split into around 1000 tiles before being uploaded to Zooniverse. The middle image shows the "tag density" - where citizen scientists most often clicked on trees. It is reconstructed by aggregating the results from individual tiles and mapping them back to the original image. The right image shows a zoomed-in region with individual tags in red. You can see that the tags follow the tree locations well and, as expected, there is some variation in where people click, but most clicks fall near the crown centre.
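For readers curious how the middle panel could be produced, here is a minimal sketch of the tile-to-image aggregation, assuming the tiles were cut on a regular grid with a known tile size and that each tag records its tile position and within-tile click coordinates. The tile size, image size, and function names are illustrative, not the project's actual code.

```python
import numpy as np

# Hypothetical inputs: each tag is (tile_col, tile_row, x_in_tile, y_in_tile),
# with tiles cut from the full image on a regular grid of TILE_W x TILE_H pixels.
TILE_W, TILE_H = 256, 256        # assumed tile size
FULL_W, FULL_H = 8192, 8192      # assumed full-image size

def to_full_image(tags):
    """Convert per-tile click coordinates back to full-image coordinates."""
    pts = [(col * TILE_W + x, row * TILE_H + y) for col, row, x, y in tags]
    return np.array(pts)

def tag_density(points, cell=32):
    """Bin clicks into a coarse grid to visualise where volunteers clicked most."""
    density, _, _ = np.histogram2d(
        points[:, 0], points[:, 1],
        bins=[FULL_W // cell, FULL_H // cell],
        range=[[0, FULL_W], [0, FULL_H]],
    )
    return density.T  # rows = y, columns = x, ready to display as a heatmap
```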
By improving our model, we can provide more confident predictions of forest cover, which will be useful to restoration practitioners, stakeholders and other ecology researchers.
You might be wondering why we're asking you to label trees and how your labels will help our research. On this page, we'll take a look at how we're using the labels you've provided.
Over the last year, Restor has been developing a machine learning model to detect trees. That's a difficult task because - as you've no doubt seen from the data - the images are complex, it's not always obvious what is or isn't a tree, and there is huge diversity in the natural world. With support from Google.org, we were able to pay a team of annotators to label a large number of images. These annotations are polygons - individual trees are outlined, and we also requested annotations around groups of trees (as it's extremely time-consuming and subjective to outline individual trees in dense forest). The model is given an image and a "mask" that describes where trees are, and its job is to predict that mask.
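As a rough illustration of this image-plus-mask setup (not the actual Restor model or training code), the sketch below uses a deliberately tiny PyTorch network that takes an image and is trained to predict a binary canopy mask of the same size:

```python
import torch
import torch.nn as nn

# Tiny stand-in for the real segmentation network: maps an RGB image to a
# single-channel logit map the same size as the input.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=3, padding=1),
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch: image is (batch, 3, H, W); mask is (batch, 1, H, W), 1 = tree canopy
image = torch.rand(4, 3, 128, 128)
mask = (torch.rand(4, 1, 128, 128) > 0.5).float()

pred = model(image)          # predicted canopy logits
loss = loss_fn(pred, mask)   # compare the prediction against the annotated mask
loss.backward()
optimizer.step()
```

In practice the network would be a much larger segmentation architecture, but the training signal is the same: the annotated mask tells the model where the trees are.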
Labelling images in such a detailed way is time-consuming and expensive. Normally it's a good idea to get multiple annotators per image and then perform some kind of aggregation to clean the labels. Some of our images are double-annotated in this way, and all were reviewed after labelling. We think the masks/outlines in our dataset are pretty good. The mistakes we have identified tend to be cases where annotators labelled a single tree as multiple trees (or vice versa), or where grass and shrubs were labelled as tree canopy. This is where citizen science can help!
Rather than asking you to draw around all the trees again, we're simply asking you to mark their centres. We'll compare these points to our existing annotations to see if they agree. For example, if we have a polygon that denotes a single tree, we would expect the community to also place a "single tree" point in the same region of the image. However, if we see that most people placed two points inside it, or a "group" point, then that label should be checked.
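A simple way to run this check, sketched below with the shapely library, is to count how many crowd-sourced centre points fall inside each "single tree" polygon and flag polygons where the count differs from one. The function and variable names are hypothetical, and the real comparison may use a different matching rule.

```python
from shapely.geometry import Point, Polygon

def flag_polygons(polygons, volunteer_points, expected=1):
    """Flag annotated polygons whose crowd-sourced point count disagrees.

    polygons: list of (polygon_id, Polygon) for 'single tree' annotations
    volunteer_points: list of (x, y) tree-centre points from volunteers
    expected: number of points expected inside a single-tree polygon
    """
    flagged = []
    for poly_id, poly in polygons:
        inside = sum(poly.contains(Point(x, y)) for x, y in volunteer_points)
        if inside != expected:
            flagged.append((poly_id, inside))
    return flagged

# Example: a square polygon that volunteers marked twice -> flagged for review
polys = [("tree_001", Polygon([(0, 0), (10, 0), (10, 10), (0, 10)]))]
points = [(3, 3), (7, 7)]
print(flag_polygons(polys, points))  # [('tree_001', 2)]
```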
Similarly, once we have this additional database of crowd-sourced tree locations, we can verify that our model predicts trees where you've marked them! And for each tree that the community labels, we will have some measure of confidence: for example, if 19 out of 20 people said a label is a tree, then we can be very sure about it.
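As a toy illustration of that confidence measure (the grouping radius and greedy clustering below are assumptions, not the project's actual aggregation pipeline), nearby clicks can be grouped and each candidate tree scored by the fraction of volunteers who marked it:

```python
import numpy as np

def consensus_confidence(clicks, n_volunteers, radius=10.0):
    """Greedily group nearby clicks and score each group by volunteer agreement.

    clicks: list of (volunteer_id, x, y) in full-image pixels.
    radius: clicks within this distance are treated as marks on the same tree
            (both the radius and the greedy grouping are illustrative choices).
    """
    remaining = list(clicks)
    groups = []
    while remaining:
        vid, x, y = remaining.pop(0)
        volunteers = {vid}
        keep = []
        for v, px, py in remaining:
            if np.hypot(px - x, py - y) <= radius:
                volunteers.add(v)
            else:
                keep.append((v, px, py))
        remaining = keep
        groups.append(((x, y), len(volunteers) / n_volunteers))
    return groups  # [(approx. tree centre, fraction of volunteers who agreed)]

# 19 of 20 volunteers clicked near (100, 100): confidence 0.95 for that tree
clicks = [(i, 100 + i % 3, 100 - i % 2) for i in range(19)] + [(19, 400, 400)]
print(consensus_confidence(clicks, n_volunteers=20))
```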
One big advantage of this approach is that we don't necessarily need marks for every tree in every image, so if you're not sure about a plant, you don't need to label it. It's more important that we have a list of very confident labels than a lot of unsure ones.
Over the next few weeks we'll do some analysis on the first batch of annotations and post the results here for you to see.
We will publish results here when complete!