We would like to thank all citizen scientists who contributed to this project on the Zooniverse platform. The Manatee Chat project continues on our platform (cetalingua.com). Please check it out if you are interested.
Today we hit a very important milestone; thank you so much to everyone who participated. Please see below for some preliminary results from the two workflows that have been completed.
Reference: Lace, N. (2019). Manatee Chat: a combination of citizen science and deep learning for identification of manatee calls and mastication sounds. Poster presented at the World Marine Mammal Conference, Barcelona, Spain, December 2019.
Abstract
Identification of biological sounds in large acoustic data sets can be difficult and time-consuming. Deep Convolutional Neural Networks have been used for sound identification and classification, but they require a substantial amount of labeled data. Manatee Chat, a citizen-science project currently housed on the Zooniverse platform, allows participants to easily label sound files. Over 2,000 citizen scientists inspected 9,259 audio files by listening to the sounds and visually examining the spectrograms. Each 10-second file was rated 15 times by different citizen scientists. Three identification categories were used: manatee calls, mastication sounds, and nothing, resulting in 138,885 classifications. The obtained Fleiss’ Kappa (multi-rater reliability) was 0.48, indicating a moderate strength of agreement. Next, 2,523 sound files with at least 80% agreement among raters were selected to train and test the Deep Convolutional Neural Network model. The training set included 1,697 labeled spectrograms, and the validation set included 726 labeled spectrograms. The trained model achieved 97% accuracy on the validation set. The model was then tested on 100 new spectrograms that were not part of the training or validation set and achieved 85% accuracy. Data augmentation could further improve model accuracy and its ability to generalize. Combining initial data labeling by citizen scientists with subsequent training of Deep Convolutional Neural Networks could provide effective and accurate tracking and identification of manatee calls and mastication events in large acoustic data sets.
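For readers curious how the multi-rater agreement and the 80% agreement filter can be computed in practice, here is a minimal sketch. It is not the project's actual code: the CSV name, column layout, and category encoding are assumptions, and it uses the statsmodels implementation of Fleiss' Kappa.

```python
# Minimal sketch: Fleiss' Kappa and 80%-agreement filtering from citizen-science
# labels. Assumes a hypothetical CSV with one row per audio file and one column
# per rater, each cell holding the chosen category name.
import pandas as pd
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

CATEGORIES = ["nothing", "manatee call", "mastication"]
label_to_int = {c: i for i, c in enumerate(CATEGORIES)}

# Hypothetical file: 9,259 rows x 15 rater columns of category names.
ratings = pd.read_csv("manatee_chat_ratings.csv")
codes = ratings.apply(lambda col: col.map(label_to_int)).to_numpy(dtype=int)

# Convert raw labels (files x raters) into per-category counts per file.
counts, _ = aggregate_raters(codes, n_cat=len(CATEGORIES))

# Multi-rater reliability across all files (the reported value was 0.48).
kappa = fleiss_kappa(counts, method="fleiss")
print(f"Fleiss' Kappa: {kappa:.2f}")

# Keep only files where at least 80% of the 15 raters agreed on one category,
# and use the majority label as the training label for the CNN.
agreement = counts.max(axis=1) / counts.sum(axis=1)
majority_label = counts.argmax(axis=1)
high_agreement = agreement >= 0.80
print(f"{high_agreement.sum()} files passed the 80% agreement threshold")
```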
Results
Multi-rater reliability (Fleiss’ Kappa) was 0.48, indicating a moderate strength of agreement.
The trained CNN model achieved 97% accuracy on the validation set, classifying spectrograms as containing “nothing,” “manatee call,” or “mastication.” The confusion matrix (Figure 1 below) shows several types of misclassifications.
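To make the setup concrete, below is a rough sketch of a small three-class spectrogram classifier and of how a confusion matrix like Figure 1 can be produced. The architecture, input size, and layer choices are illustrative assumptions, not the project's actual model.

```python
# Illustrative sketch of a small CNN for classifying 10-second spectrograms
# into "nothing", "manatee call", or "mastication". Input shape and layer
# sizes are assumptions for demonstration only.
import tensorflow as tf
from sklearn.metrics import confusion_matrix

NUM_CLASSES = 3  # "nothing", "manatee call", "mastication"

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 1)),        # assumed spectrogram size
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# x_train / x_val: spectrogram arrays, y_train / y_val: integer labels 0-2
# (e.g. the 1,697 training and 726 validation spectrograms from the abstract).
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=20)

# Confusion matrix on the validation set, as in Figure 1:
# y_pred = model.predict(x_val).argmax(axis=1)
# print(confusion_matrix(y_val, y_pred))
```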
Figure 1. Confusion matrix showing which classes were confused with each other most often.
Figure 2. Examples of the worst misclassified spectrograms. Each is labeled with its predicted class, actual class, loss, and probability.
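Figure 2 ranks misclassified spectrograms by loss and predicted probability. Below is a small, self-contained sketch of how such a ranking could be produced from a model's softmax outputs; the helper name and the toy numbers are purely illustrative.

```python
# Rank misclassified examples by per-example cross-entropy loss,
# reporting predicted class, actual class, loss, and predicted probability.
import numpy as np

def worst_misclassifications(probs, y_true, top_k=10):
    """probs: (n, n_classes) softmax outputs; y_true: (n,) integer labels."""
    pred = probs.argmax(axis=1)
    loss = -np.log(probs[np.arange(len(y_true)), y_true] + 1e-12)
    order = np.argsort(-loss)                      # highest loss first
    rows = [(int(pred[i]), int(y_true[i]), float(loss[i]), float(probs[i, pred[i]]))
            for i in order if pred[i] != y_true[i]]
    return rows[:top_k]

# Tiny made-up example: three classes, four files.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.20, 0.70],
                  [0.30, 0.60, 0.10],
                  [0.05, 0.05, 0.90]])
y_true = np.array([0, 1, 1, 2])
for pred_c, actual_c, loss, prob in worst_misclassifications(probs, y_true):
    print(f"pred={pred_c} actual={actual_c} loss={loss:.2f} prob={prob:.2f}")
```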