I’m trying to build a project that has users listen to short audio clips and label any species they hear in the audio on an accompanying spectrogram (using the rectangle tool).
So far I’ve had pretty good success setting this up, but I do have a few issues:
Is it possible to display an audio file and spectrogram as separate frames on the same task without having them be automatically joined together?
Also, I’ve noticed that you can draw boxes in the space of the audio player with my current arrangement. Does that mean any of the y-coordinates I get are likely to be offset from the spectrogram?
Any help or suggestions would be much appreciated.
I have not worked with drawing tools on spectrograms, but I expect, based on how drawing tools work on images in general, that the data you are seeing will give you the start and stop times the rectangle spans, after a bit of scaling.
In general, on images, the drawing tools' point locations (such as corners of rectangles, centroids of ellipses, or simple points) are given in pixels, with the origin at the top left corner, x increasing to the right, and y increasing down.
A spectrogram is basically a fixed .png image with an animated cursor overlaid on it that "moves" in sync with an audio file. So, for instance, if you use the "show image" or "save image" commands on a spectrogram, you capture the base .png file. You can look at this image to determine its pixel size. I would expect the drawing tool coordinates to refer to pixel counts in this base .png image, in the normal pixel coordinate system. For a rectangle I would have expected a y value and height as well, but these may be suppressed for spectrograms, where their interpretation is not very clear compared to the x axis, which is simply related to time in the audio file.
IF this is the case, I believe you can get the start and stop edges of the rectangle from the data. In your example, "x":438.1483154296875 may be the pixel location of the left-hand edge of the rectangle, and "width":45.33642578125 may give the right-hand edge at 438.1 + 45.3 = 483.4 pixels. To convert to time, I believe you would divide these numbers by the pixel width of the .png image (for several of your subjects I got the width as 1169 pixels) and multiply by the total nominal duration of the audio clip.
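For example, here is a minimal sketch of that conversion in R, assuming the 1169-pixel image width above and a nominal 10-second clip; both are placeholders to replace with your subjects' real values:

```r
# Convert a drawing-tool pixel coordinate to clip time.
# image_width_px and clip_seconds are assumed example values; use the
# real base .png width and audio duration for your subjects.
px_to_sec <- function(px, image_width_px = 1169, clip_seconds = 10) {
  px / image_width_px * clip_seconds
}

x     <- 438.1483154296875   # reported left edge, in pixels
width <- 45.33642578125      # reported width, in pixels

px_to_sec(x)          # ~3.75 s: start of the marked interval
px_to_sec(x + width)  # ~4.14 s: end of the marked interval
```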
This will be fairly easy to test: draw a rectangle from the full left margin of the spectrogram to the full right margin and verify that the x value is 0 and the width is the .png image's pixel width. Then try a few rectangles with easily compared extents (left edge to center, one quarter to three quarters, etc.) and verify the x and width values correspond to what you would expect for pixel values given the overall size of the .png image.
I believe that feature is available only by applying to Zooniverse directly via the normal contact email and asking for it... So send them an email with project and workflow details and state your case for wanting/needing the feature.
If you do add a spectrogram, please scale it to the full audio player frame so the red line is actually synchronized with the sound. The Birdsong project places the kHz axis labels outside the spectrogram, which leaves the spectrogram itself too short for the red line to stay synchronized with the sound.
The point (x, y) values reported in the data export are pixel units in the subject image, not in whatever is displayed to the volunteer. So if your uploaded images are consistently the same size in pixels AND the placement of the spectrogram on the image is consistent, then you do not need to change your ymin and ymax by subject or volunteer.
If the placement of the spectrogram's frequency axis varies relative to the image borders, you have an issue that cannot be resolved except by knowing the frequency origin's position and scale in each subject. I would be very surprised if this is the case; I would expect that the way you converted the spectrograms to subject images was consistent, and that ymin, fmin and ymax, fmax are the same for all your subjects.
On a quick look, it appears your images are 2107 × 1719 pixels with ymin = 1527 and ymax = 127 (within ±1 pixel). If ALL your images have been produced in a consistent way and are the same size and placement, then
freq <- 100 - (y - 127) * 100/1400, or more simply freq <- 100 - (y - 127)/14. (To verify: y = 127 gives frequency = 100, and y = 1527 gives frequency = 0.)
Again, "y" here is the y value of the points reported in the export for each classification.
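A short sketch of that mapping in R, assuming the axis limits above (ymax = 127 px at the top of the frequency scale, ymin = 1527 px at zero); substitute the limits measured from your own subject images:

```r
# Map an exported y pixel value to frequency on the spectrogram axis.
# The pixel and frequency limits below are the assumed values from the
# discussion above; measure them from your own subject images.
y_to_freq <- function(y, y_top = 127, y_bottom = 1527,
                      f_top = 100, f_bottom = 0) {
  f_top - (y - y_top) * (f_top - f_bottom) / (y_bottom - y_top)
}

y_to_freq(127)   # 100: top of the axis
y_to_freq(1527)  # 0: bottom of the axis
```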
Hello,
We are trying to set up one of our workflows on the mobile app, but it keeps crashing. Additionally, we have a spectrogram and a sound for each subject, but in the app the sound shows as empty space, and you have to scroll down to see the spectrogram. It looks like the sound and the spectrogram are not joined as they are on the desktop version.
While your information is mostly correct for images, does it work on spectrograms? And why is the data example @Cetalingua gives us missing the y and height values? Is that an error in providing the data, or a consequence of using the rectangle on a spectrogram?
Note that most drawing tools can be "dragged" off the image, and it is very easy to get x and y values outside the actual image size. Even points could be dragged off the subject image at one time (pre-2016), so analysis should be robust against that possibility. I had to clean out thousands of off-image points in the early FossilFinder data (people believed that dragging a point off the image deleted it, since it no longer shows). This is still possible with rectangles, so x and y are not always on the image.
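As a hedged example, a simple bounds check in R for cleaning such marks; the image dimensions here are placeholders, not values from any real project:

```r
# Flag rectangle marks that fall (partly) outside the subject image.
# img_w and img_h are hypothetical defaults; pass your subjects' real
# pixel width and height.
mark_in_bounds <- function(x, y, width, height,
                           img_w = 1169, img_h = 400) {
  x >= 0 & y >= 0 & (x + width) <= img_w & (y + height) <= img_h
}

# e.g. keep only fully on-image rectangles before aggregating:
# marks <- marks[mark_in_bounds(marks$x, marks$y,
#                               marks$width, marks$height), ]
```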
Note that rectangles cannot be drawn on video or text subjects at all, so I am certainly not sure what happens on spectrograms; hence the suggestions for testing above...
Sorry, I was commenting about the Rectangle drawing tool in the context of images, whereas @Cetalingua is using the Column Rectangle tool in the context of a special "multi-image" configuration with 1x image file + 1x audio file. Thanks for highlighting that, @Pmason.
There's a bit to cover here, but the short version is that the drawing tools aren't fully supported (or rather, haven't been fully tested) for this configuration, so there are some quirks.
The long version:
On further testing, it seems that part of the problem is how different web browsers handle the audio player. I had been developing and testing in Firefox (my usual browser), where it was not possible to draw boxes on the spectrogram and have the audio player usable at the same time. That's why I'd been breaking it into two sequential tasks: 1) listen to the audio with the spectrogram, and 2) label the spectrogram using the box tools. However, while testing in Chrome I noticed that it was possible to use the drawing tools and play audio at the same time.
@am.zooni Thanks for your reply. I was working on subjects with Earthquake Detective in 2019 and 2020, and I think the still image that you refer to is just the spectrogram, with the X axis being the time axis, but I could identify earthquake sounds without it, and I told the researchers I was blind, yet they never discouraged me from participating. I think I started some conversations that were useful over there, in fact. It isn't as vital, IMHO, to be able to see the spectrogram there as it is in, for example, Deep Sea Explorer, because they weren't as interested in very faint signals with ED.

I checked out Deep Sea Explorer yesterday and, though there is audio to play there, there are no sound examples I could find anywhere, and the emphasis is strongly on being able to see the spectrograms, even though their survey has a question about whether you do or do not have a visual impairment. I realize that some projects where the signal-to-noise ratio is extremely low, such as (perhaps and apparently) Deep Sea Explorer and (most definitely) Gravity Spy, may not be usefully worked by ear. As a ham operator I know that these days hams work other hams they can't even hear with their ears: they're using highly specialized digital modes that allow the computer to detect and decode a signal barely above the noise level.

I'm sure I mentioned this a couple of years ago, but it would be very helpful to have a browsable category of projects for blind users, where any suitable projects would be listed. As it is, I have to read through the entire list, or at least the lists I think might contain audible data, and consider whether a project might potentially be accessible to me before I spend time checking it out. For instance, FROG FIND only says this about their project in the browsable project lists:
"Help us find threatened frogs in NSW National Parks. Understanding where they live helps us improve their chance of survival in the future."
Words like sound, hear, hearing, listen, calls, and audio are not included in that description, so I almost dismissed it. Then I thought: wait a minute, frogs are generally really small. A project looking for frogs on camera wouldn't be very practical, would it? I bet there's sound involved. So I opened the link and there it was, a very accessible project I could explore.

Other VI folks who come along might check out this thread: https://www.zooniverse.org/talk/18/1002723 if they think to search Talk for "Audio Projects" or other such keywords. And now that I think of it, I really should go and update that thread after I've checked that the Whale Chat and Dolphin Chat projects can be participated in just by using sound. The thread seems to have died. I also never got around to really trying to classify subjects in Manatee Chat. Now I see the researcher is the same in all three projects.

At about the time I was thinking about it, there was talk of a visual drawing tool they wanted to include that would let you pinpoint the sound you heard on the time coordinate. There are ways to do that by ear: slowing down the sound, slow-rewinding and slow-forwarding, and using a keyboard shortcut to mark the beginning and end. Reaper calls this scrubbing, if I remember right. This is something those of us who work with sound as musicians, voiceover artists, podcasters, etc. do regularly with accessible DAWs like Reaper, GoldWave, and Sound Forge, but I don't think the tools to do that could easily be incorporated into a Zooniverse project; that's the impression I got. I haven't been back to Manatee Chat to see if they indeed implemented this drawing tool, or if they somehow made it markable by ear as I had suggested back in 2020. @cetalingua @vivitang @reinforce
Hello everyone,
I am trying to figure out what values the marker tool returns. When I select a signal on my spectrogram with the rectangle marker tool, I get this: value":[{"0":0,"x":438.1483154296875,"tool":0,"frame":0,"width":45.33642578125,"details":[{"value":[0]}]
What are these values: pixels, locations, or something else? Is it possible for a marker tool to capture time, i.e. to select the beginning and the end of something (or its entire duration) in msec within a 10-second spectrogram file? Alternatively, are there any tools that could capture time?
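For reference, a sketch of reading such a value in R with jsonlite, assuming the truncated fragment above is completed with its missing closing brackets (verify against a full export row):

```r
library(jsonlite)

# The fragment above with its missing closing brackets restored; this
# completion is an assumption, so check it against a full export cell.
ann <- '[{"0":0,"x":438.1483154296875,"tool":0,"frame":0,
          "width":45.33642578125,"details":[{"value":[0]}]}]'

marks <- fromJSON(ann)
marks$x      # 438.1483: left edge of the mark, in pixels
marks$width  # 45.33643: width of the mark, in pixels
```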