Just to complicate matters and add to the possibilities...
It is also possible to use simple workflows to determine which subjects are sent on to subsequent workflows. For instance, you may have subjects, some of which contain features that need to be analysed further, while others do not. One option is a multiple-task workflow that first asks if the features are present and then, if they are, goes on to analyse them with further questions or other task types. The other option is a simple yes/no single-question workflow: if a subject gets enough yes's, it is added to a second subject set which a second workflow analyses; if it does not get enough yes's, the subject can be retired. This second option is a cascade filter. Because the filter workflow is yes/no, it can be adapted for mobile swipe devices and is very fast. These fast swipe workflows are very popular, being something one can do on a commute, during a work break, or any time one wants simple relaxation that does not require a great deal of thought or long-term concentration.
Further, using something called Caesar, the criteria for advancement or retirement in the filter can be quite flexible. For instance, a subject might advance after three yes's or retire after the first no, whichever happens first. This can result in a considerable reduction in the number of classifications needed to complete a subject set versus having every subject classified at least three times. If the large majority of the subjects are no's, the average number of classifications to retirement may be well under two!
This has been rolled out for select projects already (for example, Snapshots at Sea, which is currently out of data).
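For anyone wondering how such a rule plays out, here is a minimal sketch of the logic only, written in Python. It is my own illustration, not how Caesar itself is configured, and the function name and vote labels are made up for the example:

```python
# Sketch: decide the fate of a subject under an "advance after three yes's,
# retire after the first no" cascade-filter rule, given its votes in order.
# This only illustrates the rule's logic; a real project would configure the
# equivalent rule in Caesar or apply it to a classification export.

def filter_outcome(votes, yes_needed=3):
    """votes is a list like ['yes', 'no', 'yes', ...] in the order received."""
    yes_count = 0
    for i, vote in enumerate(votes, start=1):
        if vote == 'no':
            return 'retire', i          # first no retires the subject
        yes_count += 1
        if yes_count >= yes_needed:
            return 'advance', i         # enough yes's - on to the second workflow
    return 'undecided', len(votes)      # still needs more classifications

print(filter_outcome(['yes', 'yes', 'yes']))   # ('advance', 3)
print(filter_outcome(['no']))                  # ('retire', 1)
```

You can see from the two example calls why the average number of classifications per subject drops so sharply when most subjects are no's.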
Also consider that drawing tasks can be combined with subtasks: you can ask for a drawing to be made (such as marking some text to transcribe), then in a subtask ask for the transcription. This may reduce the number of steps in the workflow.
Some of the features you will see if you investigate many projects are experimental, as noted above by @srallen; so are Caesar and cascade filtering. Others are customizations of the front end that you cannot hope to do yourself, and it can be tricky to figure out which is which. Things like dropdowns are far enough along (if not already standard features) that you can ask for them with a very high likelihood that they will be available to you.
What I suggest is to cobble together a single multiple-step workflow that does what you need, without too much concern for efficiency, speed, or ease of use, and then ask various people to look at it and suggest alternative ways to get the same information more efficiently. That way your testers can see what you need out of the workflow and the type of subjects you have to work with. Moderators on similar projects, and the various people who reply to this sort of question, are good people to ask. You will likely have to add these people as testers so they can access the project at this stage, and provide them with the URL. Note this is not beta testing, but a very early workflow design stage.
It is also not too soon to be thinking about what the export data looks like for these options, and what that means for your analysis. For example, there is an experimental freehand drawing tool that some are playing with. Neat, but think of what one needs to do to recover and work with a complex contour, which must be a raster of many, many points for each use of the tool. Even simple tools can give big problems: free transcription blocks are a standard task type in the project builder, but how do you aggregate and compare the various volunteers' inputs? This is simple enough if you are asking for the transcription of a short text from a single defined label or such, but not so easy if you are asking for descriptions of the cats in the Kitteh's subject set: you will get more or less valid descriptions from each classification, but worded so differently that you would need advanced keyword searching to come up with the "best" or consensus description. What the data looks like, and how you will work with it, is at least as important for workflow design as efficiency.
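To make the aggregation point concrete, here is the simplest possible consensus check for short free-text answers: group normalized responses per subject and compute a vote fraction. This is my own minimal sketch, not a Zooniverse tool, and anything fuzzier than exact matches, such as free descriptions in different wording, needs real text-matching or keyword techniques:

```python
from collections import Counter, defaultdict

# responses: (subject_id, text) pairs pulled from a classification export
def transcription_consensus(responses, threshold=0.6):
    by_subject = defaultdict(list)
    for subject_id, text in responses:
        # crude normalization - case and surrounding whitespace only
        by_subject[subject_id].append(text.strip().lower())

    results = {}
    for subject_id, texts in by_subject.items():
        answer, count = Counter(texts).most_common(1)[0]
        vote_fraction = count / len(texts)
        results[subject_id] = (answer, vote_fraction, vote_fraction >= threshold)
    return results

sample = [(101, 'Smith, J.'), (101, 'smith, j.'), (101, 'Smyth, J.')]
print(transcription_consensus(sample))
# {101: ('smith, j.', 0.666..., True)} - two of the three volunteers agree
```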
I for one would be happy to look at anything you come up with.
4 Participants
7 Comments
Yes, the basic frame by itself only slices out records per the logical conditions you set. Using it, one can select the records for one specific workflow or subject set, etc., but they are not flattened or modified without adding additional blocks of code.
The next script you want to use, then, is flatten_class_survey_demo.py. This will handle one survey task, with the questions asked in the survey for each choice or species selected. If you have other questions that precede or follow the survey task, we can address them later.
This one will reduce the annotations JSON for a survey task to a more friendly format, then aggregate over subject-choice for all the survey choices and questions to give you a vote_fraction for each possible answer, and finally apply a filter, which you can modify, to resolve discrepancies between the individual classifications for the choices or the other questions you asked for each choice. The filter determines how you decide which species was selected if, say, five people selected one animal and one selected something else. It allows you to set the criteria for a consensus, and it saves those cases that remain in question after the filter has been applied.
Start with the readme in the same location as the script above, then copy the script to the directory you are working from.
You will also have to copy the question.csv file you used to create your project to this directory since it tells the script what questions were asked and the possible responses to look for.
As you did for the frame, you will need to go through the comments in the script and modify each thing it asks for, starting with the file paths and locations for your data, the question file, and the names and locations where you want the four resulting output files to end up.
You also have to modify the task number your project uses for the survey task; this is in line 150. The script defaults to T0, but that is not necessarily what your project used.
You can modify and/or copy and paste the logical conditions from the frame into this script to limit the classifications to be analysed to a specific workflow, or to a range of subjects or dates. Any of the conditions from the basic frame can be added to the function include(class_record), in blocks running from the if statement to the end of the return False, one block for each condition you want to add.
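For orientation only, an include() function with a couple of such condition blocks might look roughly like this. The column names come from the standard classification export, but the workflow id, subject range, and date are placeholders, and the demo script's own version will differ in detail:

```python
# Sketch of an include(class_record) filter, assuming class_record is one row
# of the classification export read with csv.DictReader. Each block tests one
# condition and returns False to skip records that do not match.

def include(class_record):
    if int(class_record['workflow_id']) != 1234:        # placeholder workflow id
        return False
    # if not 5000000 <= int(class_record['subject_ids']) < 5100000:
    #     return False                                  # limit to a subject range
    # if class_record['created_at'] < '2018-01-01':     # limit to recent dates
    #     return False
    return True
```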
If you look carefully, you will see that the main section of the first step is just the frame you already worked with, plus a block of code that handles the survey task added to the body of it.
The hardest part will be choosing the fieldnames where the choices (species) and the answers to your questions end up, plus two fields used in the next section, 'subject_choices' and 'all_choices'. These fieldnames will depend on the precise structure of your project and the questions you asked. Basically you will want "choice" or "species", plus one field for each question you asked in the survey, plus the two special fields. While the names you use do not matter, they must match those used in the writer in line 175. The demo you are modifying had five questions beyond the "choice" or species selection; yours may have more or fewer. In any case, the answers are numbered answer[0] through answer[n] in the order the additional questions are asked.
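The general shape of that writer is roughly the following. This is a sketch only, the fieldnames are placeholders for a hypothetical survey with two extra questions, and the demo script's actual writer will not match line for line:

```python
import csv

# Placeholder fieldnames for a survey with two questions ('how_many', 'behaviour')
# plus the two special fields used by the aggregation step.
fieldnames = ['subject_ids', 'user_name', 'classification_id',
              'choice', 'how_many', 'behaviour',
              'subject_choices', 'all_choices']

with open('flattened_survey.csv', 'w', newline='') as out_file:
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    # ...then, inside the loop over classifications, one row per choice, e.g.:
    # writer.writerow({'subject_ids': subject, 'user_name': user,
    #                  'classification_id': class_id, 'choice': choice,
    #                  'how_many': answer[0], 'behaviour': answer[1],
    #                  'subject_choices': subject_choices,
    #                  'all_choices': all_choices})
```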
If you like, you can get the first section, down to line 195, working first, then bring in the other sections...
To "comment out" a line of code one adds a '#' in front of it. This line of code is then ignored by the interpreter. Pycharm can comment out whole sections by highlighting them and clicking on "Comment with Line Comment" found under "Code" in the main menu bar. Highlighting the section and clicking this again removes the comment marks and the code can be processed again. This is useful if you want to work on getting things working in a systematic way. At this point I would definitely comment out lines 206 on until we have the first section working.
It would help if I knew a bit more about your project; can you send me a link? I can tell a lot from just the regular classification page.
3 Participants
29 Comments
LOL! I thought you didn't like PMs, @zutopian? When I get a PM that asks a question about how something works in Classify or Talk, I usually reply by telling the person that other people probably have the same question, so it should be asked and answered in public on Talk. (I usually do give the answer via PM, but not again if the person continues using PMs for that kind of question.)
So I will reply to your question here in public, in case someone else who is reading would also like to know:
At the bottom of every Talk post, there are several commands: Helpful, Reply, Link, Report; also, if you are the author of the post, Edit and Delete. Pressing Report will report that post to the project's moderator(s). You can also post in the same thread and tag the moderators (prefixed with @), or tag an individual moderator or member of the project team. Or you can send a PM to a moderator or team member.
I don't know if there is a button to report someone who sends you a PM: looking around messages in my inbox, I don't see one. Does someone else know if there's a built-in way to do that, or a Zooniverse "abuse" address?
When I get a PM asking a question about a project I'm not familiar with, I tell the person to ask on Talk in that project. If they send further messages, I tell them to stop. I have never gotten an abusive message, and only a few from people I hadn't previously conversed with on Talk.
I am curious about the message you got today. Was it abusive? If it was just a first message from a new user, and you don't want PMs from anyone (or from people you don't already know), why not just reply and say "I don't use PMs. You can post your message or questions on Talk and if I am interested, I may reply there." Why block them? If they are a new user, they probably don't know enough to understand the communication channels here, and if you ask them not to send PMs, they will stop. (If they don't stop after you ask, THEN block and/or report them.)
42 Participants
313 Comments
Things sometimes go wrong when you ask for an export, and of course you are then locked out for the day.
Note, if you do ask for a second full data export within the twenty-four hours, you get a message saying the export is being created, but 1) nothing happens, and 2) that message stays there even after the blackout period has passed. In that case, wait twenty-four hours after the last time you requested the export and simply request it again, despite the message still showing that the export is being generated.
If you asked for a full data export and for some reason that fails, you do have the option of asking for a data export by workflow, so you more or less have two chances per day. Try asking for the other option and see if that will work.
Failing that, email Zooniverse and they can free things up.
3 Participants
5 Comments
This is a very small point I noticed in the workflow classification export request function.
When one asks for a workflow classification export and gets the Select a workflow block, the previous export link shows in blue as "a few seconds ago", and this does not increment as time passes since the export was last requested. The link works, linking to the previous export, and a new request is accepted if enough time has passed (I have not tested how long, but it has worked every time I have asked at least a day apart). Indeed, everything works as I expect it to, with the exception of the displayed time since that workflow export was requested.
The specific project is FossilFinder, but I doubt it is project-specific (though I did not test that).
Compared to asking for the whole classification export, which has something like 700,000 classifications, asking for just the last 30,000 classifications of the latest workflow is fantastic.
3 Participants
3 Comments
Many of the most interesting discoveries within Zooniverse projects have happened through our ‘Talk’ forums, as a result of discussion between volunteers, researchers, and development team members. We encourage everyone to join these conversations. We strive to make Talk as welcoming and supportive a place as possible. It is essential that people feel safe and comfortable being curious publicly, sharing their questions and ideas and engaging in real dialogue.
With this in mind, please:
We note that these standards apply to private messages between participants as well.
If you feel a comment doesn’t follow our community standards, please do let us know by clicking ‘Report’, so that we can reach out to that individual and help guide them back to participating in a respectful and supportive way. If you have an issue with another participant, please contact a member of the Zooniverse team, the project’s research team (for project-specific Talk boards), or one of the moderators. We are here to help make this a supportive and welcoming place for all our participants!
Thank you so much for being part of this community and contributing to research through Zooniverse!
1 Participant
1 Comment
The other day @trouille posted Zooniverse Talk Community Standards, which are very welcome.
I'm wondering if something similar could be posted, re participation in research into citizen scientists?
No doubt like many of you reading this, I have, over the past few years, received several PMs (Messages, even emails) from people who say they are doing research into us citizen scientists, and asking me if I'd like to participate in one way or another (e.g. filling in a questionnaire, doing an interview by Skype).
In principle, I support such efforts/work, and am more than willing to help/join.
However, I've increasingly decided to do some screening first; for example, I ask for the name of the institution and the supervisor (in the case of work done for a degree, say), as well as a full name and email addy of the person doing the asking (if it's not already given). And case by case, I'll ask for a guarantee that I get a copy of any papers which result (as well as insisting that they be open access), and what IRBs have/will be involved.
Why do this?
First, if it's a random request, out of the blue, how do you know it's not some kind of scam or sophisticated phishing attempt? If it's legit research, things like name, email addy, supervisor, and institution (university or college, say) should be freely available.
Second, just as data I produce by classifying in Galaxy Zoo (say), when it's turned into papers, should be both acknowledged and the papers freely available, so too for this kind of data and papers.
Third, the Zooniverse and individual countries/institutions/etc have their own ethical standards concerning how research is conducted, when it involves humans/community members/citizens/etc. The same thing should apply for research into us citizen scientists; if there's not an IRB, there should be; if there is an IRB, it should be identified and its members known (publicly).
What do you think?
2 Participants
6 Comments
We beta test all of our projects. The difference for the projects built by external researchers is that we explicitly ask the review community if it is suitable to be launched as an official Zooniverse project. For Poppin' Galaxy the answer was an emphatic "yes". We have had one project so far that the community has rejected, and many that have been rejected by the Zooniverse team before making it to review by our volunteers.
Why is the "review community" just explicitly asked, if a project is suitable to be launched as an official Zooniverse project, when a project is built by external researchers?
I would like, that the "review community" is also asked, if a project is suitable to be launched as an official Zooniverse project, when a project isn't built by external researchers!
The project, which was rejected by the "review community", would have presumably become an official Zooniverse project, if the project hadn't been built by external researchers! BTW, I guess, that you refer to the project "Faces of World" (social robots)! I filled out the feedback form and wrote, that it should be rejected for several reasons! In my opinion, the FOW project obviously wasn't suitable for the Zooniverse! I am astonished, that it hadn't been rejected by the "Zooniverse Team" before asking the "beta/review community" for feedback!
6 Participants
31 Comments
OK, this is a simplistic overview and may make the Zooniverse developers cringe...
As a user or project owner, think of things this way: Zooniverse is a platform for doing citizen science in a specific way that may or may not suit your needs. For the small guys with little IT support it is very useful and quite versatile, but one has to accept it is what it is and is not likely to get changed in any way for a small project.
The software the project runs on has two main components: a "front end", which determines how the projects look and are presented (everything the volunteer sees and experiences), and a "back end" (though no one refers to it that way), which determines how the servers store and manage data; that is the thing called panoptes. Together, the front end presents information from the panoptes servers in a functional way, accepts input from the volunteers, and passes it back to panoptes, which then manages all the existing and incoming data.
Built around these two main components, though, are a bunch of tools:
First and foremost is the project builder. This is a tool to define all the pieces the front end will need to present a project, set up all the tasks in all the workflows, and generally set up panoptes data structures ready to accept the input of subjects and responses from the volunteers. In particular, it is used to define the tasks in a workflow: there is a set of standard tasks currently available to all, and a few experimental task types which one has to be aware of and ask for. So if you see a project that does something you do not see how to do in the regular project builder, you may need to ask Zooniverse to have it enabled for you. A current example is the combo task, where more than one question or transcription is shown at once. Experimental tasks may change or present future issues, and to get them you may need to show you have the IT support to work through those potential changes (if, for example, the data structure in the data export were to change due to changes in the experimental task).
Another tool is the panoptes_client, which is a Python package (Python is a fairly simple-to-use programming language) that allows one to use Python to query and modify the panoptes data via an API (API is just a fancy term for something that allows one to access and work with the data stored by some other piece of software). Using the client, one can do just about anything the project builder can do, plus retrieve just about any panoptes data and, in many cases, actually modify the data stored by panoptes. Practically, that means uploading subjects, modifying or correcting subject images or metadata, linking/unlinking them from subject sets, linking/unlinking subject sets and workflows, and copying subject sets, workflows, or even projects. It is also used to track projects and workflows and to produce customized statistics. I think this is what you are specifically asking about, so more on this later...
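To give a flavour of what that looks like in practice, here is a minimal sketch of creating a subject set and uploading one subject with the client. The credentials, project id, file name, and metadata are placeholders, and the panoptes_client documentation has the full details:

```python
from panoptes_client import Panoptes, Project, Subject, SubjectSet

# Placeholder credentials and project id - replace with your own.
Panoptes.connect(username='example_user', password='example_password')
project = Project.find(1234)

# Create a new subject set and link it to the project.
subject_set = SubjectSet()
subject_set.links.project = project
subject_set.display_name = 'Demo subject set'
subject_set.save()

# Build one subject with an image and some metadata, then add it to the set.
subject = Subject()
subject.links.project = project
subject.add_location('image_0001.jpg')               # local file to upload
subject.metadata['original_filename'] = 'image_0001.jpg'
subject.save()
subject_set.add(subject)
```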
The panoptes Command Line Interface is a scaled-down version of the panoptes_client that allows one to use many of its functions from a single command line without Python coding. It runs in Python, but you do not need to be able to code in Python to use it. It was specifically meant to ease the uploading of new subjects versus the rather weak method using the subject uploader section of the project builder, but it can do some other things as well, though nothing like what one can do with the panoptes_client!
Recently there have been some additions to the tool set. There is an aggregation tool that one can set up which will extract classification data from panoptes and enable display and aggregation of the task responses. I do not use it, but many smaller, IT-lite projects do. Again, it does what it does and is not likely to be customized in any way for your project, whereas Python code (or any other simple coding language) can be used to do much more with your data.
Another tool is Caesar. This is a tool that allows near-realtime analysis of the volunteers' responses in a workflow as they are received. It can apply simple rules to do a number of things with data already in panoptes, such as retire subjects or link them to different workflows. Currently, project owners need Zooniverse assistance to set up Caesar for customized retirement rules.
Finally there are scripts written for other projects which can be used for specific tasks such as Notes from Nature's reconcile.py (again in Python). Some of these are quite general and can be used for many projects with similar task structures, while many are highly customized for specific projects. Most projects make the scripts they run freely available, though few will assist with modifications or tech support beyond that.
Since I think you are asking about the panoptes_client, here is a bit more info specific to the client:
First the user documentation is here
To use it you need to install Python and be able to do rudimentary coding, or, if using someone's scripts, be able to run Python scripts. If you are on a Windows system there is a trick to getting a working panoptes_client installation; see here. I use PyCharm as an IDE, which makes getting the client running easier, and PyCharm Edu has some training modules to get you running Python scripts in an hour or two.
Also check out the top post in the Data processing forum.
Questions? Just fire away...
2 Participants
5 Comments
There is no technical problem with what you want to do. Unlaunched projects run exactly the same as launched projects; the only difference is they do not show on the launched project list, and you need the URL to get to them.
So you can set a retirement limit for your verification workflow at the number of experts who will verify the public result. This can be set as low as one. It can be higher; you may even open it up, by invitation, to volunteers who have shown skill. Your experts will know when they have done all the subjects to verify by the "already seen" banner once they have completed all subjects.
It is very easy to "advance" subjects from the public workflow to the verification workflow - there are two ways you can do this - 1) with caesar in which case you need a simple rule that determines when a subject is advanced such as it has received x votes for one species. You do need zooniverse developers assistance to set up caesar so contact them if you are interested. 2) you can use a script which analyses the classification export and determines which subjects to verify, and then proceeds to link them to a subject set linked to the verification workflow using the panoptes client.
I would suggest that you use a script: it is under your control, and you can do things like accept public results which have a very high degree of consensus and verify only those which are less certain. It is my experience, though, that if the public has basically no consensus, then I doubt your experts will be any more accurate (you get a definite answer, but is it correct?). Many times the subject is simply not clear enough to make an accurate determination. I suspect verification will only be useful for a narrow range of subjects with moderate uncertainty, where the public results show some but not a great deal of variation.
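If it helps, the linking step of such a script is short. Assuming you have already worked out from the export which subject ids need expert review, a sketch with the panoptes client looks something like the following; the credentials, subject set id, and subject ids are placeholders:

```python
from panoptes_client import Panoptes, Subject, SubjectSet

# Placeholder credentials, subject set id, and subject ids needing verification.
Panoptes.connect(username='example_user', password='example_password')
verification_set = SubjectSet.find(5678)   # the set linked to the verification workflow

subjects_to_verify = [91234501, 91234502, 91234503]   # chosen from the export analysis
verification_set.add([Subject.find(sid) for sid in subjects_to_verify])
```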
So technically there are no big issues, but you may want to think about the ethics a bit: you are asking volunteers to expend effort and then, in some sense, not using that effort as they expected you to. It gets a little murky, so I am not saying "don't go there"; just think about it and about what you might say on the project so the process is open and transparent.
You might also consider a filter-type set-up (see Snapshots at Sea) where you ask simple questions which lead to the subject being advanced to different workflows based on the answers. I could see one workflow asking volunteers to sort the fish by family, grouping species of confusingly similar appearance, then working with those grouped subjects in further workflows which are family-specific. There you ask about particular defining characteristics which lead to the final species selection within the family. A filter cascade avoids presenting many species to select from, with the volunteer having to know them all; grouping subjects by similar species reduces the choices to consider to a few, with more chance that your volunteers will become skilled for a particular family (workflow). Volunteers love fast, simple decisions, so the increased uptake makes up for the increased number of tasks across the multiple workflows.
I have Python scripts for advancing subjects, and I could walk you through the various verification workflows I have set up for various projects if you are interested.
4 Participants
6 Comments