ODK Collect widget/shortcut for setting metadata (e.g. username)

Florian_May · September 18, 2019, 3:31pm

1. What is the general goal of the feature?
Provide a quick way for enumerators to set their name as username, so that all forms collected by them on that device can record the username automatically as metadata.

The ODK Collect 1x1 widgets let you create and arrange shortcuts for individual forms on the home screen. This makes it possible to arrange the forms in a sort of flow chart.

It would be brilliant if we could have the same for settings, with a special interest in those that might need to change often.

2. What are some example use cases for this feature?
Fifty-odd volunteers are rostered into separate survey days (one or a few vollies each day) to use one Android tablet to survey a beach for signs of turtle nesting. Every survey starts roughly at 5am with an under-caffeinated volunteer having to 1. remember and 2. manage to dig their way into the furthest General Settings menu to enter their name.

The cost of forgetting to set the username is a heap of QA work to re-allocate the "reported by" for many records.

We do not have all names available in advance, new enumerators could join at any time, so pre-printing the QR setup code with each username in it is not feasible.

This shows two problems:

It is hard to remember to change a setting if the field to change is four menus deep. Out of sight, out of mind. Especially at 5am when counting turtle tracks on a humid beach.
It is tedious to navigate four menus deep. Our enumerators have reported this operation as a special pain point.

3. What can you contribute to making this feature a reality?
UAT of any proposed solutions, even mock-ups could be used to get feedback from our coordinators (trainers) and enumerators.
Post-season survey filled in anonymously by enumerators could provide feedback about usability.

LN · September 19, 2019, 2:28am

It sounds like the underlying need is to have the same enumerator identifier entered for several submissions in a row and then to change it when a new enumerator swaps in, right?

It further sounds like the identifier is important from an informational standpoint but that there's no risk of one of these volunteers deliberately wanting to impersonate another, is that right? I ask because one of the nice things about using the username setting is that it can be set and then locked down with an admin password. But this doesn't sound relevant to your scenario.

If I've gotten all that right, it sounds like just the kind of situation that the "last saved" feature was designed for. The idea is that you could start your form with a username question and set it to default to whatever the last saved form's value was. The first enumerator will see a blank value, then s/he will put in their name, and then next time they open a blank form, that name will be populated. Once they hand the device over to someone else, that person will open a blank form, see the previous enumerator's name, change it to their name, and then every subsequent form they open will start with their name.

Does that sound like what you'd like?

This feature was contributed by an ecosystem member who does not use XLSForm so unfortunately XLSForm support is specified but still pending. You can use it by modifying the XML of your form definition to add the following after </instance>:

<instance id="last-saved" src="jr://instance/last-saved" />
<setvalue event="odk-instance-first-load" ref="<your question path>" value="instance('last-saved')<your question path>" type="string"/>

For example, if the question's name in XLSForm is username, find that in the XML and copy the whole value of the ref to where I put <your question path>. It will look something like /my-form/username.
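To make the placeholders concrete, here is a minimal sketch of where those two lines might sit in a form definition. The form ID my-form and the question name username are just the illustrative names from the paragraph above, and the rest of the form is a hypothetical skeleton rather than anyone's real form. The setvalue line mirrors the snippet above with the path filled in.

```xml
<?xml version="1.0"?>
<h:html xmlns="http://www.w3.org/2002/xforms"
        xmlns:h="http://www.w3.org/1999/xhtml"
        xmlns:jr="http://openrosa.org/javarosa">
  <h:head>
    <h:title>My form</h:title>
    <model>
      <!-- Primary instance: the data structure of the form itself -->
      <instance>
        <my-form id="my-form">
          <username/>
          <meta>
            <instanceID/>
          </meta>
        </my-form>
      </instance>

      <!-- Secondary instance exposing the last saved instance of this same form -->
      <instance id="last-saved" src="jr://instance/last-saved"/>

      <!-- When a blank form is first opened, copy the previous username into it -->
      <setvalue event="odk-instance-first-load"
                ref="/my-form/username"
                value="instance('last-saved')/my-form/username"
                type="string"/>

      <bind nodeset="/my-form/username" type="string"/>
      <bind nodeset="/my-form/meta/instanceID" type="string"
            readonly="true()" jr:preload="uid"/>
    </model>
  </h:head>
  <h:body>
    <!-- The enumerator sees the previous name and can overwrite it -->
    <input ref="/my-form/username">
      <label>Your name</label>
    </input>
  </h:body>
</h:html>
```

The first time the form is filled in on a device there is no last saved instance yet, so the username question simply starts out blank.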

If that doesn't work for you, it would be helpful to understand what about your scenario makes that undesirable.

If you'd rather wait for the XLSForm support, another option would be to have a question that shows the username from settings and asks the current user to confirm whether that is their name. If they say no, it could give them instructions for setting their name in the settings.

Florian_May · September 19, 2019, 3:33am

Hi @LN,

thanks for that, this would work for us if we can use "last saved" across different forms - is that possible? Can we set this in ODK Build? (ping @issa)

Alternative ideas:

Option to add "Update metadata" as top level menu entry. Warning, once this exists, other settings pages might also want to opt into top level menu. Benefit: turns navigating several obscurely named menus (feedback from enumerators) into one tap.
Option to display at least username, and possibly other metadata on top level instead of catchphrase (which is a nice touch but has no business value). Benefit: makes username visible to enumerators (currently it's not). Best way to solicit good data (current enumerator's name) is to show wrong data (previous enumerator's name) and an "edit" button.

Our work flow:

Same tablet, new day, one new user leads the survey, optionally accompanied by some others.
User fills in form "survey start" (can set username there, and who else is on the team for the day).
User fills in a range of 3-4 forms, depending on what they encounter (a turtle track/nest, a sign of a predator or disturbance, a marine wildlife incident form for injured/stranded/dead turtles).
User closes off with a "survey end" form.

We can't rely on any one form being the first one recorded, sometimes enumerators forget to fill in a "survey start". I really would love to have a reliable "recorded by" on each form.

Cost of getting the username wrong:

We ask the wrong enumerator for clarification if the data seems incorrect.
Tracking observer bias and targeting refresher training gets harder - if we know X messed up a specific step Y, we can simply train X again how to do Y.
We get the "you have surveyed XX hours and walked YY kms" wrong on the end of season "thank you" note - if user X captures data with user Y recorded as enumerator, Y gets X's hours from that day.

Nothing dramatic from a legal/data privacy/medical point here. I could imagine people handling serious medical data like @dr_michaelmarks might have more restrictive scenarios here.

LN · September 20, 2019, 2:52am

Unfortunately not: "last saved" can't be used across different forms.

Also no: it can't be set in ODK Build.

Thanks for all the additional details about the workflow, that's very helpful.

I think this is big enough and has enough potential for interfering with other planned and requested features that it'll need to be discussed at the TSC level. For example, I think another option would be to add a login/logout concept to Collect. This has been discussed in the context of tracking who has made changes when a device can be used by multiple people. There we're talking about some kind of authentication being involved but there could be an option just to request a user identifier as login and that's it.

What I would do in the meantime is start every survey with a question that says something like "Is your name <output ref="/data/username" />? If not, please exit the form, tap on the three dots at the top right, tap User and device identity, tap Form metadata and enter your username.". Note that I used Build syntax for the username field because you mentioned Build. In XLSForm one would use ${username}. I know it's not ideal but it should dramatically lower the risk of data coming in with the wrong name.
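As a rough sketch of how such a confirmation step could look in the form XML: the fragment below assumes a form whose root node is data, as in the example path above, and that the jr prefix is declared on the root element as xmlns:jr="http://openrosa.org/javarosa". The names name_ok and fix_name are hypothetical, and the primary instance would need matching username, name_ok and fix_name nodes. The output element is written with a value attribute here, which is the form I understand the ODK XForms specification to document; treat that detail as an assumption and check it against what your form designer generates.

```xml
<!-- In the model: fill /data/username from the username set under Form metadata -->
<bind nodeset="/data/username" type="string"
      jr:preload="property" jr:preloadParams="username"/>
<bind nodeset="/data/name_ok" type="select1" required="true()"/>
<!-- The reminder note is only shown if the enumerator answers "no" -->
<bind nodeset="/data/fix_name" type="string" readonly="true()"
      relevant="/data/name_ok = 'no'"/>

<!-- In the body: show the stored username and ask for confirmation -->
<select1 ref="/data/name_ok">
  <label>Is your name <output value="/data/username"/>?</label>
  <item><label>Yes</label><value>yes</value></item>
  <item><label>No</label><value>no</value></item>
</select1>
<input ref="/data/fix_name">
  <label>Please exit the form, tap on the three dots at the top right, tap User and device identity, tap Form metadata and enter your username.</label>
</input>
```

Splitting the prompt into a yes/no question plus a conditional note is a design choice of this sketch; a single note with the same wording would also match the suggestion above.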

dr_michaelmarks · September 20, 2019, 8:11pm

Yes to be honest if you have a login (ideally with say a PIN set by the Admin) then that
a) solves this
b) solves many other issues

That said, I understand Collect has never really had the concept of a login before, so it's a big change

seewhy · September 23, 2019, 1:13pm

Although it's a workaround rather than a solution to the problem, you could integrate things and only use one form - I was collecting data using multiple forms and found it hard to keep track, so I moved to a single form with an early question of 'what is the reason for this record' (paraphrased!) - I then used a relevant column in my spreadsheet (probably available in Build?) to show the group according to the 'data type' answer.

In my case the Group question is set to 'field-list' so they all show on a single page and I set the relevant column to ${datatype}="A", then for the next group, which would have been a separate form, I use ${datatype}="B". So this works like form logic, hiding parts of the form that are not relevant to that particular record. Maybe it only works because I've set it to 'field-list', but I guess you could have a 'relevant' for each question within that group if it doesn't display as expected.
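For anyone working with the form XML directly (e.g. coming from Build), the same relevance logic could look roughly like the fragment below. This is only a sketch: the root node data, the group names group_a and group_b, and the questions inside them are hypothetical, and the primary instance would need matching nodes.

```xml
<!-- In the model: each group is relevant for exactly one "datatype" answer -->
<bind nodeset="/data/datatype" type="select1" required="true()"/>
<bind nodeset="/data/group_a" relevant="/data/datatype = 'A'"/>
<bind nodeset="/data/group_b" relevant="/data/datatype = 'B'"/>

<!-- In the body: the early question that decides which group is shown -->
<select1 ref="/data/datatype">
  <label>What is the reason for this record?</label>
  <item><label>Observation A</label><value>A</value></item>
  <item><label>Observation B</label><value>B</value></item>
</select1>

<!-- field-list puts all questions of a group on a single screen -->
<group ref="/data/group_a" appearance="field-list">
  <label>Observation A</label>
  <input ref="/data/group_a/a_notes"><label>Notes for A</label></input>
</group>
<group ref="/data/group_b" appearance="field-list">
  <label>Observation B</label>
  <input ref="/data/group_b/b_notes"><label>Notes for B</label></input>
</group>
```

Putting the relevant expression on the group rather than on each question means the whole block is skipped in one go when it does not apply to the record.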

I also use a repeat group so that enumerators only have to fill in certain common data once in a day - obviously this might not work for your needs, but it helps me - perhaps you could have the first question as the enumerator's name, followed by the helpers? It avoids changing metadata too.

Apologies if this is a lesson in egg-sucking, but I thought I would share something that I found simplified data collection and (especially) sending finalised forms - there's just one that needs to be sent each day, and the instance name makes it easy to identify. I also have just one dataset to manage rather than integrate multiple forms into my GIS...

Just a thought... It seems to fit your workflow, but might not fit with the way you are managing the data.

Florian_May · September 24, 2019, 1:04am

@seewhy thanks heaps for the write-up!

We've been experimenting with one giant form with repeats, and it's indeed what many enumerators have suggested. This would be an absolutely viable solution if there were a good way to dive in and out of subgroups. A simple mock-up:

Real-world complication: device could fail/run out of battery/fall and shatter at any point in time. We always carry a spare device to take over enumeration. What would happen with an unsaved master form with core data in the "start of survey" part?

Our workflow:

One start of survey form
Four detail forms (let's call them observation A, B, C, and D)
One end of survey form

During a survey, the enumerators could encounter e.g.:

start of survey
52 x A
1 x B
37 x A
1 x B
24 x A
1 x C
140 x A
end of survey

A naive implementation in ODK Collect would be

screen 1: start of survey
screen 2, repeats: A
screen 3, repeats: B
screen 4, repeats: C
screen 5, repeats: D
screen 6: end of survey

Real-world complication: A has ~17 groups, some repeating. In the simplest scenario, A is done after the first screen. B is two screens. C is again ~10 groups, some repeating.

Navigating these screens in the current ODK Collect paradigm of sequential screens, "add another X group?" with tens or hundreds of subgroups already filled just to add another subgroup is probably less user-friendly than separate forms.

If the above use cases and complications have a solution, one nested form like you describe would be ideal. Thanks again for sharing, it's useful to learn about what works for others!

seewhy · September 24, 2019, 9:57am

OK, @Florian_May you got me on the complexity

Here's a schematic of my situation which I've given to enumerators, which is

a) much less involved
b) not as well illustrated

A-D are the main groups, but I haven't tried using nested repeats within this as so far it's mutually exclusive - one set of data for a given location. We're looking at peat condition so fewer parameters (no more than 6 for each location).

However, I have found in the field that things are pretty robust - typically collecting 80 - 100 sets of data within the single form instance. There have been cases where Collect has crashed (sometimes on the Geopoint widget using Mapbox with some 'null value' message) or other times where we're using another app in parallel (e.g. clinometer or gps) and Collect restarts mysteriously. So far, I've not lost any data - on restart, selecting 'fill blank form' comes back to the saved version of the form in 'jump screen' and we just need to navigate to the right 'level'. I've also found it easy to save and exit the form during a break (e.g. caffeine injection) and 'edit saved form' to restart data collection where I left off. I've also been able to use Mapbox layers to load my own geo-data to help navigate to specific locations for my geopoints, which helps 'package' things into a single app rather than needing to switch constantly between apps.

The improvements to the navigation / jump screen (which I think you contributed to!) have made it much more viable to combine forms - it's pretty close to 'intuitive' now, just need to get that 'Add Group' message changed! Drilling into nested groups needs some work (I use 'dynamic' group names based on answers within the group, so that helps), but I can see that new enumerators could easily get lost in your forms! Maybe some kind of colour coding for different nested levels might make it easier?

One point, I've moved away from making questions 'required' as it can be frustrating if you miss one and then have to navigate back to some nested question (because the jump screen lets you bypass required questions, so is easily abused!), but if you just had top level 'required', such as the enumerator name, that might still work.

Looking at what you're collecting, I reckon your volunteers need way more than a 'well done for surveying for XX hours' message. You've obviously got some pretty dedicated folk helping you.

Anyway, keep up the good work...

Florian_May · September 25, 2019, 4:25am

Thanks for the details on your solution for peat surveys!
(The best solution I can offer for peat is Talisker.)
Great to hear about dynamic group names! Did you make them in ODK Build?

Indeed our forms are more complex, so nesting might create a navigation challenge.

E.g. the main form (with 1 to 15-odd screens) has the following workflow: https://wastd.readthedocs.io/data_collection/data_collection_training.html#track-count-work-flow
It's of course about turtles: tracks (with/without/can't tell/maybe but didn't check for nests) of nesting female turtles observed in the early morning following a nesting night.

The only bad data we had was when we allowed unfinalised forms to be saved - those were forgotten in the outbox. Since then, we have disabled anything but "save as finalised and send when wifi" and "discard".

Re mandatory fields: I only made the minimum viable dataset fields required - timestamp (via metadata form started), location (GPS button), subject (e.g. species), observer (via username). The rest is optional with safe defaults, e.g. "Disturbance present?" has options "confirmed present", "confirmed absent", and (default) "Didn't check" (NA under the bonnet).
I've also aggressively optimised the forms to have the minimum viable dataset on the first screen, with options to "unlock" subsequent screens as necessary.

Our volunteers are indeed amazing and invest a lot of personal effort in turtle census. At the same time it's a bucket list thing to do for visiting volunteers, and a special connection to their own location for locals. We also have Traditional Owners in affiliated ranger programs caring for country.

Cheers!

seewhy · September 25, 2019, 9:53am

'Fraid not... I'm 'tied' to XLSForm. The main reason being that I do a lot of recycling / reuse of forms and I find it the easiest way to adapt (I tend to colour code blocks of questions / groups in the spreadsheet to help visualise). So I've got a 'master' matrix with the different columns for parameters that allows me to develop new forms quickly - I work on a range of projects so tend to adapt as I go to the needs of the client / survey.

It's one of the strengths of ODK using XLSForm that means I can be flexible, rather than tell someone they have to use a clunky form that was designed for something else! So far I'm on 17 different forms for different data collection needs... mostly recreation or ecology based. Plus the various iterations (and mistakes) that got me there.

I played with Build for a while and although it's intuitive, I found it harder to duplicate 'core data' so had to start again each time. Call me impatient, or efficient?

I guess once you're in Build, you're locked in, and vice-versa with XLSForm? I didn't find a way of translating between the two, although that might have changed.

I never thought of someone 'aggressively optimising' a form! I guess that comes from having long term investment in a data collection protocol rather than being a data 'butterfly'. Helpful to see how others are making use of ODK.

Slàinte! (also in a Talisker context)

Florian_May · September 25, 2019, 12:53pm

Re ODK Build core form parts: we've iterated towards a good common set of fields and defaults which work for our data warehouse (the code to ingest from ODK Aggregate is rather grisly - to be replaced by ruODK and ODK Central). Our main form for turtles is at version 56 now, going into its fourth turtle nesting season.
A new form starts from an existing one, "saved as" with a new name, then I delete the specific bits and keep the core form. ODK Build also allows bulk-pasting and saving the select1 options (great feature), so I can easily share e.g. a species shortlist across forms. (I alone owe @issa around 2 kB (kilo-Beers) for general ODK Build awesomeness and real-time support.)

ODK Build is pretty low barrier for newcomers as it's simply impossible to create an invalid XForm.

Talisker time now over here. Prost/cheers!

issa · September 25, 2019, 5:32pm

"core data" parts in Build: this is kind of a workaround but it's one that should work for most people:

you can open Build in two different windows and drag questions between them. and you can hold ctrl (option if you're on a mac) while dragging to clone.

(edit: oh and you can click on one question and shift-click on another to select all questions in that range for dragging.)

skål..

Florian_May · September 26, 2019, 12:12am

@issa nice one, sorry forgot to mention! (beer_count++;)
Updated https://github.com/dbca-wa/urODK