Approaches for linking form instance changes to individuals

The control for a 'comment' (called a discrepancy note in OpenClinica) is defined in the XLSForm/XForm as I posted above (so just the for attribute and an appearance). The RFC (reason for change) functionality is built on top of that discrepancy note question and has no associated XForm syntax (it's defined by the view that the backend UI launches for that user), but it shares the data structure of the discrepancy note (stringified JSON for OpenClinica) and is shown in its history (which the user can view within the form). RFC is always required for them.

Not sure how helpful that all is though, because that was designed long before any audit functionality was added to the spec, and I'm not advocating for it. The only advantage of their approach is that you can query that comment data (e.g. it has different statuses and they have a custom comment-status() XPath function that can inspect the status of that JSON data). This means they can use it inside constraint and required expressions in XPath. E.g. a question can be required only if it doesn't have a comment or a constraint can include the clause that a value can exceed a limit if it has an 'updated' or 'new' comment.
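To make the idea concrete, here is a rough Python sketch of what a status lookup over the stringified JSON could look like, and how it might feed into required/constraint logic. The JSON layout (a `queries` list whose last entry holds the current `status`) and the helper names are assumptions for illustration only, not OpenClinica's actual data structure or implementation.

```python
import json


def comment_status(raw_note: str) -> str:
    """Return the status of the latest entry in a stringified-JSON note.

    Assumed (hypothetical) layout: {"queries": [{"status": "new"}, ...]},
    with the most recent entry last. Returns "" when there is no comment.
    """
    if not raw_note or not raw_note.strip():
        return ""
    try:
        data = json.loads(raw_note)
    except json.JSONDecodeError:
        return ""
    entries = data.get("queries", []) if isinstance(data, dict) else []
    if not entries or not isinstance(entries[-1], dict):
        return ""
    return entries[-1].get("status", "")


def is_required(raw_note: str) -> bool:
    # The question is required only if it doesn't have a comment.
    return comment_status(raw_note) == ""


def passes_constraint(value: float, limit: float, raw_note: str) -> bool:
    # The value may exceed the limit if it has a 'new' or 'updated' comment.
    return value <= limit or comment_status(raw_note) in {"new", "updated"}
```

In the real thing this logic lives in XPath expressions on the question rather than in client code; the sketch only shows the shape of the check.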

Though the use by OpenClinica may be useful to see how far clinical trial requirements could go (I think it's way beyond what a generic client such as Enketo or ODK Collect should handle, for sure). Hence none of that specialized stuff has made it into core Enketo. We just made core Enketo very extensible to facilitate such domain-specific customizations.

As I understand it, the workflow that @dr_michaelmarks and @chrissyhroberts have described is entirely offline and so there is no entity to authenticate against. The idea would be that someone fills out data on their device and then either reviews it and makes edits later or hands it to someone else for review and editing. It's the same as the paper case where initials have to be relied on. This would be a bit better because you could at least know exactly whose device submitted the data.

In an ideal world, I think Enketo could be used for online edits to extend that workflow. I'm imagining it could have a way to use server auth for this feature. For example, when a user is logged in to Central and launches a submission for edit, Central would pass on some kind of client token/hash that would automatically be used as the user identifier for this feature. I'd see that as a future extension on this spec -- something like if the session (virtual secondary instance) has a user identifier, use that.

You mean for all questions, right? My evolving sense of option 2 is that it would be something like a single audit attribute (e.g. odk:track-change-reasons) that lets the client know to prompt for user identifier and change reason every single time a field is changed from a blank value to a non-blank value. The audit log would get two new columns (e.g. editor-id and change-reason) that would get populated. Editing or saving of the form would be blocked until editor identification and change reason were populated.
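As a rough illustration of that idea, here is a Python sketch of an audit log with the two extra columns, where a change can't be recorded until both are filled in. The base columns (event, node, start, end), the `question` event name and the function names are assumptions for the sake of the example; which value transitions trigger the prompt is left to the caller.

```python
import csv
import io

# Assumed base columns plus the two new ones proposed above.
AUDIT_COLUMNS = ["event", "node", "start", "end", "editor-id", "change-reason"]


def append_change(rows, node, timestamp_ms, editor_id, change_reason):
    """Add one change row; refuse if the identifier or reason is missing."""
    if not editor_id.strip() or not change_reason.strip():
        raise ValueError("editor-id and change-reason are required before saving")
    rows.append({
        "event": "question",
        "node": node,
        "start": timestamp_ms,
        "end": "",
        "editor-id": editor_id.strip(),
        "change-reason": change_reason.strip(),
    })


def to_csv(rows):
    """Serialize the collected rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=AUDIT_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A client would call something like `append_change` after the user dismisses the prompt, and treat the `ValueError` as "keep the dialog open".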

Basically this.

That would be very convenient and can be client-specific.

Will there be daycare services at the convening?... :wink:

Well, the first time we could use the login information (username); after that, anyone who edits has to enter their user ID before editing. (Of course there is no authentication; it's all offline, as with paper.)
A bit more restriction could be added by validating the editing user against a CSV media file listing the users "authorized" for editing (?).
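A minimal sketch of that check in Python, assuming the media file has a header row with a `user_id` column (the column name, the case-insensitive matching and the function names are all made up for illustration):

```python
import csv
import io


def load_authorized_users(csv_text, column="user_id"):
    """Read the CSV text and return the set of normalized user ids."""
    users = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(column) or "").strip().lower()
        if value:
            users.add(value)
    return users


def may_edit(entered_id, authorized):
    """True if the id typed by the user appears in the authorized list."""
    return entered_id.strip().lower() in authorized
```

This is still not authentication: anyone who knows a listed id can type it, so it only narrows down typos and accidental edits.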

Yes. And I like your approach :slight_smile:

Great progress people!

It's all about enhancing electronic data collection by making it verifiable and giving it an effective audit trail, and thus producing credible/authentic data.

@yanokwa, see how to make a signature feature work. In my view, it could be the most tamper-proof.


Just a minor point, but we don't need an actual "signature" per se - an alphanumeric identifier would be fine (that's what REDCap records, for example). The file size of saving a JPG, or in the case of an audit trail multiple JPGs each time a change is made, would be problematic in many of the places we use ODK.

The second option seems like the best one to me as well, but I'm not sure I understand everything...
We have many comments here but no single agreed approach (most of them build on option 2).
Asking for the user id every time a form is started/edited and writing it to the audit.csv file seems fine, but what about those comments... I saw the sample @LN attached.

It assumes that a user should also add a general comment as soon as the form is opened. Does it make sense to ask for such a general comment before editing anything? I think a user might not know what they are going to change at that point.
If we want to have a general comment for the entire form, it should be displayed when the form is being exited.

Asking for a comment for each edited question (if a question requires it) seems fine, but in that case it should ask only for a comment, not the user id, because it doesn't make sense to duplicate it, right?

I think we can implement it in two separate PRs (two features):
First of all we should add an option to ask for the user id and a general comment.
Then we can add an option to ask for comments for each separate question.

Just to add (although we have incentive enough as is) that we are getting fairly serious interest from major players in other serious epidemic diseases to use ODK if we can get the GCP compliance issues sorted - of which this audit trail thing is the key step.

FYI I think this would be inadequate; I can certainly see any change to a response - eg from "Fail" to "Pass" - as requiring both a timestamp and an optional (or mandatory?) comment. And in the general case potentially a signature too (@dr_michaelmarks?)

I think a fully general-purpose design may well need to fulfill:

  • allow for comment for every change (vs a single comment at beginning/end, or comment only for null-to-nonnull). And we probably want something to flag whether a comment is mandatory or not.
  • timestamp every change
  • allow for 'signature' (or otherwise initial) for every change (vs just a single signature at beginning). And likewise flag if mandatory or not.

Question is, can we scope it back initially? eg only require signature at beginning (which might determine whether this is 1 PR or more...)


@LN and I spent some time iterating on a proposal built around Option 2 and we've published it at

We've put it in a Google Doc so you can easily leave comments. We'd love your comments or objections over the next 5 days. @TSC-1 please also review!

Some thoughts/straw man (google doc comments getting rather long...)

  • when first starting up or reopening a saved form, the user has to pseudo-'login' by providing their name/initials in a popup (if odk:track-user is set)

  • if odk:track-change-reason (per form or per question?) is set then any user change to an existing non-null value - including initially non-null defaults in the instance XML, or going back and changing a response - brings up a mandatory popup to enter a reason (this would have to occur after constraint checking). By implication, this change is tagged against the current 'logged in' user, since otherwise Collect has no notion of internal sessions. Note, this (also) occurs when first filling in the form [why? 'cause I don't see a compelling reason to introduce and track a distinct state to distinguish between before and after the form is saved... but perhaps that's open to debate]

  • best practice is that you must 'save' forms before handing the device to another person, who will then have to 'login' using their id/initials should they reopen the form and edit it. Realistically I don't think there's a lot we can do to enforce a re-login when you hand the device to someone else, so it'll just have to be best practice...

  • any and all changes are timestamped, with the initial answering of a question (ie null to non-null) suitably tagged as such, since these won't have a change reason (as described above)
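
The straw man above could be sketched roughly like this in Python. It's only meant to pin down the rules, not to suggest how Collect would implement them; the class, field and method names are all hypothetical, and the two flags stand in for odk:track-user and odk:track-change-reason.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Change:
    node: str
    old: Optional[str]
    new: Optional[str]
    user: Optional[str]
    timestamp_ms: int
    kind: str  # "initial" (null to non-null) or "edit"
    reason: Optional[str]


@dataclass
class FormSession:
    track_user: bool = True
    track_change_reason: bool = True
    user: Optional[str] = None
    log: List[Change] = field(default_factory=list)

    def login(self, user: str) -> None:
        """Pseudo-'login': just record a name or initials, no authentication."""
        if not user.strip():
            raise ValueError("a name or initials are required")
        self.user = user.strip()

    def reopen(self) -> None:
        """Reopening a saved form forgets the user, forcing a new 'login'."""
        self.user = None

    def change(self, node, old, new, timestamp_ms, reason=None) -> Change:
        """Record a change; call this only after constraint checking passed."""
        if self.track_user and self.user is None:
            raise PermissionError("identify yourself before editing")
        is_initial = old in (None, "")
        if not is_initial and self.track_change_reason and not (reason or "").strip():
            raise ValueError("a reason is required when changing an existing value")
        entry = Change(
            node, old, new, self.user, timestamp_ms,
            "initial" if is_initial else "edit",
            None if is_initial else reason,
        )
        self.log.append(entry)
        return entry
```

Note there is no separate "before save" and "after save" state: a non-null default changed during the first fill-in needs a reason just like a later edit does.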

Thanks so much for all the thoughtful responses, everyone. @yanokwa caught me up on the TSC discussions.

The spec document has been edited to reflect the various feedback received. It's very similar to @Xiphware's strawman above.

You can reach the diff from the last version from File > Version History.

Some things to note:

  • asking for user ID and asking for comments have been fully disaggregated, as @Grzesiek2010, @Xiphware and others recommended. They are also completely independent from change tracking. This enables various scenarios, such as auditing the identity of users navigating the form without including potentially sensitive form data in the log, or using reasons for change as "notes to self" in a one-person data collection context.
  • the ODK XForms spec is deliberately agnostic to what the user identifier is and how it is obtained.

We've gotten feedback on the spec from the research facilitation team at LSHTM and the takeaway is:

The implementation we've specified, paired with appropriate procedures, could be made compliant with good clinical practice (and thus pass internal review boards). LSHTM will put together some procedures about how ODK could be used in a way that achieves all of this, but that shouldn't be a blocker.

All this to say, I think this should settle @TSC-1's concerns. I would request that TSC members (especially @aurdipas, @Xiphware, @martijnr) please review the updated specification at and leave comments. It'd be great to get approval on this in the next two weeks!

Or slightly less... I'll put it on the agenda for next TSC call to discuss and (hopefully) approve! :slight_smile: