Link each change to form data to an individual

1. What is the general goal of the feature?
As the designer of a study, I must be able to ensure that each change made to a form’s data can be traced to the person who made it. This must be the case even if the changes are made entirely offline before form submission.

This is a companion feature to "Collect: keep history of changes to values in the form".

There is also a related requirement to track comments about the changes made. Ideally the specification for tracking username can be general enough to also meet that requirement but if not, that feature will be tracked separately.

2. What are some example use cases for this feature?
This kind of traceability is required by various study protocols. For example, @dr_michaelmarks pointed to the ICH Good Clinical Practice guideline, an international standard followed by national research agencies around the world (e.g. the FDA in the US or the NIHR in the UK). It states (CRF means case report form in this context):

4.9.3 Any change or correction to a CRF should be dated, initialed, and explained (if necessary) and should not obscure the original entry (i.e., an audit trail should be maintained); this applies to both written and electronic changes or corrections (see 5.18.4 (n)). Sponsors should provide guidance to investigators and/or the investigators' designated representatives on making such corrections. Sponsors should have written procedures to assure that changes or corrections in CRFs made by sponsor's designated representatives are documented, are necessary, and are endorsed by the investigator. The investigator should retain records of the changes and corrections.

5.18.4 (n) gives the following responsibility to trial monitors:

Informing the investigator of any CRF entry error, omission, or illegibility. The monitor should ensure that appropriate corrections, additions, or deletions are made, dated, explained (if necessary), and initialled by the investigator or by a member of the investigator's trial staff who is authorized to initial CRF changes for the investigator. This authorization should be documented.

3. What can you contribute to making this feature a reality?
Spec work, project management, software development

I have talked to @dr_michaelmarks and @chrissyhroberts about their requirements (thank you). They need the username logged in the audit file along with the value change itself. A common workflow for them will be that a device gets handed to a supervisor or monitor who will request changes. They are flexible when it comes to the user interface. We discussed a couple of different ideas:

  • a required user identifier field in the form that is always blank on form open and must be filled
  • a dialog that confirms user identity on form open

For others who would use this functionality, do you have additional requirements, constraints or ideas? @CharlieKeyes @Dalerhoda

I'll follow up soon with some spec ideas to support this within a given client (e.g. Collect, Enketo). In the meantime, I wanted to mention a skunkworks project, referred to as Share, led by @dexter21, @Shobhit_Agarwal and @mickys0918 at https://github.com/opendatakit/skunkworks-crow. It supports a workflow where data collectors send their instances to a supervisor via a local Wi-Fi hotspot or, soon, Bluetooth. The supervisor then reviews and possibly submits those form instances. I think there should be an answer to how this happens within an ODK client, but I wanted to highlight this project since it's related.


This sounds super exciting, especially being able to see the original data as well as the changed data.

Just a few questions:
a) Is there an XLSForm specification for this? I was wondering whether every field would be subject to change tracking, or whether one could select specific fields. There are pros and cons to both. If every field is monitored for changes, I envisage a very long CSV file, particularly for a long form, which may make data submission challenging. If the form designer chooses specific fields to track in the XLSForm, then one runs the risk of changes not being captured, which would undermine compliance.

b) Presumably a date time stamp of the change will be captured as well for each variable that is changed. There could be multiple changes at different times.

c) The user ID would be very important. But I wonder whether a user would be able to see the change they made as well as the original value. It is unlikely a user would be able to view the CSV file and see the changes that were made. Unless the data could be passed to another form with a calculate and note field.

d) When a user makes a change, there needs to be a space to record an explanation of the change (as per 4.9.3 above).

Hope this is useful,
Sham

@LN

This is a great idea. To help in thinking through the feature, how would the system ensure a data collector doesn't use someone else's credentials? Could a PIN or password be built into the feature so that the suggested dialog actually confirms identity?

Paul

Thanks much for the feedback, @Lal_S and @paul_macharia!

Taking a step back, this form demonstrates what I think is the best that can be done now that Collect 1.22 has been released. The first screen of the form prompts for an identifier for the last person to edit the form and comments about the edit. There is a red notice that will be visible both on the first screen and in the hierarchy view. There could be different protocols developed around that concept. For example, editors could be asked to replace the comments in the text box or to append theirs after any existing ones. For identifying users, an external app could be used that is connected to a fingerprint scanner (e.g. https://www.simprints.com/).

While this is not perfect because there is no mechanism to require that editors add their identifier and comments when they make edits, I would argue that it's at least as good as paper. That is, when a paper form is used, there is some reliance on a protocol being established and followed; the paper doesn't enforce it either. With paper, someone malicious could conceivably make an undetectable edit, destroy a filled-in survey or introduce a fake one. This is much harder with an ODK survey, especially with audits. With the track changes audit, it is now possible to always detect when a change occurred, even if someone fails to put in their identifier and comment.
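To make the last point concrete, here is a minimal sketch of a check that could be run over an audit log after submission. It assumes the log is a CSV with `event`, `node`, `old-value` and `new-value` columns, and that the form has an editor identifier field at `/data/editor_id`; that node name, the sample log and both function names are made up for illustration. It also looks at the whole log at once rather than splitting it into editing sessions.

```python
import csv
import io


def changed_nodes(audit_csv):
    """Return the nodes whose value changed, in log order."""
    reader = csv.DictReader(io.StringIO(audit_csv))
    return [
        row["node"]
        for row in reader
        if row["event"] == "question"
        and (row.get("old-value") or "") != (row.get("new-value") or "")
    ]


def has_unattributed_edit(audit_csv, editor_node="/data/editor_id"):
    """True if some answer changed but the (hypothetical) editor field never did."""
    nodes = changed_nodes(audit_csv)
    answers_changed = any(node != editor_node for node in nodes)
    return answers_changed and editor_node not in nodes


# Made-up log: the age was edited but nobody filled in the editor field.
sample = """event,node,start,end,old-value,new-value
form start,,1560000000000,,,
question,/data/age,1560000001000,1560000005000,34,43
form exit,,1560000006000,,,
"""

print(has_unattributed_edit(sample))  # prints True
```

A protocol could then flag such submissions for follow-up instead of relying on every editor remembering to identify themselves.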

I think we want to work towards being much better than paper in this regard but I wanted to point out what is possible today.

Tracking changes in the audit log was just released in Collect 1.22 on Friday. The documentation describes how to enable the audit log in XLSForm. Tracking changes will involve adding track-changes=true to the parameters column for the audit row in XLSForm but that has not yet been released.

In the meantime, you can add a bind::odk:track-changes column and put true in the audit row (see my sample form described above). This will continue to work but is not the preferred approach because of the strangeness of bind:: and because it adds another column.

The good news is that this is all text so it's not very large. Assuming log lines of about 200 characters (this will depend on field names, non-ASCII characters, nesting level, etc.), 3000 user actions would take up about 600 KB (0.6 MB). Unfortunately, I don't think there's a way around this to provide full compliance.
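The estimate above is simple arithmetic, sketched here so the assumptions can be changed. The function name and the one-byte-per-character default are illustrative assumptions; non-ASCII field names or values would take more bytes per character.

```python
def estimated_log_size_bytes(actions, avg_line_chars=200, bytes_per_char=1):
    """Rough audit log size, assuming one log line per user action."""
    return actions * avg_line_chars * bytes_per_char


size = estimated_log_size_bytes(3000)
print(f"{size} bytes = {size / 1000:.0f} KB = {size / 1_000_000:.1f} MB")
# prints: 600000 bytes = 600 KB = 0.6 MB
```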

Can you say more about the context in which this would be useful?

As much as possible, I think we'd want this to be up to the form designer so that many different workflows can be supported. As I mentioned above, this might be done through external apps. For example, a form could call out to a small app that prompts a user for a PIN, verifies it, and then sends an identifier back to Collect once the PIN has been confirmed. This would be similar to what a fingerprint app would do.
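As a rough illustration of the PIN idea, the verification step of such a helper app might look like the sketch below. Everything here is hypothetical: the record layout, the function names and the user identifier are invented for the example, and a real external app would also need to handle the hand-off to and from Collect, which is not shown.

```python
import hashlib
import hmac
import os


def hash_pin(pin, salt):
    """Derive a salted hash so PINs are never stored in plain text."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode("utf-8"), salt, 100_000)


def make_record(user_id, pin):
    """Create the stored record for one user (hypothetical layout)."""
    salt = os.urandom(16)
    return {"user_id": user_id, "salt": salt, "pin_hash": hash_pin(pin, salt)}


def verify_pin(records, user_id, pin):
    """Return the identifier to send back to the form, or None if the PIN is wrong."""
    record = records.get(user_id)
    if record is None:
        return None
    candidate = hash_pin(pin, record["salt"])
    if hmac.compare_digest(candidate, record["pin_hash"]):
        return user_id
    return None


records = {"sup01": make_record("sup01", "4821")}
print(verify_pin(records, "sup01", "4821"))  # prints sup01
print(verify_pin(records, "sup01", "0000"))  # prints None
```

Only the verified identifier would go back to the form, so the PIN itself never needs to appear in the submission or the audit log.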

I've started the discussion on how to actually build this at "Approaches for linking form instance changes to individuals". You'll see that it's still pretty high-level because there are lots of possible mechanisms and no totally obvious one that I can identify yet.

As much as practical, I think it would be helpful to keep the discussion about the requirements and user experience here and the pros and cons of the different approaches from a development perspective in the other thread.

Thanks @LN,

This is good detail. I am in favor of building the user authentication/verification into ODK itself. ODK Collect could have the signature feature appear when a change is made; the data collector would then sign off on their changes, and this could be verified later.

Paul


@Lal_S I don't think file size should be a huge concern. The log is text and very compressible in transit (Central does this by default and Aggregate on Tomcat can be configured to do it).
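A quick way to get a feel for how compressible this kind of log is: the sketch below gzips a synthetic log and reports the ratio. The log lines are made up (node names, timestamps and values are invented), so the exact ratio for a real audit file will differ.

```python
import gzip


def compression_ratio(text):
    """Compressed size divided by original size (smaller is better)."""
    raw = text.encode("utf-8")
    return len(gzip.compress(raw)) / len(raw)


# Synthetic log: 3000 made-up value changes in a repeating group.
lines = ["event,node,start,end,old-value,new-value"]
for i in range(3000):
    start = 1560000000000 + i * 1500
    lines.append(
        f"question,/data/household/member[{i % 8 + 1}]/age,"
        f"{start},{start + 900},{20 + i % 50},{21 + i % 50}"
    )
log = "\n".join(lines)

ratio = compression_ratio(log)
print(f"original: {len(log.encode('utf-8'))} bytes, gzip ratio: {ratio:.2f}")
```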

@paul_macharia Signature would likely be a bad default. It's a picture and so adds a large incompressible file to the submission. I think we should give people the option of what type of widget they want to use.


@Yaw,

From a research perspective, the cost of the larger file is worth it.

I have seen research study outcomes rejected for failing to reach the threshold of credibility and authenticity.

A number of field enumerators are also tempted to falsify or manipulate data to meet job demands.

Asante,
Paul Macharia
