Encrypt and re-upload old data to Aggregate to comply with GDPR

Hi everyone!

I am working on several research projects using ODK, with our servers being hosted in Europe. With the new GDPR, our provider asked us to encrypt all data available on their servers.
So far, everything was sent to Aggregate unencrypted.
They advise us to download all data, delete it from their servers and from now on only submit questionnaires encrypted.

I wondered if there was a way to encrypt existing data and reupload it to Aggregate, to keep it available in the same place as before.

Thank you!

Here's what I would do. Please read all the steps first and make sure they make sense before starting! You may also want to wait a day or two to see whether anyone else has alternate or additional suggestions. You will need a mobile device or emulator with Collect on it but no blank or filled forms (you can clear forms from Collect in Admin Settings > Reset application).

You should then have encrypted blobs in Aggregate. You can verify this by pulling with Briefcase one more time, setting your private key and exporting the data. Again, that export is unencrypted so treat it with care!

Thank you for your reply! All right, I'll wait a bit, and maybe try this first on a test Aggregate instance, but from the look of it, I should be able to get it working.
Thanks!

And if you don't want to try on a test instance (although that is a good idea), you can try with a single test form and a couple of submissions. Then instead of purging everything from Aggregate, you just purge that single form.

Thank you, yes indeed, start small to try it out and then scale up!

I have tried following LN's steps and I am stuck at the point where I want to send the finalized forms to the server.
What I have done is the following:

  • I have downloaded all the data from my Aggregate server with Briefcase.
  • I have modified the xml file that was downloaded with Briefcase to add the line with the public key.
  • I have pushed that file with adb to /sdcard/odk/forms, and all the "uuid..." folders to /sdcard/odk/instances.
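
For anyone repeating these steps, the adb push in the last bullet could be scripted roughly like this. It is only a sketch: the helper names `build_push_commands` and `push_all` are hypothetical, the local paths are whatever Briefcase produced on your machine, and it assumes `adb` is on the PATH and that the device uses the /sdcard/odk/forms and /sdcard/odk/instances directories.

```python
import subprocess
from pathlib import Path

# Device directories (assumed to be the standard Collect locations).
FORMS_DIR = "/sdcard/odk/forms"
INSTANCES_DIR = "/sdcard/odk/instances"


def build_push_commands(form_xml, instances_root):
    """Return one adb command for the form and one per local "uuid..." folder."""
    commands = [["adb", "push", str(form_xml), FORMS_DIR + "/"]]
    for folder in sorted(Path(instances_root).iterdir()):
        if folder.is_dir() and folder.name.startswith("uuid"):
            # Name the destination explicitly so the folder name is kept.
            destination = INSTANCES_DIR + "/" + folder.name
            commands.append(["adb", "push", str(folder), destination])
    return commands


def push_all(form_xml, instances_root):
    """Run the commands; raises CalledProcessError if any adb call fails."""
    for argv in build_push_commands(form_xml, instances_root):
        subprocess.run(argv, check=True)
```

Printing the result of `build_push_commands(...)` before calling `push_all(...)` is a cheap way to confirm that only the intended folders will be copied.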

My issue is that after I have done all this, while the form appears in ODK Collect, the finalized forms don't.
Do you know what I am missing?

Edit:
I have also tried to upload using Briefcase but it doesn't work either.
I downloaded the data for one form with Briefcase, modified the form XML to add a public key, and then tried to re-upload it to my test server.
Briefcase gives me this error:

UPLOAD FAILED: Submission upload failed. Detailed error: Bad Request (400) while accessing: https://[TEST-SERVER]/submission
Please verify that the submission (400 of 400) that is being uploaded is well-formed.

When I go to my AppEngine logs I get this:

org.opendatakit.aggregate.servlet.SubmissionServlet doPost: Parsing failure - Xml document element tag: a_checklist_muac_measurement_updated1 does not match the xform data model tag name: data

I guess this has to do with the fact that I am sending back submissions from a form that was slightly different.
When I try to use the Briefcase CLI, I get an Authentication error, even though I double checked the url, username and password.

Does that mean that when you tap on Send Finalized Form from the landing page you see no forms at all? What if you tap on Edit Saved Form? Note that the form counts won't update until you actually tap on one of those and trigger a reload from the sdcard.

What you've tried with uploading from Briefcase is a good idea and should work as well. How did you add the submission block? It looks like more than just the submission block changed in your form XML.
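
One way to rule out a real mismatch like the one in the log message is to compare the root element name of a submission with the root element of the form's primary instance. The sketch below is an assumption-laden helper, not something from Aggregate or Briefcase: the function names are made up, it compares names with namespaces stripped, and it treats the first `instance` element that has a child as the primary instance.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a "{namespace}" prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def form_instance_root(form_xml):
    """Name of the first child of the first non-empty <instance> in the form."""
    root = ET.fromstring(form_xml)
    for elem in root.iter():
        if local_name(elem.tag) == "instance" and len(elem) > 0:
            return local_name(elem[0].tag)
    raise ValueError("no primary instance found in form")


def submission_root(submission_xml):
    """Name of the submission's document element."""
    return local_name(ET.fromstring(submission_xml).tag)


def tags_match(form_xml, submission_xml):
    """True if the submission's root tag equals the form's data model tag."""
    return form_instance_root(form_xml) == submission_root(submission_xml)
```

If `tags_match` returns False for the edited form and one of the downloaded submission.xml files, the form on the server really does have a different data model tag than the submissions.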

That sounds like there could be a bug. Which version of Briefcase are you using?

Does that mean that when you tap on Send Finalized Form from the landing page you see no forms at all? What if you tap on Edit Saved Form?

Yes, that's what I have done. I have tried re-uploading several times, resetting the app, but still, nothing appears in either category.

I have added the submission block just after the model tag opens:

<h:title>[FORM-TITLE]</h:title>
<model>
<submission action="[SERVER]/submission" base64RsaPublicKey="mQENBF[...]iDWZ" method="form-data-post"/>
<itext>

That's all I have added. When I try to re-upload old submissions to my test server with the original form, everything works. When I add the submission block, I get the error.
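
To make sure nothing else in the form changes, the same edit can be done as a plain text insertion instead of by hand or through an XML library that might rewrite namespaces. This is a minimal sketch: `add_submission_element` is a hypothetical helper, it assumes the form uses an unprefixed `<model>` tag as in the snippet above, and the URL and key are whatever you pass in.

```python
import re
from xml.sax.saxutils import quoteattr


def add_submission_element(form_xml, action_url, public_key):
    """Insert a <submission> element right after the opening <model> tag.

    The rest of the form text is left byte-for-byte unchanged.
    """
    if re.search(r"<submission\b", form_xml):
        raise ValueError("form already contains a <submission> element")
    element = '<submission action=%s base64RsaPublicKey=%s method="form-data-post"/>' % (
        quoteattr(action_url),   # quoteattr adds the quotes and escapes & < >
        quoteattr(public_key),
    )
    new_xml, count = re.subn(
        r"(<model\b[^>]*>)",
        lambda m: m.group(1) + "\n" + element,
        form_xml,
        count=1,
    )
    if count != 1:
        raise ValueError("no <model> opening tag found")
    return new_xml
```

Diffing the output against the original form should then show exactly one added line.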

I am using Briefcase version 1.10.1. I have bypassed the authentication error by temporarily enabling anonymous submission and form management on my server.

I'm sorry, I now realize Briefcase upload won't work because Briefcase doesn't encrypt the submissions. The error message on Aggregate isn't very helpful but I think that's what it is caused by.

Ok, so Collect feels like a better bet. What version of Collect are you running? Reading instances from the sdcard was added less than a year ago so if you're running an older version that might explain why you're not seeing them.

There's another potential issue I thought of -- I'm not sure whether the task that reads submissions from the sdcard would actually go through the proper finalization process that encrypts submissions or whether it just marks the submission as finalized in the db. If you do get to see your submissions but they're not sending from Collect, let me know and I can dig deeper.

I am running ODK Collect v.1.15.0, which appears to be the latest.

I have dug a bit into the /sdcard/odk/instances folder and it appears that when I push all the uuid... folders there, the file instances.db remains unchanged. Can that be the cause of the issue? Is there a way to get it updated with what was pushed with adb?

Yes, you are right, @benjaminFaguer -- the database is not being updated. When I tried the Briefcase to Collect step, I think I copied files that had been pulled from Collect and not from Aggregate. What's going on is that the task that reads submissions from the sdcard looks for a particular XML filename. When pulled from Aggregate, all submissions have the filename submission.xml. I've also confirmed that when submissions are read this way they aren't actually being encrypted (they're not finalized, the db column for finalized is just being marked as true).
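
To see which local instance folders are affected before pushing them to a device, you could list the ones that contain a submission.xml file. This is only a sketch: `folders_with_submission_xml` is a hypothetical helper, and it checks for the filename mentioned above without assuming anything about the filename Collect itself looks for.

```python
from pathlib import Path


def folders_with_submission_xml(instances_root):
    """Return the names of instance folders that contain a submission.xml."""
    root = Path(instances_root)
    return sorted(p.parent.name for p in root.glob("*/submission.xml"))
```

Any folder this returns was pulled from Aggregate and, per the explanation above, would not be picked up by the sync task in this version of Collect.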

I believe this use case is not currently supported and will require some code changes. I see two options:

  • Allow Collect to read submission.xml files and actually re-finalize submissions rather than just setting the db column (change this section in InstanceSyncTask to either add support for submission.xml explicitly or read any XML file present; instead of this line, actually finalize)
  • Add encryption capabilities to Briefcase so that the "push" operation can include an optional encryption step.

@Grzesiek2010, @ggalmazor, @yanokwa -- any thoughts on whether either of these seems reasonable?

We should probably do both, but this feels like it should belong in Briefcase because it's more convenient.

This is a great idea, but I don't know how well it would fit with the Briefcase push operation. We would be making push more complex for everyone in order to support a feature that only some users need, and apparently only for a one-off situation.

Instead of that, I'd like to propose a new CLI command that would take a form and all its submissions and create an encrypted copy. Then, the original plain form and the copied encrypted form would be listed on Briefcase and the users could push it as any other form.

Just to close the loop on this thread, ODK Collect v1.16 lets Collect read submission.xml files and actually re-finalize (and thus re-encrypt) submissions.

Wow, thank you for adding this feature!
In the meantime we have downloaded everything and stored it on a shared cloud storage.
I will try to re-upload the data to the servers thanks to your effort.
