Generate identifier of specific length to link forms

In my implementation, I have multiple surveys, call them A, B, and C. In the field I would like to administer all 3 of those surveys to the same person. I do not, however, want to re-enter the identifying personal information on each form. A thorough look at the forums suggests that there is not yet a simple way to link between instances of different forms. One workaround that has been suggested is that I generate my own uuid which the enumerator will then re-enter on forms B and C, after it was generated for form A.

Because the identifier will need to be re-entered by hand, it has to be relatively short. My idea was therefore to generate a 6-character alphanumeric key, which can be made unique downstream to those 3 surveys only by concatenating it with the date of data collection.

However, I am unable to generate a random 6-character alphanumeric key. In my survey sheet, I have:
type= "calculate"
name = "vid"
calculation = "once(uuid(6))"

From the help page I was under the impression that uuid(6) would suffice:
"
uuid([length])
Without argument, returns a random RFC 4122 version 4 compliant UUID.
With an argument it returns a random GUID of specified length.
"

Can anyone explain why entering a value of [length] in this manner fails? This is also obviously a clunky solution to the problem of making a connection between surveys, but I haven't been able to find a better way with ODK. Thanks!

What version of ODK Collect are you using? This seems like a bug (either in the spec or Collect). Paging @LN and @martijnr to confirm.

As to the problem at hand, note that a 6-character ID isn't going to be universally unique (what the two Us in UUID stand for), so you'll likely run into collisions. At about 50,000 records, you'll have a 50% chance of a collision. At 1,000,000 records there'll be about 250 collisions. https://en.wikipedia.org/wiki/Birthday_problem has more on this. I know you are concatenating with the date, but that doesn't add much uniqueness, and I wanted to give you a heads up.
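For anyone who wants to check the collision estimates, here is a rough sketch using the standard birthday-problem approximation. It assumes the 6 characters are drawn uniformly from 36 symbols (digits plus case-insensitive letters); that alphabet size is my assumption, not something the spec states, and the function names are just illustrative. Under that assumption the numbers land in the same ballpark as the figures quoted above.

```python
import math

# Assumption: 36 possible symbols per character (0-9 plus A-Z, case-insensitive).
ALPHABET_SIZE = 36
ID_LENGTH = 6
SPACE = ALPHABET_SIZE ** ID_LENGTH  # number of distinct 6-character IDs


def collision_probability(n, space=SPACE):
    """Approximate probability that at least two of n random IDs are equal."""
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * space))


def expected_colliding_pairs(n, space=SPACE):
    """Expected number of pairs of records that share the same ID."""
    return n * (n - 1) / (2.0 * space)


def records_for_probability(p, space=SPACE):
    """Approximate record count at which the collision probability reaches p."""
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


if __name__ == "__main__":
    print(f"ID space: {SPACE:,}")
    print(f"P(collision) at 50,000 records: {collision_probability(50_000):.2f}")
    print(f"Records for a 50% chance: {records_for_probability(0.5):,.0f}")
    print(f"Expected colliding pairs at 1,000,000: {expected_colliding_pairs(1_000_000):,.0f}")
```

Changing `ALPHABET_SIZE` (for example to 16 for hexadecimal output) shows how quickly the safe record count shrinks with a smaller alphabet.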

I can suggest some alternative approaches, but it'd be good to understand when forms A, B, and C are filled. If it's the same place and time, then you can probably use pre-generated barcodes or PINs. If it's not then maybe CSV preloads built from form A's data might be better.
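If pre-generated PINs turn out to fit, something like the following could produce the list ahead of time. This is only a sketch run outside of ODK: the function name, the reduced alphabet (dropping characters that are easy to confuse when typed by hand), and the default length of 6 are all my assumptions. Because the PINs are generated together and de-duplicated, collisions within one batch cannot happen.

```python
import secrets

# Assumption: omit look-alike characters (0/O, 1/I/L) since IDs are re-typed by hand.
ALPHABET = "23456789ABCDEFGHJKMNPQRSTUVWXYZ"


def generate_pins(count, length=6):
    """Return `count` distinct random PINs of the given length, sorted."""
    if count > len(ALPHABET) ** length:
        raise ValueError("not enough distinct PINs of this length")
    pins = set()
    # Keep drawing until we have enough; the set discards any duplicates.
    while len(pins) < count:
        pins.add("".join(secrets.choice(ALPHABET) for _ in range(length)))
    return sorted(pins)


if __name__ == "__main__":
    # Hypothetical batch: 20 PINs to hand out to an enumerator.
    for pin in generate_pins(20):
        print(pin)
```

Each PIN could then be printed on a label or encoded as a barcode and entered or scanned on forms A, B, and C.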

Can you please describe the behavior you are expecting and the behavior you are getting? Attaching a sample form would be ideal.

Here is a form that generates a 6-character UUID and displays it. It does what I would expect.

If you say more about your scenario, there may be other good options for you. How many respondents do you expect each enumerator to visit? Over what period of time? Is it necessary that the forms are separate?

This may just be a bug in Enketo. We've never implemented support for the argument: https://github.com/enketo/enketo-xpathjs/issues/55. Now that we know somebody uses it, we'll give this missing feature a much higher priority, but we're in the middle of developing a new XPath evaluator, so it may take a few months.