Comparing space-separated lists

Say I want to compare two select-multiple (select) nodes in an XPath expression. I fear if I do a straight equality check (/data/node1 = /data/node2), I might get a false negative if the values are not in a consistent order.

For example, one may be option1 option2 and the other may be option2 option1. Direct equality would return false when in fact they are functionally equal.
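To make the concern concrete, here is a minimal sketch in Python (not XPath, so it cannot be used inside a form). It only illustrates the difference between plain string equality and an order-insensitive comparison of the stored values. The helper name same_selection is made up for this illustration.

```python
def same_selection(a: str, b: str) -> bool:
    """Compare two space-separated select-multiple values, ignoring order."""
    return set(a.split()) == set(b.split())


first = "option1 option2"
second = "option2 option1"

print(first == second)                # plain string equality on the stored values
print(same_selection(first, second))  # order-insensitive comparison
```

The first comparison fails because the strings differ, while the second treats both values as the same set of selected options.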

Does anyone know if these space-separated lists are guaranteed to be in a consistent order, say matching the order of the items in the itemset? So if both questions use an identical itemset then I'd be golden? Or might the order depend on, say, the order in which the user selected the options, which is effectively random?

Good question. As far as I can tell, the spec (ODK refers to XForms for this) does not explicitly specify this so it would be open to the implementer.

Enketo always stores the values in the order of the presented options.

I did a quick scan of the XForms spec and didn't see anything dictating the order in which selected values are stored in the space-separated string. Perhaps @ln @martijnr can confirm...

I suspect implementations may well store them in a particular order, irrespective of which order they are picked. But a potential problem with relying on this is if you, say, use a pulldata to pull in a result submitted from a different client, the order may differ.

Agreed - GREAT question!

Another quirk (fly in ointment?) is you can potentially randomize the presented order of options. So what would be the expected behavior: original order, presented order, vs selected order...

Agreed there is no guaranteed order. In Collect, it actually depends on the specific multiple selection appearance because the implementations are different for performance reasons.

@tomsmyth can you perhaps describe your need in a little more detail and we can see what options there might be? That is, what real-world scenario does this come from? How many options are we talking about?

In our form builder you can create constraints or relevances by saying e.g. questionX = questionY. If both are select multiple questions, it would intuitively make sense for this to test if the same options are selected.

However, the likelihood that someone will actually need to check that two select multiple questions are equal is probably pretty low, so I can just disallow that combination in the form builder I suppose.

I was just hoping there was an easy way to build an XPath expression to test this. If anyone has any ideas I'd be grateful. The closest I came would be to basically test selected(questionX, option) = selected(questionY, option) for each possible option, but that could get huge if the set of options is huge.
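Since the expression would be generated by a form builder anyway, the per-option test described above could be produced mechanically. The following is a rough sketch in Python, under the assumption that the builder knows both node paths and the shared list of option values. The function name build_equality_xpath and the paths /data/questionX and /data/questionY are hypothetical, and option values are assumed to contain no quotes.

```python
from typing import Sequence


def build_equality_xpath(path_x: str, path_y: str, option_values: Sequence[str]) -> str:
    """Build one selected() = selected() clause per option, joined with 'and'."""
    clauses = [
        f"selected({path_x}, '{value}') = selected({path_y}, '{value}')"
        for value in option_values
    ]
    # With no options at all, the two questions are trivially equal.
    return " and ".join(clauses) if clauses else "true()"


# Hypothetical node paths and option values, for illustration only.
print(build_equality_xpath("/data/questionX", "/data/questionY", ["option1", "option2"]))
```

The generated expression grows linearly with the number of options, which is the size concern raised above.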

If you have a relatively small number of options, you could conceivably accomplish something if you made each option value a binary length, i.e. 1, 2, 4, 8, 16... characters long, and then checked that count-selected() and string-length() are the same for both questions. (With equal counts, the total lengths would only match if the same options are selected in both.) Bit of a hack, and obviously won’t scale well to lots of options, and requires custom option values that are otherwise pretty meaningless...
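To check whether this trick holds up, here is a small brute-force sketch in Python. It assumes four made-up option values of lengths 1, 2, 4 and 8, and mimics count-selected() and string-length() with len() on the selection and on the space-separated string. It then looks for two different selections that would produce the same pair of numbers.

```python
from itertools import combinations


def make_values(n):
    """Hypothetical option values whose lengths are 1, 2, 4, ... characters."""
    return [chr(ord("a") + i) * (2 ** i) for i in range(n)]


def signature(selected):
    """Mimic (count-selected(), string-length()) for a stored selection."""
    stored = " ".join(selected)  # space-separated; order does not affect length
    return (len(selected), len(stored))


def find_collisions(values):
    """Return pairs of different selections that share the same signature."""
    seen = {}
    collisions = []
    for size in range(len(values) + 1):
        for subset in combinations(values, size):
            sig = signature(subset)
            if sig in seen:
                collisions.append((seen[sig], subset))
            else:
                seen[sig] = subset
    return collisions


values = make_values(4)         # 'a', 'bb', 'cccc', 'dddddddd'
print(find_collisions(values))  # an empty list means every selection is distinguishable
```

Note that string-length() alone would not be enough: the stored values "a bb" and "cccc" are both 4 characters long because of the separating space, which is why the count has to be compared as well.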

Otherwise I can’t think of a (supported) XPath expression that’d compare two strings containing arbitrarily ordered words that doesn’t involve somehow sorting them (which we don’t have an XPath function for at present). Other than your long selected()=selected() for every option...

That's what I would recommend. I can't think of a context in which that would be useful. Same with, say, geopoint, geotrace and geoshape.

Hi @LN,
Any news on this?

Here is another use case. We need to strip a string (concat of node/repeat), e.g. with substring-before() or selected-at(), to exclude a specific (last) element.

Sorry, I don't much like that a data result depends on the appearance of a UI element (especially as we do not have a sort() function yet). I would prefer the "Enketo" solution, based on the original choice-list order (also for randomize).

How does this relate to the order of the choices in the string value and of the disaggregated columns in the data export? If the appearance is changed, will the order of the disaggregated columns in the data export also change for this form?

Also the randomize question from @Xiphware seems open, please.