Should the root node name produced by XLSForm be changed to data?

There's nothing in the XForms spec regarding the specific naming of the child node, other than there can only be one:

...The detached copy must consist of content that would be well-formed XML if it existed in a separate document. Note that this restricts the element content of instance to a single child element.

And according to the ODK Spec, the unique form identifier (aka filename) and version are specified as attributes of this node:

<?xml version="1.0"?>
<h:html xmlns="http://www.w3.org/2002/xforms" 
        xmlns:h="http://www.w3.org/1999/xhtml" 
        xmlns:jr="http://openrosa.org/javarosa" 
        xmlns:orx="http://openrosa.org/xforms" 
        xmlns:odk="http://www.opendatakit.org/xforms"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <h:head>
        <h:title>My Survey</h:title>
        <model>
            <instance>
                <data id="mysurvey" orx:version="2014083101">
                    <firstname></firstname>
                    ...

[I read through 152 and the associated issues.] So making the child node name some conglomeration of a filename, timestamp, what-have-you (hopefully a valid XML name, which in some documented cases it wasn't...) is IMHO pretty naughty and asking for trouble... The desired unique id and version are right there in the attributes!

Indeed, because there is only ever going to be one child node to begin with, its name is largely redundant when trying to uniquely identify elements within the scope of their instance XML submission. If the purpose of this exercise is to minimize the prefix needed to uniquely identify exported properties in, say, a flat file (e.g. CSV), then the only things you need to uniquely identify a submission property are its node name and the paths of any groups/subgroups it falls under. Further, if you also want to label each property uniquely across all forms and versions, then the information needed to accomplish this is in the child node's attributes (if it exists anywhere at all).
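To make that concrete, here is a rough Python sketch of how an export could name its columns from group paths alone, never looking at the root node's name. The sample submission, the `flatten` helper and the `id:path` label format are all made up for illustration (and the sample uses a plain `version` attribute to keep namespaces out of the way); this is not how any particular tool does it.

```python
import xml.etree.ElementTree as ET

# Hypothetical submission; the element names and values are invented.
SUBMISSION = """\
<data id="mysurvey" version="2014083101">
    <firstname>Ada</firstname>
    <household>
        <size>4</size>
        <address>
            <city>Nairobi</city>
        </address>
    </household>
</data>
"""


def flatten(element, prefix=""):
    """Yield (column_name, text) pairs for every leaf node below `element`.

    Column names are built from the group path only; the tag of `element`
    itself (the root node) is never used.
    """
    for child in element:
        path = f"{prefix}/{child.tag}" if prefix else child.tag
        if len(child):  # has child elements, so treat it as a group
            yield from flatten(child, path)
        else:
            yield path, (child.text or "").strip()


root = ET.fromstring(SUBMISSION)
columns = dict(flatten(root))

# A label that is unique across forms and versions comes from the root
# node's attributes, not from its name.
global_columns = {
    f"{root.get('id')}:{root.get('version')}:{name}": value
    for name, value in columns.items()
}

for name, value in columns.items():
    print(name, "=", value)
```

Renaming `<data>` to anything else in the sample leaves both dictionaries unchanged, which is the point: the root name carries no information the export needs.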

I believe (strongly!) that attempting to exploit the child node name to somehow globally uniquely identify submitted property values is bad, and should probably be punished to the fullest extent of the W3C Specification Violation Law (which amounts to about 30 lashes with a wet noodle these days...) :slight_smile:

I've moved this discussion into its own topic because I don't think it's connected to the feature request to offer exports without prefixes (correct me if I'm wrong!). I mentioned the proposed root node name change in https://github.com/XLSForm/pyxform/pull/152 as an FYI. I think it's helpful to stay more focused on user-facing discussion in the features category and I apologize for moving us away from that with my comment!

I think that we're in agreement that using the filename as the root node name leads to all kinds of problems with no benefit that I can discern.

Using data consistently makes it easier to do things like share XPath expressions between forms with no loss of expressiveness that I can identify.

All good! :slight_smile:

I think my main point - relative to the original question posed - was that if there is a problem with the child node name being so excessively long that it's blowing out your property name prefixes - well, you probably shouldn't be using it for that purpose in the first place, since there is absolutely no guarantee it's unique anywhere outside of its parent <instance> node! (And it's redundant there anyway, 'cause if there's only ever one it doesn't really matter what it's called...)

That, and the fact that some tools appear to be a bit lazy about getting the form id and version from the correct place (the attributes), and instead assume they've been conveniently dropped into the node name.
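For what it's worth, pulling the id and version from the attributes is only a few lines. A minimal sketch in Python, assuming the version is carried as `orx:version` in the `http://openrosa.org/xforms` namespace as in the form above, with a fallback to a plain `version` attribute; the `form_identity` helper and both sample submissions are hypothetical:

```python
import xml.etree.ElementTree as ET

ORX_NS = "http://openrosa.org/xforms"


def form_identity(submission_xml):
    """Return (form_id, version) read from the root node's attributes.

    The root node's tag is deliberately ignored.
    """
    root = ET.fromstring(submission_xml)
    form_id = root.get("id")
    # ElementTree exposes namespaced attributes as "{namespace}localname".
    version = root.get(f"{{{ORX_NS}}}version") or root.get("version")
    return form_id, version


# Two made-up submissions for the same form: one with a filename-style
# root node name, one with a plain <data> root.
named_after_file = (
    '<mysurvey_2014 xmlns:orx="http://openrosa.org/xforms" '
    'id="mysurvey" orx:version="2014083101">'
    "<firstname>Ada</firstname>"
    "</mysurvey_2014>"
)
named_data = (
    '<data xmlns:orx="http://openrosa.org/xforms" '
    'id="mysurvey" orx:version="2014083101">'
    "<firstname>Ada</firstname>"
    "</data>"
)

print(form_identity(named_after_file))
print(form_identity(named_data))
```

Both calls give the same pair, so a tool written this way would not care what the root node is called.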


I agree. For several years in Smap I have used the root name "main" instead of the file name, which has simplified development in quite a few places, and much to my surprise I have not come across any problems caused by this approach.