Entity too large error when uploading to XLSForm

Hello all. Long time fan, first time poster. I am also having this problem, on a very small file (1.5MB) but with a ton of lookups. Let me explain:

I have developed a process in R through which a registry in one of our online (Azure SQL) project databases can have all its schema exported to ODK for offline collection. The challenge has always been ensuring we are working with an updated list of villages, districts, regions etc, as well as any local organizations etc. In reality, we need a full offline relational system, but I've managed to flatten out all the references and make them work using select_one options that look up into the choices tab (making sure to export hierarchies from the schema so that if we have the same village in different districts people don't select the wrong one! :slight_smile:).

The output then becomes a 3-sheet Excel workbook (46 questions, about 50,000 select-one choices) which we feed into your form validator, but that's where I'm having the bug... your form validator (http://opendatakit.org/xlsform/) gives me the 413 error referenced above.

EDIT: The offline file works, as does the python autoconverter (even though it does spit out a bunch of errors):

Traceback (most recent call last):
  File "/home/amit/.local/lib/python2.7/site-packages/pyxform/xls2xform.py", line 146, in <module>
    main_cli()
  File "/home/amit/.local/lib/python2.7/site-packages/pyxform/xls2xform.py", line 137, in main_cli
    enketo=args.enketo_validate)
  File "/home/amit/.local/lib/python2.7/site-packages/pyxform/xls2xform.py", line 27, in xls2xform_convert
    warnings=warnings, enketo=enketo)
  File "/home/amit/.local/lib/python2.7/site-packages/pyxform/survey.py", line 720, in print_xform_to_file
    warnings.extend(odk_validate.check_xform(path))
  File "/home/amit/.local/lib/python2.7/site-packages/pyxform/validators/odk_validate/__init__.py", line 56, in check_xform
    if not _java_installed():
  File "/home/amit/.local/lib/python2.7/site-packages/pyxform/validators/odk_validate/__init__.py", line 45, in _java_installed
    stderr = run_popen_with_timeout(["java", "-version"], 100)[3]
  File "/home/amit/.local/lib/python2.7/site-packages/pyxform/validators/util.py", line 53, in run_popen_with_timeout
    startupinfo=startup_info)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Is the error on the webform expected?

Thanks for a great product!

Welcome to the forum, @Amit_Kohli! When you get a chance, please introduce yourself here. I'd also encourage you to add a real picture as your avatar because it helps build community!

It sounds like XLSForm Offline works, but pyxform (what you call python autoconverter) isn't working? What versions of both are you using?

I think XLSForm Online (http://opendatakit.org/xlsform) is failing because the web server doesn't expect files that big. That isn't really a bug with the converter, but more an issue of the server's configuration. If you are OK sharing your form publicly, please upload it to this topic so I can test it against the server and make adjustments.

And as an aside, 50,000 selects is a lot! I don't know if you are using Collect to render these, but there are a couple of ways to make these more performant. One is external selects and the other is search(). We are also working on an optimization that will make big lists of selects load very quickly (in our test case it's gone from 13 seconds to 1 second).
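As comes up later in this thread, the external approaches rely on a CSV file that travels with the form. Here is a minimal sketch of dumping a flattened choice list to CSV with Python's standard library. The column names (name, label, district) and the sample rows are made up for illustration; check the XLSForm documentation for the exact layout each approach expects.

```python
import csv
import io


def choices_to_csv(rows, fieldnames):
    """Serialize a list of choice dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample data: the same village label in two districts,
# kept apart by a district column.
rows = [
    {"name": "v001", "label": "Village A", "district": "d01"},
    {"name": "v002", "label": "Village A", "district": "d02"},
]
csv_text = choices_to_csv(rows, ["name", "label", "district"])
print(csv_text)
```

Keeping the parent key (here, district) as its own column is what preserves the hierarchy once the list leaves the workbook.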

I took a quick look at the server and I'm pretty sure the issue you ran into was the web server's 1 MB client_max_body_size limit.
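To make the arithmetic concrete, here is a rough sketch of a pre-upload size check in Python. The 1 MB figure is the limit mentioned above; the function names are made up, and the real request body will be slightly larger than the file itself because of the multipart encoding, so treat this as an approximation.

```python
import os

# Assumed limit: the 1 MB client_max_body_size described in this thread.
ASSUMED_LIMIT_BYTES = 1 * 1024 * 1024


def exceeds_upload_limit(size_bytes, limit_bytes=ASSUMED_LIMIT_BYTES):
    """Return True if a payload of size_bytes is over the limit."""
    return size_bytes > limit_bytes


def file_exceeds_upload_limit(path, limit_bytes=ASSUMED_LIMIT_BYTES):
    """Same check, using the on-disk size of the file at path."""
    return exceeds_upload_limit(os.path.getsize(path), limit_bytes)


# A 1.5 MB workbook, like the one in the original post, is over the limit.
print(exceeds_upload_limit(int(1.5 * 1024 * 1024)))
```

A server with this kind of limit rejects the request with a 413 before the converter ever sees the file, which fits the offline tool handling the same workbook fine.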

I've fixed it at https://github.com/opendatakit/xlsform-server/pull/15 and deployed it to the server. Rather than sharing your form, please try the server again and see if it works now.

Done! I also read your profile.... 3tis3n madamfo?

I hadn't heard of external selects or search()... I'll definitely look them up! Yes, I do want to use Collect... the wait time is tolerable... but in the book that I have set up the only annoying part is that a right swipe that loads a large list doesn't display any indication that it's loading (like a turning wheel)... I guess you're working on something like this for ODK 2?

Yessir, confirmed! It works flawlessly now. Thanks!

I had major problems with this because I made the mistake of updating my pip to version 10... and then ran into that massive problem that pip version 10 doesn't do anything... so I tried a bunch of stuff... but yeah, the one that ended up working was: pip install git+https://github.com/XLSForm/pyxform.git@master#egg=pyxform so that's the version that I have. So I do confirm that the ONLINE tool works correctly, OFFLINE works correctly, and pyxform seems to run correctly, but outputs the errors referenced above and does NOT give the warnings the other two give (which are correct).

That’s good to hear! Could you please share what device you’re on and how long it takes to load?

It’s possible that you’re not hitting a data load speed limitation (that’s what select one external and search() aim to address) but a rendering speed limitation. @Grzesiek2010 is working on dramatically speeding up how quickly choices are rendered and since you have such a large set of choices on a real form (as opposed to fake data we generate) it would be good to know how long they currently take to show up on the screen. Also, does the form itself take a while to load when launched from Fill Blank Form?

@Amit_Kohli Me ho yε!

My guess is that your Python environment is messed up. It's not unusual. Seems to happen to me every year or so :laughing:. I usually try to get back to a default Python setup by uninstalling all the packages and starting over. Both XLSForm Online and Offline run the version of pyxform that is failing on your machine, so it's probably your machine.
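Another thing worth ruling out: the traceback in the first post ends inside _java_installed, where pyxform runs java -version, and an OSError with [Errno 2] from subprocess usually means the executable could not be found. A small sketch for checking whether java is on the PATH follows. The helper name is made up, and shutil.which needs Python 3.3 or newer, so it won't run on the Python 2.7 shown in the traceback.

```python
import shutil


def java_on_path(which=shutil.which):
    """Return True if a `java` executable can be found on PATH.

    The `which` parameter exists only so the lookup can be swapped
    out in tests; by default it is shutil.which.
    """
    return which("java") is not None


if __name__ == "__main__":
    if java_on_path():
        print("java found on PATH")
    else:
        print("java not found on PATH; ODK Validate cannot run")
```

If this reports that java is missing, that would also explain why pyxform converts the form but doesn't show the validation warnings the other two tools give.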

@yanokwa Actually what Python version should this be run on? I think it's 3.6.

I'm not gonna lie to you... it seems to be doing everything correctly... so yeah, I'm gonna close my eyes and won't mess w/ it for a while!

For clarity, I was trying the online and offline tools from my Windows machine; pyxform is running on my Ubuntu 16.04 VM.

Bidaseeeeeee papapappapa!

@LN I'm on a Google Pixel. The delays are several. Downloading the form from Google Keep is pretty fast... I'm gonna guess 3 seconds. When I go to "fill blank form" there's another maybe 5 second wait for the form to show up at all... then when I click on it, there's a maybe 30 second wait for the form to be ready to gather data. Scrolling from page to page is fast, but waiting for one of the elements with many choices to look up does again take somewhere under 10 seconds to load, depending on the choice (it's hierarchical so it depends on what I picked before). Hope that helps? If I could help you guys design a better product I'm sure I can share my data with a closed group after a quick NDA... pls let me know.

Thanks for those details, @Amit_Kohli! I'll make sure to loop you into the next beta so you can see what the speed looks like with the new version. The elements should then load instantly.

The 30s delay in form load would be addressed by one of the external data strategies @yanokwa mentioned above.

Thank you! There's no ongoing work to speed up internal lists of choices but if there is another effort in that direction I will be sure that your offer is remembered.

:smile::ghana:

I definitely will look into external selects and search... just want to get my prototype full-featured for now then will speed up. Only have to port "default values"... it'll work!!! Very exciting.

Hrm... what about a turning wheel to indicate that the swipe has been registered... so I don't sit there swiping furiously :slight_smile:

Eyyyyy you speak Twi obruni? :smiley:

Once the beta is out you'll be able to try it and see if some kind of status is still needed. That load should be instant, though. Currently the whole list is being prepared for display which takes a long time. With the change, only what is needed to be displayed on the screen will be rendered.

I just like saying things like akokɔ kɔkɔɔ. :grin::chicken:

Awesome! I look forward to the beta!

Also (I guess this is also for @yanokwa): I took a look at the external lists, but I guess they require an external csv to be brought along? I'm not sure it'll work for my use case... at our NGO we have 20-30 projects with custom databases. The idea is that, for each project, I create a full offline sync for ONE of those tables, the beneficiary registry, and then give that form to our managers in the field to give out to their field staff and enumerators. That means that there could be a few hundred people downloading forms on an ongoing basis. It will ALREADY be a challenge to get them to update their forms on a regular basis to reflect new lookups etc (I have an idea on how to do that... I think I'll have a welcome page detailing the form creation date and I'll cycle the background color of that form through 5 colors... that way managers will quickly be able to see what version of the form is installed)...

Anyway if the update means they have to download a form in addition to one or more csvs, management becomes tricky... how could our field managers confirm that the enumerators are using the latest and greatest?

Perhaps I'll go with what I have for now, but it's good to know that if something simply doesn't work, I can always move to something more robust.

Thanks!

I didn't know this one!

Dear all, I also just received this "413 Request Entity Too Large" error when uploading a form to Aggregate. The form has many audio files as attachments. I just tried uploading the form with only a selection of them, and it worked. Could you advise on what I could do to solve this issue?

Many thanks!

Maria Isabel