Aggregate Server - the refresh death spiral

What is the problem? Please be detailed.
I originally had one ODK Aggregate server running, which exhibited a persistent problem that appeared to be related to load or the quantity of data. That server eventually failed, and the data had to be recovered manually.

I now have three ODK Aggregate servers set up in three different locations on Google Cloud Platform. Each is running v1.6, and each is now exhibiting the early signs that led to the problem described above.

What happens is that the data fetch on the page that shows the records takes so long that the page refreshes before the data can be displayed. I have increased the performance parameters on all of the servers, but the problem persists.

Generally, the problem can be avoided if care is taken to delete records before there are too many. But if the record count climbs, it reaches a point where the records can no longer be accessed in order to delete them. It becomes a Catch-22 type death spiral.
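To make the tipping point concrete, here is a rough sketch of how I picture it. It assumes the fetch time grows linearly with the record count, which I have not measured, and both numbers (50 ms per record, a 30 second page refresh) are made up for illustration. They are not values taken from Aggregate.

```python
# Toy model of the refresh death spiral. All names and numbers here are
# hypothetical; they are not Aggregate settings or measurements.

def fetch_ms(record_count: int, ms_per_record: int) -> int:
    """Assumed linear cost: every extra record adds a fixed fetch time."""
    return record_count * ms_per_record


def page_can_display(record_count: int, ms_per_record: int, refresh_interval_ms: int) -> bool:
    """The records only show if the fetch finishes before the next refresh."""
    return fetch_ms(record_count, ms_per_record) < refresh_interval_ms


def max_safe_records(ms_per_record: int, refresh_interval_ms: int) -> int:
    """Largest record count whose fetch still finishes before the refresh."""
    return (refresh_interval_ms - 1) // ms_per_record


if __name__ == "__main__":
    limit = max_safe_records(ms_per_record=50, refresh_interval_ms=30_000)
    print("records that can still be displayed (and deleted):", limit)
    print("one record more still works:", page_can_display(limit + 1, 50, 30_000))
```

Under these assumptions, once the record count passes the limit the page never finishes loading, so the records that need deleting can no longer be reached from that page.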

Given that this problem occurs predictably and across many different instances, it would appear to be common. I hope someone has a way to deal with this. The death spiral has started!

What has worked to alleviate the problem in the past is to physically delete all tables associated with the forms. However, this is probably not advisable and likely has some serious side effects. Any suggestions welcome.

@Mark_Schormann1 Last we talked, you said your data set was 150-300 full resolution images with 15-20 submissions per day. That's a lot for App Engine's Datastore to handle, so you might be running into concurrency issues. Deleting tables isn't recommended, but it might be your only option at this stage. Other options:
