I wanted to share with y'all some ideas that we've been discussing for some time related to adding automated benchmarks to JavaRosa.
The motivation for investing collective effort in this is to learn where the most pressing performance pain points in JavaRosa's implementation currently are, and to have a tool for assessing any proposed technical improvement that addresses them.
We're not talking about asserting JavaRosa's behavior (which is important and something we should do as well), but about measuring the performance of its current behavior. This means we will sometimes want to focus on low-level features, like a specific XForms function, and sometimes on high-level performance, like parsing full-fledged forms.
From a technical point of view, our best option (after doing some research) would be to use the JMH Java micro-benchmarking framework, although there are some things we need to talk about:
- Benchmarks could be wrapped in JUnit 4 tests so that they become a new step of the build workflow, which would let us fail the build when a benchmark regresses or stops meeting some threshold after a change to the implementation (see the sketch after this list).
- We could create different groups of benchmarks depending on their abstraction level. A low-level benchmark should live close to the codebase so it can target specific methods, while high-level benchmarks could even live outside the codebase and approach JR with an outside-in strategy (black-box testing).
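To make the JUnit idea a bit more concrete, here's a minimal sketch of what wrapping a JMH run in a JUnit 4 test could look like. Everything in it is hypothetical (the toy benchmark, the class names, the threshold value), and it assumes the JMH annotation processor runs as part of the build so the generated benchmark classes end up on the test classpath. The point is just that JMH's Runner can be invoked programmatically and its scores asserted on:

```java
import java.util.Collection;
import java.util.concurrent.TimeUnit;

import org.junit.Test;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.results.RunResult;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import static org.junit.Assert.assertTrue;

public class ExampleBenchmarkTest {

    // Toy benchmark standing in for real JavaRosa code. Returning the result
    // keeps the JIT from dead-code-eliminating the work being measured.
    public static class StringConcatBenchmark {
        @Benchmark
        @BenchmarkMode(Mode.AverageTime)
        @OutputTimeUnit(TimeUnit.MICROSECONDS)
        public String concat() {
            String s = "";
            for (int i = 0; i < 100; i++) {
                s += i;
            }
            return s;
        }
    }

    // Hypothetical threshold: average time per operation, in microseconds.
    private static final double MAX_AVG_MICROS = 50.0;

    @Test
    public void benchmarkStaysUnderThreshold() throws Exception {
        // Run JMH programmatically with a short warmup/measurement cycle.
        Collection<RunResult> results = new Runner(new OptionsBuilder()
                .include(StringConcatBenchmark.class.getSimpleName())
                .warmupIterations(2)
                .measurementIterations(3)
                .forks(1)
                .build()).run();

        // Fail the build if any benchmark's primary score misses the threshold.
        for (RunResult result : results) {
            double score = result.getPrimaryResult().getScore();
            assertTrue("Benchmark regressed: " + score + " µs/op", score <= MAX_AVG_MICROS);
        }
    }
}
```

Picking absolute thresholds that stay stable across CI machines is probably the hardest part of this approach, so we may end up preferring comparisons against a stored baseline rather than fixed numbers.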
From a logistics point of view, we need to identify low-level and high-level targets for our benchmarks. We've already collected a couple of particularly big forms that show performance issues in Collect, which we could use for our first high-level benchmarks. It's not clear yet which low-level targets we should focus on first, although generic XML parsing and secondary external instance support are good first candidates.
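As an illustration, a first high-level benchmark over one of those big forms could look something like the sketch below. The form paths are placeholders for whatever fixtures we end up committing, and I'm assuming XFormParser's Reader-based constructor is the right outside-in entry point (if another public entry point like XFormUtils turns out to be a better fit, the shape stays the same):

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;

import org.javarosa.core.model.FormDef;
import org.javarosa.xform.parse.XFormParser;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class FormParseBenchmark {

    // Placeholder paths to the big problem forms we've collected.
    @Param({"benchmarks/forms/huge-form-1.xml", "benchmarks/forms/huge-form-2.xml"})
    public String formPath;

    private String formXml;

    @Setup
    public void loadForm() throws Exception {
        // Read the form once up front so we measure parsing, not disk I/O.
        formXml = new String(Files.readAllBytes(Paths.get(formPath)), "UTF-8");
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public FormDef parseForm() throws Exception {
        // Outside-in: only the public parsing entry point is exercised.
        return new XFormParser(new StringReader(formXml)).parse();
    }
}
```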
From a strategic point of view, here are my thoughts:
I expect we will have to go through a first phase where we try stuff out, keep what works, and throw away what doesn't. During this exploration phase, I wouldn't create much structure for this work. Maybe having a branch that the people involved can share, plus some focused conversations here and over at Slack, would be enough.
Once we've made all the technical decisions and can provide examples for every benchmarking scenario, we could create specific issues targeting our benchmarking needs so that anyone can contribute.