Discussion
Apache Otava
cheema33: I think I could use a slightly more detailed explanation of what it is and how it works at a high level. The website doesn't fully explain it. E.g., the about page, as of this writing, does not say anything about the project. https://otava.apache.org/docs/overview
jacques_chester: I would be interested in a layperson summary of how this code deals with wacky distributions produced by computer systems. I am too stupid to understand anything more complex than introductory SPC, which struggles with non-normal distributions.
adammarples: It's hilarious that the about page doesn't tell you anything about the project
homarp: Change Detection for Continuous Performance Engineering: Otava performs statistical analysis of performance test results stored in CSV files, PostgreSQL, BigQuery, or Graphite database. It finds change-points and notifies about possible performance regressions. You can also read "8 Years of Optimizing Apache Otava: How disconnected open source developers took an algorithm from n^3 to constant time" - https://arxiv.org/abs/2505.06758v1 and "The Use of Change Point Detection to Identify Software Performance Regressions in a Continuous Integration System" - https://arxiv.org/abs/2003.00584 (and I guess this blog post https://www.mongodb.com/company/blog/engineering/using-chang... ) (https://en.wikipedia.org/wiki/Change_detection explains what change detection is)
kanwisher: that really doesn't explain it very well
mohsen1: it is basically a “performance regression detector”. It looks at a time series of benchmark results (e.g. test runtime, latency, memory usage across commits) and tries to answer one question: did something actually change, or is this just noise? In a performance-critical project I am working on, I had a pre-commit hook that ran microbenchmarks to avoid perf regressions, but noise was a real issue. I'd have to try it to be sure it can help, but it seems like a solution to a problem I had.
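To make the "did something change, or is this just noise?" question concrete, here is a minimal sketch of single change-point detection with a permutation test. It is not Otava's actual algorithm; the function names, the example timings, and the 0.05 threshold are all assumptions chosen for illustration.

```python
import random
import statistics


def best_split(series):
    """Return (index, gap) for the split with the largest gap between segment means."""
    best_idx, best_gap = None, 0.0
    # Keep at least two points on each side of the split.
    for i in range(2, len(series) - 1):
        gap = abs(statistics.fmean(series[i:]) - statistics.fmean(series[:i]))
        if gap > best_gap:
            best_idx, best_gap = i, gap
    return best_idx, best_gap


def detect_change(series, n_permutations=1000, alpha=0.05, seed=0):
    """Return (change index or None, p-value) using a permutation test.

    If the ordering of the measurements carries no signal, shuffling them
    should produce a gap as large as the observed one fairly often.
    """
    rng = random.Random(seed)
    idx, gap = best_split(series)
    shuffled = list(series)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if best_split(shuffled)[1] >= gap:
            exceed += 1
    p_value = (exceed + 1) / (n_permutations + 1)
    return (idx if p_value < alpha else None), p_value


# Hypothetical benchmark timings (ms), one per commit; the step up starts at index 7.
timings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 12.0, 12.2, 11.9, 12.1, 12.0]
change_index, p_value = detect_change(timings)
```

On this made-up series the largest gap is at index 7, the first slow run, and very few shuffles reproduce a gap that large, so the change is reported rather than dismissed as noise. Real tools also handle multiple change points and much noisier data.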
bjackman: If your benchmarks are fast enough to run in pre-commit you might not need a time series analysis. Maybe you can just run an intensive A/B test between HEAD and HEAD^. You can't just set a threshold coz your environment will drift, but if you figure out the number of iterations needed to achieve statistical significance for the magnitude of changes you're trying to catch, then you might be able to just run a before/after, then do a bootstrap [0] comparison to evaluate the probability of a change. [0] https://en.wikipedia.org/wiki/Bootstrapping_(statistics)
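The before/after bootstrap comparison described above could look roughly like this. It is a sketch under stated assumptions: the function name, the resample count, and the timings are hypothetical, and the mean is just one possible statistic (a median or a percentile may suit skewed benchmark data better).

```python
import random
import statistics


def bootstrap_regression_probability(before, after, n_resamples=5000, seed=0):
    """Share of bootstrap resamples in which the 'after' mean exceeds the 'before' mean."""
    rng = random.Random(seed)
    slower = 0
    for _ in range(n_resamples):
        # Resample each run with replacement, keeping the original sample sizes.
        b = rng.choices(before, k=len(before))
        a = rng.choices(after, k=len(after))
        if statistics.fmean(a) > statistics.fmean(b):
            slower += 1
    return slower / n_resamples


# Hypothetical timings in milliseconds for HEAD^ (before) and HEAD (after).
head_parent = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7, 10.0, 10.4]
head = [10.9, 11.2, 10.8, 11.0, 11.3, 10.7, 11.1, 10.9, 11.0, 11.2]

probability_slower = bootstrap_regression_probability(head_parent, head)
```

Here every HEAD timing is slower than every HEAD^ timing, so the estimate is 1.0. Comparing a sample against itself should land near 0.5, which makes a handy sanity check. A pre-commit hook could fail when the value crosses a threshold of your choosing, such as 0.95.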