OpenSearch 1.2.0: Performance regression of 10% compared to 1.1.0
We are seeing a 10%+ regression across the board compared to the OpenSearch 1.1.0 release.
| Area | Movement | Additional Reference |
|---|---|---|
| Indexing Requests Per Second | Down 10% | |
| Mean Index Latency | Up 11% | p50 9%, p90 11%, p99 37%, p100 46% |
| Mean Query Latency | Up 116% | p50 116%, p90 108%, p99 106%, p100 118% |
Thanks to @mch2 for these numbers
Performance test data is available at https://github.com/opensearch-project/opensearch-build/issues/963. Please review and create issues if follow-up is needed.
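For reference, the "Movement" percentages in the table above are simple relative deltas between the 1.1.0 baseline and the 1.2.0 candidate. A minimal sketch of that calculation in Java — the raw latency values below are hypothetical, chosen only to reproduce the reported p99 movement:

```java
// Sketch of how the "Movement" percentages are derived: a relative delta
// between the 1.1.0 baseline and the 1.2.0 candidate. Sample values are
// hypothetical, not the actual measurements behind the table.
public final class DeltaCheck {

    /** Relative change of candidate vs. baseline, in percent. */
    static double deltaPct(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double baselineP99 = 100.0;   // hypothetical 1.1.0 p99 index latency (ms)
        double candidateP99 = 137.0;  // hypothetical 1.2.0 p99 index latency (ms)
        System.out.printf("p99 latency delta: %.0f%%%n",
                deltaPct(baselineP99, candidateP99)); // prints "p99 latency delta: 37%"
    }
}
```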
Builds under test:
"min x64 url": "https://ci.opensearch.org/ci/dbc/bundle-build/1.2.0/982/linux/x64/builds/dist/opensearch-min-1.2.0-linux-x64.tar.gz",
"full distro x64 url": "https://ci.opensearch.org/ci/dbc/bundle-build/1.2.0/982/linux/x64/dist/opensearch-1.2.0-linux-x64.tar.gz",
About this issue
- State: closed
- Created 3 years ago
- Comments: 38 (32 by maintainers)
Commits related to this issue
- Delay the request size calculation until required by the indexing pressure framework. (#1560) Signed-off-by: Saurabh Singh <sisurab@amazon.com> — committed to getsaurabh02/OpenSearch by getsaurabh02 3 years ago
Excellent news @peternied!!!
So to summarize, the initially reported 20% indexing degradation was the result of a difference in benchmarking configuration between the 1.1 and 1.2 builds? The initial issue above indicates the 1.2.0 distro was built from CI. Were we comparing that distro build against a 1.1.0 built from a local branch checkout, thereby ending up with different JVM and machine configs?
Many thanks to all that put a huge effort into verifying this performance! @peternied @mch2 @getsaurabh02 @Bukhtawar Looking forward to getting ongoing benchmarking stood up to catch any potential issues early!
OpenSearch 1.2 Indexing Backpressure Removed vs Normal Distribution
Looking over these results, they are within the 5% error margin, with no 20% outliers. This aligns with the earlier findings indicating that the presence of Indexing Backpressure does not have a significant impact on overall performance. 🍾 @getsaurabh02 @nknize
[Charts: Indexing Latency delta (ms) and delta (%); source data table available on Excel Online]
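To make the "within the 5% error margin" call concrete, here is a small sketch of the classification being applied. The 5% threshold comes from the comment above; the helper and the per-percentile deltas are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the significance call made above: a percentile delta is treated
// as noise if it stays inside the benchmark's ~5% error margin. The helper
// and the sample deltas are illustrative, not part of the test harness.
public final class NoiseBand {
    static final double NOISE_MARGIN_PCT = 5.0;

    static boolean withinNoise(double deltaPct) {
        return Math.abs(deltaPct) <= NOISE_MARGIN_PCT;
    }

    public static void main(String[] args) {
        // Hypothetical per-percentile deltas (%) from a with/without-commit run.
        Map<String, Double> deltas = new LinkedHashMap<>();
        deltas.put("p50", 0.8);
        deltas.put("p90", 1.5);
        deltas.put("p99", 0.0);
        deltas.put("p100", 2.1);
        deltas.forEach((pct, d) -> System.out.printf("%s: %+.1f%% -> %s%n",
                pct, d, withinNoise(d) ? "noise" : "significant"));
    }
}
```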
We (myself and @mch2) ran the Rally tests again today to compare the 1.2 changes with and without the shard indexing pressure commit. This was done using the same branch, isolating just the one ShardIndexingPressure commit.
Since the previous tests were run against different distributions, the setup was not equivalent. Here the numbers are very close, and nowhere near the alarming 20% called out in the earlier comment. The worst-case increase is around 2% at p100; since p99 is unchanged at the same time, this mitigates any risk.
[Charts: Indexing Operation Latencies (ms) comparison and Throughput (req/s) comparison]
The server-side metrics for these tests (CPU, memory, threadpool utilization, garbage collections) are identical in trend and similar in numbers. Some of the differences between the tests can be attributed to client-side metric collection. Please let me know if there are any questions about these runs.
On another note, for one possible optimization found (as covered in comment #1589 above), I have already raised a PR. It will further reduce one CPU computation in the indexing flow. Although it doesn't look like a blocker, we can get that change in as well if needed.
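For context on what that PR changes: the commit listed above ("Delay the request size calculation until required by the indexing pressure framework", #1560) describes a lazy-evaluation pattern. A minimal illustrative sketch of that pattern — not the actual OpenSearch code:

```java
import java.util.function.Supplier;

// Illustrative sketch of the idea in #1560: instead of computing the
// (potentially expensive) request size eagerly for every bulk request,
// defer it until the indexing-pressure framework actually asks for it,
// and cache the result. Not the actual OpenSearch implementation.
final class LazyRequestSize {
    private final Supplier<Long> sizeCalculator; // e.g., sums document byte sizes
    private Long cachedSize;                     // computed at most once

    LazyRequestSize(Supplier<Long> sizeCalculator) {
        this.sizeCalculator = sizeCalculator;
    }

    /** Computes the size on first use only; later calls hit the cache. */
    synchronized long sizeInBytes() {
        if (cachedSize == null) {
            cachedSize = sizeCalculator.get();
        }
        return cachedSize;
    }
}
```

The win is that if nothing ever consults the indexing-pressure accounting, the size computation never runs at all, which matches the "reduce one CPU computation in the indexing flow" description above.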
I compared the 1.1 and 1.2 performance runs with respect to queries being slow on 1.2: the performance test created the nyc-taxis index with 5 shards on the 1.1 run and with 1 shard on the 1.2 run. This index has ~165 million docs. So on 1.1 with 5 shards (4.5 GB each), the query ran faster than on 1.2 with a single 23 GB shard. We need to rerun the 1.2 performance test with 5 shards and compare again (see the sketch below for pinning the shard count).

This issue was ultimately determined to be a configuration mismatch between the 1.1.0 and 1.2.0 performance testing setups. Closing this as we have shipped 1.2.0.
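For such a rerun, the shard count has to match across both setups. A minimal sketch of pinning it via the OpenSearch low-level REST client — the host, index name, and replica setting are illustrative assumptions, and with Rally this would typically be driven through track parameters instead:

```java
import java.io.IOException;
import org.apache.http.HttpHost;
import org.opensearch.client.Request;
import org.opensearch.client.RestClient;

// Sketch of pinning the shard count so both test runs are comparable.
// Host, index name, and settings here are illustrative, matching the
// 5-shard configuration of the 1.1 run described above.
public final class CreateFiveShardIndex {
    public static void main(String[] args) throws IOException {
        try (RestClient client = RestClient.builder(
                new HttpHost("localhost", 9200, "http")).build()) {
            Request request = new Request("PUT", "/nyc-taxis");
            request.setJsonEntity(
                "{\"settings\": {\"index.number_of_shards\": 5,"
                + " \"index.number_of_replicas\": 0}}");
            client.performRequest(request);
        }
    }
}
```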
@dblock From what I understood of the previous setup from @mch2, which reported the 20% degradation: one test was run by checking out the branch locally and building the distribution after reverting the commit, while the other distribution was downloaded directly from the archive.
So this time we used the same branch, checked out locally for both tests, with the commit reverted in one of them.
In order to have a basis of comparison for 1.1 and rule out test configuration differences, I've triggered 4 new tests against OpenSearch 1.1; they should be complete in ~8 hours.
Working on getting Rally set up to run the min-distribution comparison locally; will report back.