OpenSearch: OpenSearch 1.2.0 Performance regression of 10% compared to 1.1.0

We are seeing a 10%+ regression across the board compared to the OpenSearch 1.1.0 release.

| Area | Movement | Additional Reference |
| --- | --- | --- |
| Indexing Requests Per Second | Down 10% | |
| Mean Index Latency | Up 11% | p50 9%, p90 11%, p99 37%, p100 46% |
| Mean Query Latency | Up 116% | p50 116%, p90 108%, p99 106%, p100 118% |

Thanks to @mch2 for these numbers

Performance test data is available at https://github.com/opensearch-project/opensearch-build/issues/963; please review and create issues if follow-up is needed.

Builds under test:

"min x64 url": "https://ci.opensearch.org/ci/dbc/bundle-build/1.2.0/982/linux/x64/builds/dist/opensearch-min-1.2.0-linux-x64.tar.gz",
"full distro x64 url": "https://ci.opensearch.org/ci/dbc/bundle-build/1.2.0/982/linux/x64/dist/opensearch-1.2.0-linux-x64.tar.gz",

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 38 (32 by maintainers)

Most upvoted comments

Excellent news @peternied!!!

So to summarize, the initially reported 20% indexing degradation was the result of a difference in the benchmarking configuration between the 1.1 and 1.2 builds? The initial issue above indicates the 1.2.0 distro was built from CI. Were we comparing that distro build against a 1.1.0 built from a local branch checkout, thereby leading to different JVM and machine configs?

Many thanks to all that put a huge effort into verifying this performance! @peternied @mch2 @getsaurabh02 @Bukhtawar Looking forward to getting ongoing benchmarking stood up to catch any potential issues early!

OpenSearch 1.2 Indexing Backpressure Removed vs Normal Distribution

Looking over these results, they are within the 5% error margin, with no 20% outliers. This aligns with the earlier findings indicating that the presence of Indexing Backpressure does not have a significant impact on overall performance. 🍾 @getsaurabh02 @nknize

Indexing Latency delta (milliseconds)

| | P50 | P90 | P99 |
| --- | --- | --- | --- |
| x64-disable | 8.5 | 14.3 | -5.1 |
| x64-enabled | 4.5 | 23.1 | 17.7 |
| arm64-enable | 20.7 | 29.1 | 45 |
| arm64-disable | -3.3 | -13 | 72.8 |

Indexing Latency delta (%)

| | P50 | P90 | P99 |
| --- | --- | --- | --- |
| x64-disable | 2% | 2% | 0% |
| x64-enabled | 1% | 3% | 1% |
| arm64-enable | 4% | 4% | 3% |
| arm64-disable | -1% | -2% | 6% |

Source Data table (Excel Online)

We (@mch2 and I) reran the Rally tests today to compare the 1.2 changes with and without the shard indexing pressure commit. This was done using the same branch, isolating just the one ShardIndexingPressure commit.

Since the previous tests were run against different distributions, the setups were not equivalent. Here the numbers are very similar, and nowhere near the alarming 20% called out in the earlier comment. The worst-case increase is around 2% for p100; since p99 is essentially unchanged, this mitigates any risk.

Indexing Operation Latencies (ms) Comparison

| Metric | Full 1.2 | With Revert Commit in 1.2 | Difference (%) |
| --- | --- | --- | --- |
| P50 | 2,587.4 | 2,569.2 | -0.70839 |
| P90 | 3,773.4 | 3,686 | -2.37113 |
| P99 | 5,083.5 | 5,077 | -0.12803 |
| P100 | 8,736.7 | 8,564 | -2.01658 |

Throughput (req/s) Comparison

| Metric | Full 1.2 | With Revert Commit in 1.2 | Difference (%) |
| --- | --- | --- | --- |
| P0 | 14,516.1 | 14,644.4 | 0.8761 |
| P50 | 15,647.6 | 15,893.2 | 1.54531 |
| P100 | 18,931.2 | 19,152.6 | 1.15598 |
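For reference, the Difference (%) columns above appear to be computed relative to the revert-commit build, i.e. (revert − full) / revert × 100; the reported values match that formula up to rounding. A minimal sketch (not from the original runs; the choice of denominator is an inference from the numbers):

```python
# Sketch: reproduce the "Difference (%)" columns above.
# Assumption: difference = (with_revert - full_1_2) / with_revert * 100,
# which matches the reported values up to rounding.

def diff_pct(full_1_2: float, with_revert: float) -> float:
    return (with_revert - full_1_2) / with_revert * 100

# Indexing operation latencies (ms): (full 1.2, with revert commit)
latency = {"P50": (2587.4, 2569.2), "P90": (3773.4, 3686.0),
           "P99": (5083.5, 5077.0), "P100": (8736.7, 8564.0)}

# Throughput (req/s): (full 1.2, with revert commit)
throughput = {"P0": (14516.1, 14644.4), "P50": (15647.6, 15893.2),
              "P100": (18931.2, 19152.6)}

for name, table in (("latency", latency), ("throughput", throughput)):
    for metric, (full, revert) in table.items():
        print(f"{name} {metric}: {diff_pct(full, revert):+.5f}%")
# e.g. latency P50 -> -0.70839%, throughput P0 -> +0.87610%
```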

The server-side metrics for these tests, such as CPU, memory, thread pool utilization, and garbage collections, are identical in trend and similar in magnitude. Some of the differences can be attributed to client-side metric collection. Please let me know if there are any questions about these runs.

Separately, I have already raised a PR for one possible optimization that was found (covered in comment #1589 above); it removes one more CPU computation from the indexing flow. It does not look like a blocker, but we can get that change in as well if needed.

I compared the 1.1 and 1.2 performance runs with respect to queries being slower on 1.2: the performance test created the nyc-taxis index with 5 shards on the 1.1 run but with 1 shard on the 1.2 run. This index has ~165 million docs, so on 1.1 the queries ran against 5 shards (~4.5 GB each) and were faster than on 1.2 with a single ~23 GB shard. We need to rerun the 1.2 performance test with 5 shards and compare again.
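One quick way to confirm the shard layout on each cluster before rerunning is the standard _cat/shards API. A minimal sketch, assuming an unauthenticated local cluster on localhost:9200 (the host and timeout are illustrative, not from the original test setup):

```python
# Sketch: confirm the nyc-taxis primary shard count/size before comparing clusters.
# Assumes an unauthenticated cluster reachable at http://localhost:9200.
import requests

resp = requests.get(
    "http://localhost:9200/_cat/shards/nyc-taxis",
    params={"format": "json", "bytes": "gb", "h": "index,shard,prirep,store"},
    timeout=10,
)
resp.raise_for_status()

primaries = [s for s in resp.json() if s["prirep"] == "p"]
print(f"primary shards: {len(primaries)}")
for shard in primaries:
    print(f"  shard {shard['shard']}: {shard['store']} gb")
# Both the 1.1 and 1.2 clusters should report the same primary shard count
# (e.g. 5) before query latencies are compared.
```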

This issue was ultimately determined to be a configuration mismatch between the 1.1.0 and 1.2.0 performance testing setups. Closing this as we have shipped 1.2.0.

@dblock My understanding of the previous setup from @mch2, which reported the 20% degradation, is that one test was run by checking out the branch locally and building the distribution after reverting the commit, while the other used a distribution downloaded directly from the archive.

So this time we used the same branch, checked out locally for both tests, with the commit reverted in only one of them.

To have a basis of comparison against 1.1 and rule out test configuration differences, I've triggered 4 new tests against OpenSearch 1.1; they should be complete in ~8 hours:

  • efe7ef60-4a6c-498e-a50e-78da62adaf0a OpenSearch-1-1-0–RC-x64-disable
  • 50243fa9-f4e6-4886-9cd8-5201e8b3852d OpenSearch-1-1-0–RC-x64-enable
  • d6bc9610-2c1e-474e-80e4-44f3820a461b OpenSearch-1-1-0–RC-arm64-disable
  • 63f7f5e8-3e12-45cf-86d5-09100dbcf6ef OpenSearch-1-1-0–RC-arm64-enable

Working on getting Rally set up to run the min-distribution comparison locally; will report back.
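For anyone reproducing this locally, one option is to drive Rally in benchmark-only mode against an already-running min distribution. A rough sketch, assuming esrally is installed and the cluster is up on localhost:9200; the track parameter and report path are illustrative, not the exact commands used for the runs above:

```python
# Sketch: kick off a local esrally run against a running OpenSearch min distribution.
# The flags below are standard esrally CLI options; the number_of_shards track
# parameter mirrors the 5-shard setup discussed above, and the report path is
# a hypothetical choice.
import subprocess

cmd = [
    "esrally",
    "--track=nyc_taxis",
    "--pipeline=benchmark-only",           # use the already-running cluster, don't provision one
    "--target-hosts=localhost:9200",
    "--track-params=number_of_shards:5",   # keep the shard layout identical across 1.1 and 1.2 runs
    "--report-file=/tmp/rally-report.md",  # hypothetical output location
]
subprocess.run(cmd, check=True)
```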