OpenSearch: [BUG] repository-azure plugin hangs in OpenSearch >= 1.2.0
Describe the bug
Since 1.2.0, the repository-azure plugin has stopped working correctly. The PUT request that creates a new repository hangs forever, the thread pool queue fills up with 120 generic tasks, and the master node consumes all the CPU resources it has:
"CNLPL4MfQ1aOeA1io2LXKw:44940" : {
"node" : "CNLPL4MfQ1aOeA1io2LXKw",
"id" : 44940,
"type" : "transport",
"action" : "cluster:admin/snapshot/get",
"start_time_in_millis" : 1639552118629,
"running_time_in_nanos" : 205051397113,
"cancellable" : false,
"parent_task_id" : "uY6TEyVlSQCxiJxkMJq6Sg:10583",
"headers" : { }
},
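A task entry like the one above can be checked for stuck operations by converting running_time_in_nanos to seconds. The following is only a minimal sketch: the helper name long_running_tasks and the 60-second threshold are illustrative assumptions, and it expects a plain mapping of task ID to task object, as in the excerpt above, rather than the full GET _tasks response.

```python
import json

def long_running_tasks(tasks, threshold_seconds=60.0):
    """Return (task_id, action, seconds) for tasks running longer than the threshold.

    `tasks` is assumed to be a dict of task ID -> task object, shaped like the
    excerpt in this issue. The function name and threshold are hypothetical.
    """
    stuck = []
    for task_id, task in tasks.items():
        seconds = task["running_time_in_nanos"] / 1e9  # nanoseconds -> seconds
        if seconds > threshold_seconds:
            stuck.append((task_id, task["action"], round(seconds, 1)))
    # Longest-running tasks first
    return sorted(stuck, key=lambda entry: entry[2], reverse=True)

# The task entry quoted in this issue, wrapped in braces to make it valid JSON
sample = json.loads("""
{
  "CNLPL4MfQ1aOeA1io2LXKw:44940": {
    "node": "CNLPL4MfQ1aOeA1io2LXKw",
    "id": 44940,
    "type": "transport",
    "action": "cluster:admin/snapshot/get",
    "start_time_in_millis": 1639552118629,
    "running_time_in_nanos": 205051397113,
    "cancellable": false,
    "parent_task_id": "uY6TEyVlSQCxiJxkMJq6Sg:10583",
    "headers": {}
  }
}
""")

stuck = long_running_tasks(sample)
```

For the entry above, the running time works out to roughly 205 seconds, which is far longer than a repository lookup would normally take.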
Nothing is logged. Is there any way to enable debug logging on plugins?
Also, the transactions/sec metric for the Azure storage account shows thousands of transactions per second:

To Reproduce
Steps to reproduce the behavior:
1. Add the Azure Storage Account info (name and SAS token) to the keystore: azure.client.default.account and azure.client.default.sas_token
2. Create the snapshot repository.
PUT _snapshot/azure
{
"type": "azure",
"settings": {
"client": "default",
"container": "opensearch",
"base_path": "subfolder"
}
}
This should hang forever.
3. Check the thread pool or running tasks:
GET /_cat/thread_pool
GET _tasks
Expected behavior
{
"acknowledged" : true
}
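When reproducing this, it may help to send the repository request with a client-side timeout so that the hang shows up as an error instead of blocking the shell. The sketch below uses only the Python standard library; the function names, the base URL http://localhost:9200, and the 30-second timeout are assumptions for illustration, and a cluster with the security plugin enabled would additionally need HTTPS and credentials.

```python
import json
import urllib.request

def build_repository_request(base_url, repo_name, container, base_path, client="default"):
    """Build the PUT _snapshot/<repo_name> request from the reproduction steps.

    base_url is an assumption (e.g. http://localhost:9200); adjust for your cluster.
    """
    body = {
        "type": "azure",
        "settings": {
            "client": client,
            "container": container,
            "base_path": base_path,
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

def create_repository(request, timeout_seconds=30):
    """Send the request; urlopen raises if no response arrives within the timeout."""
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return json.loads(response.read())

request = build_repository_request(
    "http://localhost:9200", "azure", "opensearch", "subfolder"
)
# create_repository(request) would return {"acknowledged": True} on a healthy
# cluster; on an affected version it is expected to time out instead.
```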
Plugins
- repository-azure
Host/Environment (please complete the following information):
- opensearch 1.2.1 docker image running in Kubernetes
About this issue
- State: closed
- Created 3 years ago
- Reactions: 7
- Comments: 24 (10 by maintainers)
Commits related to this issue
- repository-azure: revert the fix for https://github.com/opensearch-project/OpenSearch/issues/1734 once upstream solution is available Signed-off-by: Andriy Redko <andriy.redko@aiven.io> — committed to reta/OpenSearch by reta 2 years ago
- repository-azure: revert the fix for https://github.com/opensearch-project/OpenSearch/issues/1734 once upstream solution is available (#2446) Signed-off-by: Andriy Redko <andriy.redko@aiven.io> — committed to opensearch-project/OpenSearch by reta 2 years ago
- repository-azure: revert the fix for https://github.com/opensearch-project/OpenSearch/issues/1734 once upstream solution is available (#2446) Signed-off-by: Andriy Redko <andriy.redko@aiven.io> (cher... — committed to opensearch-project/OpenSearch by reta 2 years ago
- repository-azure: revert the fix for https://github.com/opensearch-project/OpenSearch/issues/1734 once upstream solution is available (#2446) (#2475) Signed-off-by: Andriy Redko <andriy.redko@aiven.i... — committed to opensearch-project/OpenSearch by opensearch-trigger-bot[bot] 2 years ago
I confirm that the updated plugin is working as expected.
Confirmed that 1.2.3 produces the correct behavior for me:
Transactions / sec is back to an expected range:
Full x64 distribution build, https://ci.opensearch.org/ci/dbc/distribution-build-opensearch/1.2.3/126/linux/x64/dist/opensearch/opensearch-1.2.3-linux-x64.tar.gz
Anyone to try this on Azure?
Other manifests/components: https://ci.opensearch.org/ci/dbc/distribution-build-opensearch/1.2.3/126/linux/arm64/builds/opensearch/manifest.yml https://ci.opensearch.org/ci/dbc/distribution-build-opensearch/1.2.3/126/linux/arm64/dist/opensearch/manifest.yml https://ci.opensearch.org/ci/dbc/distribution-build-opensearch/1.2.3/126/linux/x64/builds/opensearch/manifest.yml https://ci.opensearch.org/ci/dbc/distribution-build-opensearch/1.2.3/126/linux/x64/dist/opensearch/manifest.yml
@reta @uncycler @juntezhang care to confirm that @reta’s fix works in https://ci.opensearch.org/ci/dbc/distribution-build-opensearch/1.2.2/102/linux/x64/builds/opensearch/core-plugins/repository-azure-1.2.2.zip? This build does not have a version increment; we’re going to do that and go to 1.2.3.
If you need an OpenSearch-min, https://ci.opensearch.org/ci/dbc/distribution-build-opensearch/1.2.2/102/linux/x64/builds/opensearch/dist/opensearch-min-1.2.2-linux-x64.tar.gz
I was going to suggest that. There’s no reason for these plugins to be tied to OpenSearch IMO. Appreciate if you could open an issue either way.
We’ll increment the version and make a tag like we always do. 1.2 is just the line for all the 1.2.x releases.
@PaulLesur @juntezhang so the issue is closely related to https://github.com/FasterXML/jackson-databind/issues/3322. In a nutshell, the Azure Blob APIs V12 rely heavily on empty XML elements / attributes being nullified.
However, sadly, that depends heavily on which XMLInputReader instance is picked up at runtime: Woodstox nullifies them, whereas the default one from the JDK, com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl, does not. This leads to an infinite loop within listBlobsByHierarchy or listBlobs: the page navigation only understands null as the termination condition.
Working on the fix now.
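The failure mode described above can be illustrated with a small simulation. This is not the Azure SDK code and not the actual fix; list_all_blobs, make_fake_service, and the marker values are hypothetical names made up for the sketch. It assumes a listing service whose last page carries an empty continuation marker, and a paging loop that stops only when the marker is null (None here).

```python
def list_all_blobs(fetch_page, normalize_marker=False, max_requests=1000):
    """Page through a listing until the continuation marker is None.

    max_requests is a safety guard added for this sketch so that the
    infinite loop surfaces as an exception instead of hanging.
    """
    blobs, marker, requests = [], None, 0
    while True:
        if requests >= max_requests:
            raise RuntimeError(f"no terminating marker after {requests} requests")
        page_blobs, next_marker = fetch_page(marker)
        requests += 1
        blobs.extend(page_blobs)
        if normalize_marker and next_marker == "":
            next_marker = None  # treat an empty marker like a missing one
        if next_marker is None:  # the only termination condition
            return blobs, requests
        marker = next_marker

def make_fake_service(nullify_empty):
    """Fake two-page listing; nullify_empty mimics a parser that maps empty elements to null."""
    pages = {None: (["a", "b"], "m1"), "m1": (["c"], "")}
    def fetch_page(marker):
        # An empty marker is treated like no marker, so listing restarts at page one
        page_blobs, next_marker = pages[marker or None]
        if nullify_empty and next_marker == "":
            next_marker = None
        return page_blobs, next_marker
    return fetch_page

# Parser nullifies empty elements: the loop ends after two requests
ok_blobs, ok_requests = list_all_blobs(make_fake_service(nullify_empty=True))

# Parser keeps the empty string: the loop never sees None and keeps requesting
try:
    list_all_blobs(make_fake_service(nullify_empty=False), max_requests=50)
    looped_forever = False
except RuntimeError:
    looped_forever = True
```

Under these assumptions, the second case keeps re-requesting pages until the guard trips, which would be consistent with the thousands of transactions per second seen on the storage account.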
@reta there are no exceptions logged by OpenSearch. It just hangs.
@juntezhang looking into it