aws-sdk-go-v2: DynamoDB returns operation error `use of closed network`

Describe the bug

The SDK often returns the error below after we updated the SDK version.

Actual error:

operation error DynamoDB: Query, $tableName: https response error StatusCode: 200, RequestID: 123456, deserialization failed, failed to decode response body, read tcp $src->$dst: use of closed network connection

Expected Behavior

No errors

Current Behavior

operation error DynamoDB: Query, $tableName: https response error StatusCode: 200, RequestID: 123456, deserialization failed, failed to decode response body, read tcp $src->$dst: use of closed network connection

Reproduction Steps

	ddb := dynamodb.NewFromConfig(defaultCfg)
	result, err := ddb.Query(
		ctx,
		&dynamodb.QueryInput{
			TableName:                 aws.String(tableName),
			ExpressionAttributeNames:  expr.Names(),
			ExpressionAttributeValues: expr.Values(),
			KeyConditionExpression:    expr.KeyCondition(),
		},
	)

Possible Solution

No response

Additional Information/Context

No response

AWS Go SDK V2 Module Versions Used

	github.com/aws/aws-sdk-go-v2 v1.16.8
	github.com/aws/aws-sdk-go-v2/config v1.15.15
	github.com/aws/aws-sdk-go-v2/feature/dynamodb/attributevalue v1.9.8
	github.com/aws/aws-sdk-go-v2/feature/dynamodb/expression v1.4.14
	github.com/aws/aws-sdk-go-v2/service/dynamodb v1.15.10
	github.com/aws/smithy-go v1.12.0

Compiler and Version used

1.19

Operating System and version

Linux

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 18 (4 by maintainers)

Most upvoted comments

We are seeing the same issue as well. Can we please resurrect this issue?

As far as the error being returned is concerned - the SDK is behaving correctly by not retrying in this case. As people have noted in this thread and others, we DO NOT retry this class of error (ECONNRESET) because it is not deterministically safe to do so.

As for the root cause, we have no hand in the networking internals of SDK requests - we defer to the configured HTTP client, which is either the standard library implementation in the default case, or one provided by the user.
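For anyone who wants to at least recognize this class of error in their own code, here is a minimal sketch using only the standard library. It assumes the SDK's error chain preserves the underlying network error through `Unwrap`, so that `errors.Is` can find `net.ErrClosed` (the error whose message is "use of closed network connection"). `isClosedConnErr` and `simulatedSDKError` are hypothetical names made up for this illustration; they are not part of the SDK.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// isClosedConnErr reports whether err wraps net.ErrClosed, the standard
// library error whose message is "use of closed network connection".
func isClosedConnErr(err error) bool {
	return errors.Is(err, net.ErrClosed)
}

// simulatedSDKError is a hypothetical stand-in for the wrapped error shown in
// this issue. It is built by hand here; a real SDK error is assumed (not
// verified) to unwrap the same way.
func simulatedSDKError() error {
	opErr := &net.OpError{Op: "read", Net: "tcp", Err: net.ErrClosed}
	return fmt.Errorf("deserialization failed, failed to decode response body, %w", opErr)
}

func main() {
	// A wrapped closed-connection error versus an unrelated error.
	fmt.Println(isClosedConnErr(simulatedSDKError()))
	fmt.Println(isClosedConnErr(errors.New("some other failure")))
}
```

Detecting the error does not make it safe to retry; it only lets the caller decide what to do with it.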

We probably have the same problem with SQS DeleteMessage. (Request ID and source IP are masked)

operation error SQS: DeleteMessage, https response error StatusCode: 200, RequestID: xxxx-xxxx-xxxx-xxxx-xxxx, deserialization failed, failed to discard response body, read tcp xxx.xxx.xxx.xxx:xxx->18.183.37.45:443: use of closed network connection

go 1.19.3, aws-sdk-go-v2 v1.17.2

Just leaving a comment here to confirm that this problem still exists with the following versions and a locally running DynamoDB Docker instance in GitLab:

	Go v1.20.3
	github.com/aws/aws-sdk-go-v2 v1.17.7
	github.com/aws/aws-sdk-go-v2/config v1.18.2
	github.com/aws/aws-sdk-go-v2/credentials v1.13.2
	github.com/aws/aws-sdk-go-v2/feature/dynamodb/attributevalue v1.10.19
	github.com/aws/aws-sdk-go-v2/service/dynamodb v1.19.2
	github.com/aws/smithy-go v1.13.5
	amazon/dynamodb-local:1.21.0

A retry strategy cannot always be used to retry this error, for the same reason you are not adding it as a retryable error at the SDK level: e.g. adding an element to a list in DynamoDB with an update expression is not idempotent and cannot be retried, since we do not know whether the request was successful or not.
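To make the idempotency point concrete, here is a rough sketch of a caller-side wrapper that retries a closed-connection error only when the caller explicitly marks the operation as idempotent. It uses only the standard library; `callWithRetry` and `flakyOp` are hypothetical helpers invented for this example, and `op` stands in for whatever SDK call you are making.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// callWithRetry runs op up to maxAttempts times. It retries only when the
// caller has declared the operation idempotent AND the failure wraps
// net.ErrClosed. It returns the number of attempts made and the last error.
func callWithRetry(idempotent bool, maxAttempts int, op func() error) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if err == nil {
			return attempt, nil
		}
		// Non-idempotent calls (e.g. a list append) are never retried,
		// because the first request may already have been applied.
		if !idempotent || !errors.Is(err, net.ErrClosed) {
			return attempt, err
		}
	}
	return maxAttempts, err
}

// flakyOp returns a hypothetical operation that fails with a wrapped
// net.ErrClosed for its first `failures` calls and then succeeds.
func flakyOp(failures int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return fmt.Errorf("failed to decode response body: %w", net.ErrClosed)
		}
		return nil
	}
}

func main() {
	// Idempotent read: the closed-connection error is retried.
	attempts, err := callWithRetry(true, 3, flakyOp(1))
	fmt.Println(attempts, err)

	// Non-idempotent write: the error is surfaced after one attempt.
	attempts, err = callWithRetry(false, 3, flakyOp(1))
	fmt.Println(attempts, err != nil)
}
```

This works for something like a Query, but it does not help the list-append case: there the caller still has to find out whether the first request was applied before deciding what to do.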

Currently our BDD test suite fails about 5% of the time in the GitLab pipeline because of this error, which is mildly annoying. When our tests fail, it is usually in about the same scenario and steps, so I assume it could be related to the lifecycle of a connection, either from the client's perspective or from the local DynamoDB's.

To me it seems that this is happening when the running system is under heavy load.

You can fork smithy-go with a small fix that comments out a three-line block to avoid closing the request body too early, just as @psmarcin did. We are now doing the same for our production services, with no problems at all.

https://github.com/psmarcin/smithy-go
https://github.com/cwd-nial/smithy-go

Maybe, as @a-h commented, the issue only happens when connecting to DynamoDB Local running in Docker?