seaweedfs: S3 ListObjectsV2 pagination is broken
Describe the bug
When making a `ListObjectsV2` call to the SeaweedFS S3 API, `IsTruncated` is not set to `true` in the response when the bucket contains more keys than can be listed in a single response.
Most S3 clients support a maximum of 1,000 objects per response, which is the maximum allowable response size per the S3 API: https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html. SeaweedFS appears to return up to 10,000 keys per response, which appears to violate this spec. This issue is about pagination, not the `KeyCount` violation.
[max-keys](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html#API_ListObjectsV2_RequestSyntax)
Sets the maximum number of keys returned in the response. By default the action returns **up to 1,000 key** names. The response might contain fewer keys **but will never contain more**.
As a result, on clusters with more than 1,000 objects, a `ListObjectsV2` response returns only 1,000 keys, and the client has no way to continue paginating via `ContinuationToken`.
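To illustrate why a missing `IsTruncated` flag stops pagination, here is a minimal sketch of the loop a typical client runs. It is not taken from any particular SDK. The `list_page` callable, the dict-shaped page, and the `make_fake_server` helper are assumptions made up for this example, and the fake server uses a plain integer offset as its token, whereas real tokens are opaque.

```python
from typing import Callable, Dict, Iterator, Optional

# Hypothetical page shape: a dict resembling a parsed ListObjectsV2 response.
Page = Dict[str, object]


def iter_keys(list_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every key, following continuation tokens while IsTruncated is true."""
    token: Optional[str] = None
    while True:
        page = list_page(token)
        for key in page.get("Keys", []):
            yield key
        # The loop ends as soon as the server reports IsTruncated=false,
        # even if more keys actually exist in the bucket.
        if not page.get("IsTruncated"):
            return
        token = page.get("NextContinuationToken")


def make_fake_server(keys, page_size, report_truncation):
    """Build a stand-in list_page callable over an in-memory key list (illustration only)."""
    def list_page(token):
        start = int(token) if token else 0
        end = start + page_size
        more = end < len(keys)
        return {
            "Keys": keys[start:end],
            # report_truncation=False mimics the behavior described in this issue.
            "IsTruncated": more and report_truncation,
            "NextContinuationToken": str(end) if more else None,
        }
    return list_page


keys = [f"k{i:04d}" for i in range(2500)]
print(len(list(iter_keys(make_fake_server(keys, 1000, True)))))   # all pages are followed
print(len(list(iter_keys(make_fake_server(keys, 1000, False)))))  # stops after the first page
```

With the flag reported correctly the client walks all pages. With the flag stuck at `false` it silently stops after the first page, which matches the symptom described above.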
On a cluster with millions of objects, the following XML is returned. Note the value of `IsTruncated`:
<?xml version="1.0" encoding="UTF-8"?>\n<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>bucket-name</Name><Prefix></Prefix><MaxKeys>10000</MaxKeys><Delimiter>/</Delimiter><IsTruncated>false</IsTruncated><Contents><Key>
...
...
...
<KeyCount>10</KeyCount></ListBucketResult>
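For anyone who wants to check a response like the one above by hand, the following is a rough sketch that pulls the pagination-related fields out of a `ListBucketResult` document using only the Python standard library. The `summarize_listing` helper and the tiny `SAMPLE` document are made up for illustration and are not the actual response from this cluster.

```python
import xml.etree.ElementTree as ET

# Namespace used by S3 ListBucketResult documents (as seen in the response above).
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def summarize_listing(xml_text):
    """Extract the fields that matter for pagination from a ListBucketResult."""
    root = ET.fromstring(xml_text)

    def text(tag):
        node = root.find(f"s3:{tag}", S3_NS)
        return node.text if node is not None else None

    return {
        "max_keys": int(text("MaxKeys") or 0),
        "key_count": int(text("KeyCount") or 0),
        "is_truncated": text("IsTruncated") == "true",
        "contents": len(root.findall("s3:Contents", S3_NS)),
        "next_token": text("NextContinuationToken"),
    }


# Made-up, shortened document for illustration only.
SAMPLE = (
    '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    "<Name>bucket-name</Name><Prefix></Prefix><MaxKeys>2</MaxKeys>"
    "<IsTruncated>false</IsTruncated>"
    "<Contents><Key>a</Key></Contents><Contents><Key>b</Key></Contents>"
    "<KeyCount>2</KeyCount></ListBucketResult>"
)

print(summarize_listing(SAMPLE))
```

Comparing `contents`, `key_count`, and `max_keys` against `is_truncated` and `next_token` makes it easier to see whether the server stopped listing without signalling that more keys remain.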
System Setup
- SeaweedFS v3.08 on bare metal k8s, Docker
- Host OS: Ubuntu 20.04 LTS
- version 30GB 3.08 8a49240d linux amd64
NOTE: Using a different cluster, this problem does not occur using SeaweedFS 2.93.
Expected behavior
When there are more keys in the bucket than can be listed in one response, the `ListObjectsV2` response should have `IsTruncated` set to `true`, and `NextContinuationToken` should contain a token value to continue pagination.
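The expected behavior can be sketched as a small paging function over a sorted key list. This is only an illustration of the contract, not how SeaweedFS implements listing. The function name `list_objects_page` is hypothetical, and using the last returned key as the continuation token is an assumption made for simplicity, since real tokens are opaque to clients.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def list_objects_page(
    sorted_keys: List[str],
    max_keys: int,
    continuation_token: Optional[str] = None,
) -> Tuple[List[str], bool, Optional[str]]:
    """Return (keys, is_truncated, next_continuation_token) for one page.

    Assumption: the token is simply the last key of the previous page.
    """
    # Resume strictly after the key named by the token, or from the beginning.
    start = bisect_right(sorted_keys, continuation_token) if continuation_token else 0
    page = sorted_keys[start:start + max_keys]
    # Truncated means keys remain beyond the ones returned in this page.
    is_truncated = bool(page) and start + len(page) < len(sorted_keys)
    next_token = page[-1] if is_truncated else None
    return page, is_truncated, next_token


keys = sorted(f"obj/{i:05d}" for i in range(2500))
token = None
total = 0
while True:
    page, truncated, token = list_objects_page(keys, 1000, token)
    total += len(page)
    if not truncated:
        break
print(total)
```

The key property is that `is_truncated` is `true` on every page except the last one, and a token is returned whenever it is `true`.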
About this issue
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 24 (8 by maintainers)
Commits related to this issue
- fix subdir pagination https://github.com/chrislusf/seaweedfs/issues/3166 — committed to kmlebedev/seaweedfs by kmlebedev 2 years ago
- s3: use cursor to track tree traversal fix https://github.com/seaweedfs/seaweedfs/issues/3166 — committed to seaweedfs/seaweedfs by chrislusf 2 years ago
- Fix s3 pagination (#3436) * Revert previous changes * s3: use cursor to track tree traversal fix https://github.com/seaweedfs/seaweedfs/issues/3166 * special cases for empty prefix and empty direc... — committed to martyanov/seaweedfs by chrislusf 2 years ago
- Fix s3 pagination (#3436) * Revert previous changes * s3: use cursor to track tree traversal fix https://github.com/seaweedfs/seaweedfs/issues/3166 * special cases for empty prefix and empty... — committed to nickb937/seaweedfs by chrislusf 2 years ago
@chrislusf As far as I’m aware, this is still an open bug. SeaweedFS listing 10,000 keys by default (which does not match AWS) is fine. The problem here is that `IsTruncated` is not being set to `true`. Therefore, clients will not know they have to paginate for further list responses.