keda: Azure Blob Storage Scaler doesn't list blobs recursively
Proposal
I’m not sure if this is a bug in the current implementation, but given the default values, if I upload a blob to foo/bar/blob.txt the scaler will not “see” the file and so will not count it. I think (with next to zero Go knowledge) this is because https://github.com/kedacore/keda/blob/c2ad43eb9adbee0517e01afe60683faf13f8cb2a/pkg/scalers/azure/azure_blob.go#L26 calls ListBlobsHierarchySegment, whereas ListBlobsFlatSegment tells the Azure API to “flatten” the list of files on the server side before returning it.
If this is a bug then it would be great to get this fixed or docs updated to make this clear. However, if this is intended behaviour it would be great to have this as a new feature whereby a developer can pass a switch to the trigger metadata:
triggers:
  - type: azure-blob
    metadata:
      blobContainerName: mycontainer
      blobCount: "5"
      blobPrefix: ""
      blobDelimiter: "/"
      ignoreBlobHierarchy: true
Use-Case
When I upload a blob whose name includes a directory structure, it should still be included in the blobCount that triggers the Scale Target.
Given this directory tree:
.
├── baz.txt
├── bin.txt
└── foo
    ├── bar
    │   └── world.txt
    └── hello.txt
The call to GetAzureBlobListLength would currently return 2.
With the proposed feature in place, GetAzureBlobListLength would return 4.
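The difference between the two counts can be illustrated with a minimal, self-contained sketch. It does not call the Azure SDK; it only simulates, over an in-memory list of blob names, how a delimiter-based (hierarchical) listing and a flat listing would count the tree above. The function names countHierarchy and countFlat are hypothetical and are not part of KEDA.

```go
package main

import (
	"fmt"
	"strings"
)

// countHierarchy simulates a delimiter-based listing: only blobs that sit
// directly under prefix are counted. Names that contain the delimiter after
// the prefix are treated as belonging to a virtual directory and are skipped.
func countHierarchy(names []string, prefix, delimiter string) int {
	count := 0
	for _, name := range names {
		if !strings.HasPrefix(name, prefix) {
			continue
		}
		rest := strings.TrimPrefix(name, prefix)
		if delimiter != "" && strings.Contains(rest, delimiter) {
			continue // would be reported as a prefix, not as a blob
		}
		count++
	}
	return count
}

// countFlat simulates a flat listing: every blob under prefix is counted,
// regardless of how many delimiters its name contains.
func countFlat(names []string, prefix string) int {
	count := 0
	for _, name := range names {
		if strings.HasPrefix(name, prefix) {
			count++
		}
	}
	return count
}

func main() {
	// The directory tree from the use case, written as blob names.
	blobs := []string{"baz.txt", "bin.txt", "foo/bar/world.txt", "foo/hello.txt"}

	fmt.Println(countHierarchy(blobs, "", "/")) // 2
	fmt.Println(countFlat(blobs, ""))           // 4
}
```

With blobPrefix set to "foo/", the same sketch would count 1 hierarchically (foo/hello.txt) and 2 flat.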
Anything else?
No response
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 18 (9 by maintainers)
Commits related to this issue
- Support listing blobs recursively in azure-blob scaler Closes #1789 Signed-off-by: Ahmed ElSayed <ahmels@microsoft.com> — committed to kedacore/keda by ahmelsayed 3 years ago
I can take this up, @kedacore/keda-contributors.
@jasonpaige thanks, but let’s keep this open until the PR is merged 😃
We’ve agreed to make the following changes:
- globPattern: <glob> option that takes effect when specified, instead of the rest
- recursive: "true" which ignores the delimiter, when specified
I certainly think it does, but it’s different from the scenario of @jasonpaige & @joachimgoris, who want to scale based on the number of blobs and not the containers.
So since this was originally reported for blob count, I’d recommend creating a new feature request and link to this one for context so we track both.
@ahmelsayed are you up for implementing both?
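To make the globPattern idea more concrete, here is a rough sketch of counting only the blob names that match a glob. The thread does not say which glob library or syntax the scaler would use, so this example assumes the semantics of path.Match from the Go standard library, where "*" does not cross a "/" and "**" is not supported. The helper countMatching is hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// countMatching counts the blob names that match pattern.
// Assumption: path.Match semantics, in which "*" matches any run of
// characters except "/". A real implementation may use a different glob syntax.
func countMatching(names []string, pattern string) (int, error) {
	count := 0
	for _, name := range names {
		ok, err := path.Match(pattern, name)
		if err != nil {
			return 0, err // malformed pattern
		}
		if ok {
			count++
		}
	}
	return count, nil
}

func main() {
	blobs := []string{"baz.txt", "bin.txt", "foo/bar/world.txt", "foo/hello.txt"}

	for _, pattern := range []string{"*.txt", "foo/*.txt", "foo/*/*.txt"} {
		n, err := countMatching(blobs, pattern)
		if err != nil {
			fmt.Println("bad pattern:", err)
			continue
		}
		fmt.Println(pattern, n)
	}
}
```

Under these assumed semantics, "*.txt" matches the two top-level blobs, while "foo/*.txt" and "foo/*/*.txt" each match one blob.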
@ahmelsayed Instead of counting folders, I think we should support scaling based on blob count and container count, both of which go recursively through all sub-containers.
If I configure somefolder, I want to be able to scale to 4 since there are 4 text files. In container mode, that would be 2 since I have run1 & run2.
This would also be useful for us. We process batch requests and based on the amount of blobs scale our functions. We separate our blobs in folders so batches don’t get mixed. Making it possible for the blob scaler to find blobs recursively would simplify our process.
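The two modes discussed in the thread, counting blobs recursively versus counting sub-folders, could be sketched as follows. The blob names are hypothetical (the original listing is not preserved in this thread), chosen to match the description of four text files spread over run1 and run2. The helpers countBlobs and countFolders are likewise illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// countBlobs counts every blob under prefix, at any depth.
func countBlobs(names []string, prefix string) int {
	count := 0
	for _, name := range names {
		if strings.HasPrefix(name, prefix) {
			count++
		}
	}
	return count
}

// countFolders counts the distinct first-level virtual folders under prefix.
// Blobs that sit directly under prefix do not contribute a folder.
func countFolders(names []string, prefix, delimiter string) int {
	if delimiter == "" {
		return 0 // without a delimiter there is no notion of folders
	}
	seen := map[string]struct{}{}
	for _, name := range names {
		if !strings.HasPrefix(name, prefix) {
			continue
		}
		rest := strings.TrimPrefix(name, prefix)
		if i := strings.Index(rest, delimiter); i >= 0 {
			seen[rest[:i]] = struct{}{}
		}
	}
	return len(seen)
}

func main() {
	// Hypothetical layout: two runs with two text files each.
	blobs := []string{
		"somefolder/run1/a.txt",
		"somefolder/run1/b.txt",
		"somefolder/run2/c.txt",
		"somefolder/run2/d.txt",
	}

	fmt.Println(countBlobs(blobs, "somefolder/"))        // 4
	fmt.Println(countFolders(blobs, "somefolder/", "/")) // 2
}
```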