flyte: [BUG] Azure Workload Identity not working with `fsspec` in `flytekit`

Describe the bug

For workload identity to work within AzureBlobFileSystem, the filesystem needs to be instantiated with an account name and anon set to False. Currently, due to how get_filesystem() is set up, we cannot pass the right kwargs for Workload Identity to work.

The workaround is to set AZURE_STORAGE_ACCOUNT_NAME and AZURE_STORAGE_ACCOUNT_KEY, which will allow flytekit to talk to blob storage, but it would be nice to support Workload Identity directly.

This relates to the discussion in https://github.com/flyteorg/flyte/issues/3945

Expected behavior

Enabling Azure Workload Identity should work out of the box when flytekit tries to read/write to blob storage.

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn’t been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Reactions: 1
  • Comments: 24 (12 by maintainers)

Most upvoted comments

I created a small PR to extend the flyte-binary helm chart for azure storage support. I tested 1813 and 1842 to confirm the behavior.

The Use case would be to have one storage account for flyte metadata and multiple storage accounts for different flyte projects and domains where user data can be uploaded/downloaded via Flytefiles or Flytedirectories. But like you said, i think it is not possible at the moment because of the relative output format.

If you want your python flytekit code to write to the same storage account where stow is writing metadata, why dont just set the env variable AZURE_STORAGE_ACCOUNT_NAME which is being read by adlfs already? I am using workload identities for azure and with setting this env variable, it works like expected. This way we dont need the special env variables?

@wild-endeavor I can make a PR to add that to get_filesystem() if you’d like. I know you’re planning to refactor, but this is a 2 lines change that could help inform your refactor as well