airflow: Redshift connection breaking change with IAM authentication
Apache Airflow version
2.6.1
What happened
This PR introduced the get_iam_token method in redshift_sql.py
. This is the breaking change as introduces the check for iam
in extras, and it’s set to False by default.
Error log:
self = <airflow.providers.amazon.aws.hooks.redshift_sql.RedshiftSQLHook object at 0x7f29f7c208e0>
conn = redshift_default
def get_iam_token(self, conn: Connection) -> tuple[str, str, int]:
"""
Uses AWSHook to retrieve a temporary ***word to connect to Redshift.
Port is required. If none is provided, default is used for each service
"""
port = conn.port or 5439
# Pull the custer-identifier from the beginning of the Redshift URL
# ex. my-cluster.ccdre4hpd39h.us-east-1.redshift.amazonaws.com returns my-cluster
> cluster_identifier = conn.extra_dejson.get("cluster_identifier", conn.host.split(".")[0])
E AttributeError: 'NoneType' object has no attribute 'split'
.nox/test-3-8-airflow-2-6-0/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py:107: AttributeError
What you think should happen instead
It should have backward compatibility
How to reproduce
Run an example DAG for redshift with the AWS IAM profile given at hook initialization to retrieve a temporary password to connect to Amazon Redshift.
Operating System
mac-os
Versions of Apache Airflow Providers
No response
Deployment
Astronomer
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 17 (14 by maintainers)
Indeed,
conn.host.split(".")[0]
is compiled regardless the extra values. I will fix it.@dstandish @hussein-awala @phanikumv This is a bug in the current implementation as per the PR.
When user passes the following as part of redshift connection, it would break:
The following image shows that the
host
is not set in the connection butcluster_identifier
is set.Even though the
cluster_identifier
is set here this would break with the following error on the line as host is not set becauseconn.host
is evaluated first and when we split it toNoneType
object it would throw attribute error.Following is the test we did.
Thanks @pankajkoti for your support with the connection details we use in astro-sdk and debugging.
Conclusion:
This is a bug in the current implementation for all users when
host
is not set in the connection (not only for default connections). This would require an null check forconn.host
.that’s interesting, and, maybe not the best choice (particularly given that it looks like it would blow up, since host is not populated), but it’s just an example connection. and, looks like it was probably released at same time as this [new] feature.
don’t get me wrong, i think we should make the code better, but just doesn’t seem, strictly speaking, to be a breaking change – whatever that’s worth.