hudi: [SUPPORT] Should we shade all aws dependencies to avoid class conflicts?
As we introduce support for a DynamoDB-based lock in HUDI-2314, can we shade all AWS dependencies in all of our bundled jars (Spark, Flink)? Many users import their own AWS packages while also using Hudi, which can cause class conflicts such as the following error:
java.lang.NoSuchMethodError: com.amazonaws.http.HttpResponse.getHttpRequest()Lcom/amazonaws/thirdparty/apache/http/client/methods/HttpRequestBase;
at com.amazonaws.services.s3.internal.S3ObjectResponseHandler.abortableIs(S3ObjectResponseHandler.java:64)
at com.amazonaws.services.s3.internal.S3ObjectResponseHandler.handle(S3ObjectResponseHandler.java:57)
I’m not sure whether shading these jars would introduce other issues. @zhedoubushishi can you take a look at this issue?
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 16 (12 by maintainers)
After some discussion, we think we should keep cloud providers’ jars out of the open source bundle jars. Any cloud provider can create its own Hudi module and bundle jars (like hudi-aws and hudi-spark-aws-bundle, for example), but the open source bundle jars should stay neutral. cc @danny0405 @nsivabalan @codope @vinothchandar @zhedoubushishi @umehrot2
I’ve pivoted this ticket to removing the bundled AWS dependencies, to align with the Flink bundle changes: https://issues.apache.org/jira/browse/HUDI-3157
@zhedoubushishi To make the bundle a bit easier for users, please consider adding an AWS-specific Hudi bundle to resolve the dependency problem, as I suggested above. Hope this aligns with your thoughts too.
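For illustration, a user who needs the DynamoDB lock could then depend on such an AWS-specific bundle while the neutral bundle stays AWS-free. The coordinates below are hypothetical, based on the module names suggested above, not released artifacts:

```xml
<!-- Hypothetical coordinates following the hudi-spark-aws-bundle naming suggested above;
     such an artifact would carry the AWS dependencies so the neutral bundle does not have to. -->
<dependency>
  <groupId>org.apache.hudi</groupId>
  <artifactId>hudi-spark-aws-bundle</artifactId>
  <version>${hudi.version}</version>
</dependency>
```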
For our internal Hudi version, we shade the AWS dependencies; you can add a new relocation and build a new bundle package yourself.
For example, to shade the AWS dependencies in the Spark bundle, add a relocation to packaging/hudi-spark-bundle/pom.xml, as sketched below.
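This is a minimal sketch only: it assumes the relocation is placed inside the existing maven-shade-plugin <relocations> section of that pom, and the shaded prefix org.apache.hudi.com.amazonaws. is an illustrative choice rather than a project convention.

```xml
<!-- Sketch: relocate the AWS SDK v1 classes so they cannot clash with a user-supplied SDK.
     Goes inside the maven-shade-plugin <relocations> of packaging/hudi-spark-bundle/pom.xml. -->
<relocation>
  <pattern>com.amazonaws.</pattern>
  <shadedPattern>org.apache.hudi.com.amazonaws.</shadedPattern>
</relocation>
```

After rebuilding the bundle, the AWS classes inside the Hudi jar resolve under the shaded prefix and no longer collide with whatever AWS SDK version the user brings in.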