spark-bigquery-connector: Missing maven dependencies when using --packages and ClassNotFound when using --jars
Hi,
I want to experiment with the BigQuery connector (on AWS EMR 5.24.1 with Spark 2.4.2), so I ran the command pyspark --packages com.google.cloud.spark:spark-bigquery_2.11:0.9.1-beta, but the following three dependencies seem to be missing from Maven Central:
- javax.jms#jms;1.1!jms.jar
- com.sun.jdmk#jmxtools;1.2.1!jmxtools.jar
- com.sun.jmx#jmxri;1.2.1!jmxri.jar
As a workaround, I downloaded the JAR from https://console.cloud.google.com/storage/browser/spark-lib/bigquery and added it to the classpath with the command pyspark --jars spark-bigquery-latest.jar, but when I try to read a table from BigQuery, I get this error: ClassNotFoundException: Failed to find data source: com.google.cloud.spark.bigquery.
I also tried to use com.google.cloud.spark.bigquery instead of just “bigquery” in format(), without success.
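For anyone debugging the same ClassNotFoundException, it can help to check what the downloaded JAR actually contains before blaming the format() string. As far as I understand Spark's data source lookup, it first matches short names such as "bigquery" against classes listed in the JAR's META-INF/services/org.apache.spark.sql.sources.DataSourceRegister file, and otherwise tries to load the given name or that name plus .DefaultSource as a class. The sketch below only inspects a JAR for those two things. The function name inspect_jar is hypothetical, and the assumption that the connector ships a DefaultSource class under com.google.cloud.spark.bigquery is mine, not something confirmed in this thread.

```python
import zipfile

# Service file Spark reads to map short names (e.g. "bigquery") to classes.
REGISTER_FILE = "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister"


def inspect_jar(jar_path, provider="com.google.cloud.spark.bigquery"):
    """Report whether jar_path looks able to satisfy format(provider).

    Hypothetical helper: it only reads the JAR's file listing and the
    service registration file; it does not start Spark or load classes.
    """
    package_dir = provider.replace(".", "/")
    registered = []
    with zipfile.ZipFile(jar_path) as jar:
        names = set(jar.namelist())
        if REGISTER_FILE in names:
            text = jar.read(REGISTER_FILE).decode("utf-8")
            # One class name per line; '#' starts a comment.
            for line in text.splitlines():
                entry = line.split("#", 1)[0].strip()
                if entry:
                    registered.append(entry)
    return {
        # Assumed fallback class: <provider>.DefaultSource
        "has_default_source": package_dir + "/DefaultSource.class" in names,
        "registered_sources": registered,
    }
```

Calling inspect_jar("spark-bigquery-latest.jar") on the downloaded file returns a small dict. If has_default_source is False and registered_sources is empty, the JAR itself is the problem (for example a truncated download). If both look fine, the JAR is more likely not reaching the driver and executor classpaths at all.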
About this issue
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 15 (6 by maintainers)
Okay, now it works with:
And in the code just:
Thanks a lot for your support!
Created #72 to handle the --packages issue