nessie: Nessie Http parse exception
I have been using this system (nessie & iceberg & spark) all along because it has been running smoothly, so I have never changed any dependency versions. However, suddenly, when another colleague who did not have a cache built the same dockerfile, I encountered the following error while running the program to insert data into the table:
Dependency:
- Nessie version: 0.67.0 nessie-spark-extensions-base、nessie-spark-extensions-3.3、nessie-client、nessie-model
- Spark version: 3.3.2
- Scala : 2.13 -iceberg: 1.3.1
Exception:
org.projectnessie.client.http.HttpClientException: Cannot parse response.
at org.projectnessie.client.http.HttpResponse.readEntity(HttpResponse.java:56)
at org.projectnessie.client.http.HttpResponse.readEntity(HttpResponse.java:78)
at org.projectnessie.client.http.HttpTreeClient.getReferenceByName(HttpTreeClient.java:82)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.projectnessie.client.http.NessieHttpClient$ExceptionRewriter.invoke(NessieHttpClient.java:79)
at jdk.proxy2/jdk.proxy2.$Proxy27.getReferenceByName(Unknown Source)
at org.projectnessie.client.http.v1api.HttpGetReference.get(HttpGetReference.java:35)
at org.apache.iceberg.nessie.NessieIcebergClient.loadReference(NessieIcebergClient.java:110)
at org.apache.iceberg.nessie.NessieIcebergClient.lambda$new$0(NessieIcebergClient.java:81)
at org.apache.iceberg.relocated.com.google.common.base.Suppliers$NonSerializableMemoizingSupplier.get(Suppliers.java:183)
at org.apache.iceberg.nessie.NessieIcebergClient.getRef(NessieIcebergClient.java:89)
at org.apache.iceberg.nessie.NessieIcebergClient.refresh(NessieIcebergClient.java:93)
at org.apache.iceberg.nessie.NessieTableOperations.doRefresh(NessieTableOperations.java:115)
at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:97)
at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:80)
at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:47)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2406)
at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1916)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2404)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2387)
at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
at com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
at org.apache.iceberg.CachingCatalog.loadTable(CachingCatalog.java:166)
at org.apache.iceberg.spark.SparkCatalog.load(SparkCatalog.java:642)
at org.apache.iceberg.spark.SparkCatalog.loadTable(SparkCatalog.java:160)
at org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:311)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupRelation$3(Analyzer.scala:1202)
at scala.Option.orElse(Option.scala:477)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupRelation$1(Analyzer.scala:1201)
at scala.Option.orElse(Option.scala:477)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupRelation(Analyzer.scala:1193)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$13.applyOrElse(Analyzer.scala:1049)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$13.applyOrElse(Analyzer.scala:1028)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:1028)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:987)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
at scala.collection.LinearSeqOps.foldLeft(LinearSeq.scala:183)
at scala.collection.LinearSeqOps.foldLeft$(LinearSeq.scala:179)
at scala.collection.immutable.List.foldLeft(List.scala:79)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
at scala.collection.immutable.List.foreach(List.scala:333)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:231)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:227)
at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:173)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:227)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:188)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:212)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:211)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:116)
at org.apache.spark.sql.DataFrameWriterV2.runCommand(DataFrameWriterV2.scala:195)
at org.apache.spark.sql.DataFrameWriterV2.append(DataFrameWriterV2.scala:149)
at ai.weride.datalake_service.write_center.Insert$.$anonfun$writeToIceberg$3(Insert.scala:49)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
at zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
at zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
at zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
at zio.internal.ZScheduler$$anon$4.run(ZScheduler.scala:476)
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Cannot construct instance of `org.projectnessie.model.Reference` (no Creators, like default constructor, exist): abstract types either need to be mapped to concrete types, have custom deserializer, or contain additional type information
at [Source: (jdk.internal.net.http.ResponseSubscribers$HttpResponseInputStream); line: 1, column: 1]
at com.fasterxml.jackson.databind.exc.InvalidDefinitionException.from(InvalidDefinitionException.java:67)
at com.fasterxml.jackson.databind.DeserializationContext.reportBadDefinition(DeserializationContext.java:1904)
at com.fasterxml.jackson.databind.DatabindContext.reportBadDefinition(DatabindContext.java:400)
at com.fasterxml.jackson.databind.DeserializationContext.handleMissingInstantiator(DeserializationContext.java:1349)
at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserialize(AbstractDeserializer.java:274)
at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)
at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:2051)
at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1427)
at org.projectnessie.client.http.HttpResponse.readEntity(HttpResponse.java:53)
... 86 more
BTW: I had no issues running it locally (IDE, sbt run). However, when I used sbt pack (which is also my dockerfile’s execution plan), I was able to reproduce this issue consistently.
About this issue
- Original URL
- State: closed
- Created 7 months ago
- Comments: 15 (8 by maintainers)
Yes, see https://projectnessie.org/tools/iceberg/spark/
I’m closing this issue, because it’s not a bug. Feel free to discuss further on https://project-nessie.zulipchat.com/
@dimas-b is correct. Having iceberg-nessie AND nessie-model/client on the classpath is not supported.
Mixing
iceberg-spark-runtime-3.3_2.13-1.3.1.jar
andnessie-model-0.67.0.jar
is not supported.Iceberg shades Jackson (which includes serialization annotations), but
nessie-model-0.67.0.jar
references Jackson classes directly. This creates a conflict-prone environment where class resolution is not deterministic… This is probably why behaviour changes on restart.