beam: [Failing Test]: Various TPC-DS queries throw NPEs using SparkRunner

What happened?

Various TPC-DS queries started throwing NPEs with the SparkRunner a while back (see here):

java.lang.NullPointerException
        at org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:903)
        at org.apache.beam.sdk.util.WindowedValue$TimestampedWindowedValue.<init>(WindowedValue.java:312)
        at org.apache.beam.sdk.util.WindowedValue$TimestampedValueInGlobalWindow.<init>(WindowedValue.java:329)
        at org.apache.beam.sdk.util.WindowedValue.of(WindowedValue.java:95)

Without looking further into the underlying root cause, this seems to be related to #27617.
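
For readers unfamiliar with this failure mode: the top frames of the trace suggest that a null argument reached a checkNotNull precondition inside a windowed-value constructor, most likely the timestamp, though the trace alone does not confirm which argument it was. The following is a minimal sketch of that pattern only. The class TimestampedValueSketch and its methods are hypothetical stand-ins, not Beam's WindowedValue API, and java.util.Objects.requireNonNull stands in for the vendored Guava checkNotNull.

```java
import java.util.Objects;

// Hypothetical stand-in for a timestamped value holder; NOT Beam's WindowedValue.
final class TimestampedValueSketch<T> {
    private final T value;
    private final Long timestampMillis;

    private TimestampedValueSketch(T value, Long timestampMillis) {
        // The value itself is allowed to be null in this sketch.
        this.value = value;
        // Fail fast in the constructor: a null timestamp raises a
        // NullPointerException here, not later when the timestamp is read.
        this.timestampMillis = Objects.requireNonNull(timestampMillis, "timestamp");
    }

    // Factory method, loosely analogous to the of(...) frame in the trace.
    static <T> TimestampedValueSketch<T> of(T value, Long timestampMillis) {
        return new TimestampedValueSketch<>(value, timestampMillis);
    }

    T getValue() {
        return value;
    }

    long getTimestampMillis() {
        return timestampMillis;
    }

    // Returns true if constructing the holder throws a NullPointerException.
    static boolean throwsNpe(Object value, Long timestampMillis) {
        try {
            of(value, timestampMillis);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsNpe("row", 0L));   // prints "false"
        System.out.println(throwsNpe("row", null)); // prints "true"
    }
}
```

In this sketch the exception surfaces at construction time, so the interesting question is which upstream code produced the null, not the constructor that rejected it.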

Issue Failure

Failure: Test is flaky

Issue Priority

Priority: 2 (backlog / disabled test but we think the product is healthy)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner

About this issue

  • State: closed
  • Created 10 months ago
  • Comments: 27 (27 by maintainers)

Most upvoted comments

I created an issue for that: #29198.

Yes, I’ll fix this.

Ah, I see, gs://beam-tpcds/datasets/parquet/nonpartitioned.

Yes, I’d just like to walk through the code again to be sure exactly what the impact of the fix might be. Yes, it is strange that it was not caught by the VR tests. I’ll look into it.