beam: [Bug]: Fix or rewrite a broken Avro Example.
What happened?
The input for the avro bitcoin example defined here is not accessible in GCS: https://github.com/apache/beam/blob/ca0787642a6b3804a742326147281c99ae8d08d2/sdks/python/apache_beam/examples/avro_bitcoin.py#L116
Listing the files in that bucket returns a 404 (bucket not found) exception:
gsutil -m ls -lh "gs://beam-avro-test/bitcoin/txns/*"
# BucketNotFoundException: 404 gs://beam-avro-test bucket does not exist.
We should fix the example or rewrite it to a working one.
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner
About this issue
- State: closed
- Created a year ago
- Comments: 20 (10 by maintainers)
No problem, let us know if you need any help. Have fun!
For reference, here is the code I finally used to convert one Parquet file to Avro:
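The snippet attached to this comment is not preserved in this copy of the thread. As a stand-in only (not the commenter's code), here is a minimal sketch of one way such a conversion could be done, assuming `pyarrow` and `fastavro` are installed. The helper names `avro_schema_from_fields` and `parquet_to_avro`, the type mapping, and the choice to make every field nullable are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from Arrow primitive type names (as printed by
# str(field.type) in pyarrow) to Avro primitive types. Deliberately incomplete:
# timestamps, decimals, and nested types are not handled.
ARROW_TO_AVRO = {
    "bool": "boolean",
    "int32": "int",
    "int64": "long",
    "float": "float",
    "double": "double",
    "string": "string",
    "large_string": "string",
    "binary": "bytes",
}


def avro_schema_from_fields(
    record_name: str, fields: List[Tuple[str, str]]
) -> Dict[str, Any]:
    """Build an Avro record schema from (column_name, arrow_type_name) pairs.

    Every field is declared as a nullable union because Parquet columns
    may contain nulls (an assumption, not a requirement of either format).
    """
    avro_fields = []
    for column, arrow_type in fields:
        if arrow_type not in ARROW_TO_AVRO:
            raise ValueError(f"unsupported Arrow type for {column!r}: {arrow_type}")
        avro_fields.append(
            {"name": column, "type": ["null", ARROW_TO_AVRO[arrow_type]], "default": None}
        )
    return {"type": "record", "name": record_name, "fields": avro_fields}


def parquet_to_avro(parquet_path: str, avro_path: str, record_name: str = "Record") -> None:
    """Read one Parquet file fully into memory and write it as one Avro file."""
    # Imported lazily so the schema helper above works without these packages.
    import pyarrow.parquet as pq
    from fastavro import parse_schema, writer

    table = pq.read_table(parquet_path)
    fields = [(field.name, str(field.type)) for field in table.schema]
    schema = parse_schema(avro_schema_from_fields(record_name, fields))
    with open(avro_path, "wb") as out:
        # to_pylist() yields one dict per row, which is what fastavro expects.
        writer(out, schema, table.to_pylist())
```

With local files, usage would look like `parquet_to_avro("trips.parquet", "trips.avro", record_name="Trip")` (file names hypothetical). Columns with types outside the mapping, such as timestamps, raise a `ValueError` in this sketch and would need an explicit conversion first.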
Done.
Can you give it another try? Hopefully, this time the files are correct.
Let me fix this. I will update these files later. Thanks for catching this issue.
Done with uploading. gs://apache-beam-samples/nyc_trip/avro contains the converted Avro files. Please let me know if these files work for you. Thanks!
I am uploading both parquet and avro formats under gs://apache-beam-samples/nyc_trip
I changed the schema a little bit:
+1 to @liferoad’s comment.
You can pick any dataset you like. We can help you put it under gs://apache-beam-samples.
Thanks for trying out the example. It looks like it is broken and doesn’t have an integration test exercising it. I asked for details on https://github.com/apache/beam/pull/5496. In the meantime, I think we can repurpose this bug to fix this example or provide a better one.