google-api-python-client: intermittent error 400 when calling BigQuery API
I’m experiencing a cryptic error message while running a long BigQuery job:
File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 851, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://www.googleapis.com/bigquery/v2/projects/<project-name>/queries/job_<id>?pageToken=<page_token>&alt=json returned "{0}">
The point of attention is the returned "{0}" part of the message, where the error reason was supposed to appear. Looking at the job history with the bq --format=prettyjson show -j <jobid> command also doesn’t show a reason field, so that may be related to the root cause.
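For reference, the failing request corresponds to jobs.getQueryResults in the BigQuery v2 API. A minimal sketch of the kind of pagination loop where I hit the error (the project and job id below are placeholders, not the real values) looks roughly like this:

```python
# Sketch only: reproduces the request path that intermittently returns
# HTTP 400 with the "{0}" body. Project and job id are placeholders.
from googleapiclient.discovery import build

bq = build("bigquery", "v2")
project_id = "<project-name>"  # placeholder
job_id = "job_<id>"            # placeholder

page_token = None
while True:
    # Maps to GET .../projects/<project>/queries/<jobid>?pageToken=...,
    # the URL shown in the HttpError above.
    kwargs = {"projectId": project_id, "jobId": job_id}
    if page_token:
        kwargs["pageToken"] = page_token
    resp = bq.jobs().getQueryResults(**kwargs).execute()
    # ... consume resp.get("rows", []) here ...
    page_token = resp.get("pageToken")
    if not page_token:
        break
```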
Environment details
- Python version: 3.6.5
- google-api-python-client: 1.7.8
- google-cloud-bigquery: 1.11.2
- google-cloud-storage: 1.15.0
Steps to reproduce
This error occurs somewhat randomly while processing a long-running BigQuery job.
After retrying a few times the job usually succeeds. I wonder if this is related to the resourcesExceeded cause documented in https://cloud.google.com/bigquery/troubleshooting-errors; the other possible causes don’t seem to apply.
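As a workaround, since retrying usually succeeds, I wrap the page fetch in a small retry loop. This is only a sketch under my own assumptions: fetch_page stands for a hypothetical callable wrapping the getQueryResults().execute() call, and the attempt count and backoff values are arbitrary.

```python
# Hedged workaround sketch: retry the transient 400 a few times.
import time
from googleapiclient.errors import HttpError

def fetch_page_with_retry(fetch_page, max_attempts=5, base_delay=2.0):
    for attempt in range(max_attempts):
        try:
            return fetch_page()
        except HttpError as err:
            # Only retry the intermittent 400 described above; re-raise anything else.
            if err.resp.status != 400 or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # simple exponential backoff
```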
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 18 (3 by maintainers)
Checked this morning, it looks like the backend changes to address the malformed response have been deployed to tasks in all regions. Cases where requests were getting the 400 response and the {0} body were representative of a backend component where error propagation was not being handled correctly. Now such conditions should yield an appropriate retryable HTTP error, and a properly formatted response.
@danicat Thanks again for reporting this, the additional details you were able to provide about specific requests were helpful in tracking this down.
@danicat It looks like the BigQuery engineering team has identified a situation in the tabledata backend that can trigger the unusual 400 with the poorly formatted response body. No details/ETA on a resolution yet, but thanks for the report helping to identify this. I’ll keep this issue open until the internal issue is resolved.
It smells like a backend error, but absent more information we’ll have trouble tracking it down. If you can provide more details about when it last happened and values like the project id, we might be able to find something useful in the backend logs. Feel free to send the details to me privately (my github username at google.com) if you’d prefer not to disclose them in the public issue.
Yes, and in this particular execution the job managed to fetch several rows before hitting this error.
The resourcesExceeded error was documented under HTTP 400 on https://cloud.google.com/bigquery/troubleshooting-errors. The other 400 errors don’t seem to make any sense here.
CC @leahecole as FYI re: BigQuery and Airflow / Composer (and maybe motivation to switch Airflow to google-cloud-bigquery)