google-cloud-python: BigQuery: insert_rows does not seem to work
Hello, I have this code snippet:
from google.cloud import bigquery

client = bigquery.Client(...)
table = client.get_table(
    client.dataset("Integration_tests").table("test")
)
print(table.schema)
rows = [
    {"doi": "test-{}".format(i), "subjects": ["something"]}
    for i in range(1000)
]
client.insert_rows(table, rows)
This produces the following output:
DEBUG:urllib3.util.retry:Converted retries value: 3 -> Retry(total=3, connect=None, read=None, redirect=None, status=None)
DEBUG:google.auth.transport.requests:Making request: POST https://accounts.google.com/o/oauth2/token
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): accounts.google.com:443
DEBUG:urllib3.connectionpool:https://accounts.google.com:443 "POST /o/oauth2/token HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.googleapis.com:443
DEBUG:urllib3.connectionpool:https://www.googleapis.com:443 "GET /bigquery/v2/projects/{projectname}/datasets/Integration_tests/tables/test HTTP/1.1" 200 None
[SchemaField('doi', 'STRING', 'REQUIRED', None, ()), SchemaField('subjects', 'STRING', 'REPEATED', None, ())]
DEBUG:urllib3.connectionpool:https://www.googleapis.com:443 "POST /bigquery/v2/projects/{projectname}/datasets/Integration_tests/tables/test/insertAll HTTP/1.1" 200 None
It seems like it worked, but when I go to my table it’s empty. Any idea?
Python version: 3.6.0
Library versions: google-cloud-bigquery==1.1.0, google-cloud-core==0.28.1
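As far as I understand, insert_rows returns a list of per-row insert errors (empty when the API accepted every row), so capturing its return value should show whether any row was rejected:

errors = client.insert_rows(table, rows)
if errors:
    print("insert_rows reported per-row errors:", errors)
else:
    print("insert_rows reported no errors")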
Okay, I think I might have found a solution.
In the “Streaming into ingestion-time partitioned tables” section on this page there is the suggestion that the partition can be specified explicitly with the syntax mydataset.table$20170301. If I do this (so replace table_ref = dataset_ref.table('payload_logs') with dataset_ref.table('payload_logs$20190913') in the code above), then it works, and the rows are immediately returned by queries.
This is a bit surprising to me, because if I don’t specify the partition time explicitly, I’d expect BigQuery to simply take the current UTC date, which seems to be exactly what I’m doing when I specify it in code.
Anyhow, this seems to solve the issue.
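For reference, a minimal sketch of the change (the dataset name and rows here are just placeholders; the rest is the same flow as above):

from google.cloud import bigquery

client = bigquery.Client()
dataset_ref = client.dataset("my_dataset")  # placeholder dataset name

# Address today's UTC partition explicitly with the $YYYYMMDD decorator
# instead of letting BigQuery resolve the partition on its own.
table_ref = dataset_ref.table("payload_logs$20190913")
table = client.get_table(table_ref)

rows = [{"payload": "example"}]  # placeholder rows matching the table schema
errors = client.insert_rows(table, rows)
print(errors)  # an empty list means every row was accepted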
I had the same problem. I got around it by using load jobs to push the data instead of client.insert_rows. Like this:
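Roughly, assuming the same Integration_tests.test table and fields as in the original snippet (adjust to your own schema):

import io
import json

from google.cloud import bigquery

client = bigquery.Client()
table_ref = client.dataset("Integration_tests").table("test")

rows = [
    {"doi": "test-{}".format(i), "subjects": ["something"]}
    for i in range(1000)
]

# Serialize the rows as newline-delimited JSON and push them with a load job
# instead of the streaming insertAll API.
data = "\n".join(json.dumps(row) for row in rows).encode("utf-8")

job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON

job = client.load_table_from_file(io.BytesIO(data), table_ref, job_config=job_config)
job.result()  # wait for the load job to finish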
Reference: https://cloud.google.com/bigquery/docs/loading-data-local
@shollyman Thanks. Yes, in my script I delete and create the table, then insert data into it. I just tried using a new table ID and inserting 100 rows: right after the insert finished I ran a SELECT query and only 1 row appeared. After a while I ran the query again and all 100 rows were returned. So is it expected that newly inserted rows are unavailable for some time? How long can that take?