influxdb-client-python: Occasionally losing data points along with error message: "The batch item wasn't processed successfully because: (400) {"code":"invalid","message":"writing requires points"}"

I’ve been encountering occasional errors with a very simple Python program that writes batches of points. My usage is about as basic as it gets, so I’m not clear why this is happening. Perhaps the batching machinery is improperly creating empty batches and dropping points along the way?

I get many log messages like the following:

The batch item wasn't processed successfully because: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Date': 'Sat, 11 Apr 2020 23:14:18 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Content-Length': '54', 'Connection': 'keep-alive', 'Strict-Transport-Security': 'max-age=15724800; includeSubDomains', 'x-platform-error-code': 'invalid'})
HTTP response body: {"code":"invalid","message":"writing requires points"}

Observe how the chronograf data CSV is missing some values, such as 0, 9, 27, 30, and 36.

I’ve attached some sample code, the sample local output, and a sample CSV exported from the InfluxDB Explorer UI. The same files are also in this Gist for nicer formatting.

SampleCode.py.txt LocalOutput.txt 2020-04-11-16-47_chronograf_data.csv.txt
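
For readers who can’t open the attachments, here is a minimal sketch of the pattern in question (simplified for readability; parameter values, field names, and connection details are illustrative placeholders, and the attached SampleCode.py.txt is the real code):

import time
from datetime import datetime
from influxdb_client import InfluxDBClient, Point, WriteOptions

# Placeholders; the real values come from my configuration.
client = InfluxDBClient(url="...", token="...", org="...")
write_api = client.write_api(write_options=WriteOptions(batch_size=8, flush_interval=8))

for i in range(50):
    point = Point("pressure").tag("sensor", "sensor1").field("PSI", float(i)).time(datetime.utcnow())
    write_api.write(bucket="...", record=point)
    time.sleep(0.5)

# Note: neither the client nor the batching write_api is explicitly closed in this sketch.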

Configuration Info:

  • InfluxDB version: InfluxDB Cloud 2.0
  • influxdb_client python module version: 1.5.0
  • Python version: 3.7.3
  • OS: Raspbian Linux (Buster)

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 16 (7 by maintainers)

Most upvoted comments

The issue seems to be fixed. Please use a with statement for initializing both the client and the batching write_api; exiting the with block closes the write_api, which flushes any points still sitting in a partially filled batch.

The following script was used for testing:

import time
from datetime import datetime
from influxdb_client import InfluxDBClient, WriteOptions, Point

url = "https://us-west-2-1.aws.cloud2.influxdata.com"
token = "..."
org = "..."
bucket = "..."
measurement = "python-loosing-data_" + str(datetime.now())

# Exiting the with blocks closes the write_api (flushing any buffered batches) and then the client.
with InfluxDBClient(url=url, token=token, debug=False) as client:
    options = WriteOptions(batch_size=8, flush_interval=8, jitter_interval=0, retry_interval=1000)
    with client.write_api(write_options=options) as write_api:
        for i in range(50):
            valOne = float(i)
            valTwo = float(i) + 0.5
            pointOne = Point(measurement).tag("sensor", "sensor1").field("PSI", valOne).time(time=datetime.utcnow())
            pointTwo = Point(measurement).tag("sensor", "sensor2").field("PSI", valTwo).time(time=datetime.utcnow())

            write_api.write(bucket, org, [pointOne, pointTwo])
            print("PSI Readings: (%f, %f)" % (valOne, valTwo))
            time.sleep(0.5)

    # Count the points that actually arrived in the bucket.
    query = f'from(bucket: "{bucket}") |> range(start: 0) |> filter(fn: (r) => r["_measurement"] == "{measurement}") |> count()'
    tables = client.query_api().query(query, org)
    for table in tables:
        for record in table.records:
            print(f'{record.get_measurement()}: {record.get_field()} count: {record.get_value()}')

print("end")

Hi @bednar, SYNCHRONOUS works, thanks!
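
For anyone else who lands here, a minimal sketch of switching to the SYNCHRONOUS write path (connection details, bucket, and field names are placeholders):

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

with InfluxDBClient(url="...", token="...", org="...") as client:
    write_api = client.write_api(write_options=SYNCHRONOUS)
    point = Point("pressure").tag("sensor", "sensor1").field("PSI", 1.0)
    # A synchronous write blocks until the HTTP request completes, so nothing sits in an internal buffer.
    write_api.write(bucket="...", record=point)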

I am facing the same / very similar issue. I am using the following parameters:

batch_size=1_000, flush_interval=10, retry_interval=1_000

I tried flush intervals of 1, 5, 10, 20, 25 … and didn’t find the exact level at which I start losing data. I don’t need to flush my data this fast, but as @joeyhagedorn said, this shouldn’t occur.

Even though not all of the data is saved to InfluxDB, I don’t get any errors (debug mode is enabled). It might be a memory leak. I don’t see much load on my InfluxDB server. On my “client” server, one core runs at about 80% (I should look into whether I can parallelize the workload).
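
For completeness, those options are applied roughly like this sketch (batch_size, flush_interval, and retry_interval are the values quoted above; everything else is a placeholder):

from influxdb_client import InfluxDBClient, WriteOptions

write_options = WriteOptions(batch_size=1_000, flush_interval=10, retry_interval=1_000)

# debug=True logs every request and response, which is how I verified that no errors are reported.
with InfluxDBClient(url="...", token="...", org="...", debug=True) as client:
    with client.write_api(write_options=write_options) as write_api:
        ...  # write points here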

FYI, server specs (InfluxDB):

  • 16 virtual cores (Intel Xeon Platinum 8176)
  • 64GB RAM (about 12GB in use)
  • SSD storage

The client runs in another VM on the same server, with 32GB RAM.

I’m using client version 1.8.0.dev0; the server is InfluxDB 2.0.0-beta.10 (I will check whether beta 12 solves the issue).