promscale: duplicate data in sample

Hi, I have deployed TimescaleDB using the Helm chart, along with Promscale. The values are the defaults.

When the installation completes I can see metrics, but after a few minutes the ingest stops and Promscale logs this:

...
level=warn ts=2020-12-04T11:55:16.545Z caller=sql_ingest.go:657 msg="duplicate data in sample" table=deployment_controller_rate_limiter_use duplicate_count=1 row_count=1
level=warn ts=2020-12-04T11:55:16.545Z caller=sql_ingest.go:657 msg="duplicate data in sample" table=cloudprovider_aws_api_request_duration_seconds_bucket duplicate_count=24 row_count=24
level=warn ts=2020-12-04T11:55:16.545Z caller=sql_ingest.go:657 msg="duplicate data in sample" table=pv_collector_unbound_pv_count duplicate_count=1 row_count=1
level=warn ts=2020-12-04T11:55:16.554Z caller=sql_ingest.go:657 msg="duplicate data in sample" table=cloudprovider_aws_api_request_duration_seconds_bucket duplicate_count=48 row_count=48
level=warn ts=2020-12-04T11:55:16.567Z caller=sql_ingest.go:657 msg="duplicate data in sample" table=rest_client_request_duration_seconds_bucket duplicate_count=8 row_count=8
...

TimescaleDB logs this:

2020-12-04 12:05:33 UTC [31502]: [5fca260d.7b0e-4] postgres@prometheus,app=[unknown] [22021] STATEMENT:  SELECT * FROM _prom_catalog.get_or_create_series_id_for_kv_array($1, $2, $3)
2020-12-04 12:05:33 UTC [31505]: [5fca260d.7b11-3] postgres@prometheus,app=[unknown] [22021] ERROR:  invalid byte sequence for encoding "UTF8": 0x00
2020-12-04 12:05:33 UTC [31505]: [5fca260d.7b11-4] postgres@prometheus,app=[unknown] [22021] STATEMENT:  SELECT * FROM _prom_catalog.get_or_create_series_id_for_kv_array($1, $2, $3)
2020-12-04 12:05:33 UTC [31502]: [5fca260d.7b0e-5] postgres@prometheus,app=[unknown] [00000] LOG:  disconnection: session time: 0:00:00.030 user=postgres database=prometheus host=100.100.0.4 port=39916
2020-12-04 12:05:33 UTC [31505]: [5fca260d.7b11-5] postgres@prometheus,app=[unknown] [00000] LOG:  disconnection: session time: 0:00:00.024 user=postgres database=prometheus host=100.100.0.4 port=39922
2020-12-04 12:05:33 UTC [31503]: [5fca260d.7b0f-3] postgres@prometheus,app=[unknown] [22021] ERROR:  invalid byte sequence for encoding "UTF8": 0x00
2020-12-04 12:05:33 UTC [31503]: [5fca260d.7b0f-4] postgres@prometheus,app=[unknown] [22021] STATEMENT:  SELECT * FROM _prom_catalog.get_or_create_series_id_for_kv_array($1, $2, $3)
2020-12-04 12:05:33 UTC [31503]: [5fca260d.7b0f-5] postgres@prometheus,app=[unknown] [00000] LOG:  disconnection: session time: 0:00:00.030 user=postgres database=prometheus host=100.100.0.4 port=39918
2020-12-04 12:05:33 UTC [31463]: [5fca260b.7ae7-3] postgres@prometheus,app=[unknown] [22021] ERROR:  invalid byte sequence for encoding "UTF8": 0x00
2020-12-04 12:05:33 UTC [31463]: [5fca260b.7ae7-4] postgres@prometheus,app=[unknown] [22021] STATEMENT:  SELECT * FROM _prom_catalog.get_or_create_series_id_for_kv_array($1, $2, $3)
2020-12-04 12:05:33 UTC [31498]: [5fca260d.7b0a-3] postgres@prometheus,app=[unknown] [22021] ERROR:  invalid byte sequence for encoding "UTF8": 0x00
2020-12-04 12:05:33 UTC [31498]: [5fca260d.7b0a-4] postgres@prometheus,app=[unknown] [22021] STATEMENT:  SELECT * FROM _prom_catalog.get_or_create_series_id_for_kv_array($1, $2, $3)
2020-12-04 12:05:33 UTC [31498]: [5fca260d.7b0a-5] postgres@prometheus,app=[unknown] [00000] LOG:  disconnection: session time: 0:00:00.277 user=postgres database=prometheus host=100.100.0.4 port=39906
2020-12-04 12:05:33 UTC [31463]: [5fca260b.7ae7-5] postgres@prometheus,app=[unknown] [00000] LOG:  disconnection: session time: 0:00:01.528 user=postgres database=prometheus host=100.100.0.4 port=39840

The logs keep printing the same output over and over.

If the data is duplicated, is it not inserted into TimescaleDB? Why can't I see more data? Are all samples duplicated?

Thanks

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 18 (9 by maintainers)

Most upvoted comments

Hi @VineethReddy02

I checked all job_names in Prometheus and the error was in the kube-controller-manager metrics.

The kube-controller-manager is exposing metrics with an invalid label value:

# HELP node_collector_evictions_number [ALPHA] Number of Node evictions that happened since current instance of NodeController started.
# TYPE node_collector_evictions_number counter
node_collector_evictions_number{zone="eu-central-1:�:eu-central-1a"} 69
# HELP node_collector_unhealthy_nodes_in_zone [ALPHA] Gauge measuring number of not Ready Nodes per zones.
# TYPE node_collector_unhealthy_nodes_in_zone gauge
node_collector_unhealthy_nodes_in_zone{zone="eu-central-1:�:eu-central-1a"} 0
# HELP node_collector_zone_health [ALPHA] Gauge measuring percentage of healthy nodes per zone.
# TYPE node_collector_zone_health gauge
node_collector_zone_health{zone="eu-central-1:�:eu-central-1a"} 100
# HELP node_collector_zone_size [ALPHA] Gauge measuring number of registered Nodes per zones.
# TYPE node_collector_zone_size gauge
node_collector_zone_size{zone="eu-central-1:�:eu-central-1a"} 16

I have disabled the controller-manager metrics and now Promscale works perfectly. Thanks
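
For anyone who would rather keep the controller-manager job than disable it, a possible workaround is to drop the broken zone label with metric_relabel_configs before the samples reach remote write. This is just a sketch; the job name and target below are assumptions, not taken from my setup:

  - job_name: 'kube-controller-manager'
    static_configs:
      - targets: ['127.0.0.1:10252']   # assumed controller-manager target
    metric_relabel_configs:
      # Remove the "zone" label, whose value contains the invalid byte,
      # so the rest of the series can still be ingested by Promscale.
      - action: labeldrop
        regex: zone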

Thanks @jmvizcainoio and @Alaith for reporting this and providing the details. We will look into this shortly. Thank you very much.

Not sure, but I feel like there is some issue with the source of the data.

I’m not familiar enough with how Prometheus scrapes data and ships it to remote storage to judge.

My prometheus.yml was simply statically configured to monitor: itself, node-exporter, and cAdvisor. I’m using the latest version of Prometheus, latest node_exporter, and cAdvisor v0.38.6.

remote_write:
    - url: "http://promscale-connector:9201/write"
remote_read:
    - url: "http://promscale-connector:9201/read"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'cAdvisor'
    static_configs:
      - targets: ['cadvisor:8080']

@Harkishen-Singh I’ll upgrade later this week and give it a try, thanks for the tip.

We found the root cause of the warning level=warn ts=2020-12-09T08:29:49.339Z caller=sql_ingest.go:657 msg="duplicate data in sample" table=kube_secret_created duplicate_count=6 row_count=6

This is because cAdvisor exposes timestamps along with the values, whereas other exporters hand the task of attaching timestamps over to Prometheus. When Prometheus scrapes the cAdvisor target, cAdvisor can expose the same samples again, which causes the duplicate-samples warning log at the Promscale level. But this shouldn't be an issue, as it only logs warnings (we will be addressing this soon).
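
If the warning noise is a problem, a possible mitigation on the Prometheus side (a sketch, assuming the duplicates really do come from cAdvisor re-exposing samples with its own timestamps) is to set honor_timestamps: false on the cAdvisor job, so Prometheus stamps each sample at scrape time instead:

scrape_configs:
  - job_name: 'cAdvisor'
    # Ignore the timestamps cAdvisor attaches; Prometheus assigns its own,
    # so repeated samples no longer arrive with identical timestamps.
    honor_timestamps: false
    static_configs:
      - targets: ['cadvisor:8080']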