telegraf: Event Hub output plugin does not reconnect after a link is closed.

Relevant telegraf.conf

[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
#collection_offset = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = "0s"
quiet = false
logtarget = "file"
logfile = 'c:\TIGStack\Logs\Telegraf\secondary.log'
logfile_rotation_interval = "24h"
logfile_rotation_max_size = "100MB"
logfile_rotation_max_archives = 5
#log_with_timezone = ""

[[inputs.influxdb_v2_listener]]
service_address = "127.0.0.1:47596"
parser_type = "upstream"

[[outputs.event_hubs]]
connection_string = "$TELEGRAF_APPLIANCE_CONN_STRING"
data_format = "influx"
namedrop = ["Log"]

[[outputs.event_hubs]]
connection_string = "$TELEGRAF_LOG_CONN_STRING"
data_format = "influx"
namepass = ["Log"]

Logs from Telegraf

2023-09-28T11:56:52Z I! [agent] Stopping running outputs
2023-09-28T11:56:52Z E! [outputs.event_hubs] Error closing output: *Error{Condition: amqp:link:detach-forced, Description: Idle link tracker, link 2ISmltw7WTvmN5IWAIrCa-3LAQRhlW59fRv2EnwINjiWbhncEQscXw has been idle for 1800000ms TrackingId:ba737198-96fa-4dfd-9eb5-ea33c595e045_G23, SystemTracker:numove:EventHub:statistics-logs, Timestamp:2023-09-27T20:12:16, Info: map[]}
2023-09-28T11:56:57Z I! Starting Telegraf 1.27.0
2023-09-28T11:56:57Z I! Available plugins: 237 inputs, 9 aggregators, 28 processors, 23 parsers, 59 outputs, 4 secret-stores
2023-09-28T11:56:57Z I! Loaded inputs: influxdb_v2_listener
2023-09-28T11:56:57Z I! Loaded aggregators: 
2023-09-28T11:56:57Z I! Loaded processors: 
2023-09-28T11:56:57Z I! Loaded secretstores: 
2023-09-28T11:56:57Z I! Loaded outputs: event_hubs (2x)
2023-09-28T11:56:57Z I! Tags enabled: host=nulogik-pnp1-dcmvgn
2023-09-28T11:56:57Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"nulogik-pnp1-dcmvgn", Flush Interval:10s
2023-09-28T11:56:57Z I! [inputs.influxdb_v2_listener] Started HTTP listener service on 127.0.0.1:47596
2023-10-02T16:50:01Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:50:11Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:50:21Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:50:31Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:50:41Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:50:51Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:51:01Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:51:11Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
...
2023-10-02T16:57:01Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:57:11Z W! [outputs.event_hubs] Metric buffer overflow; 156 metrics have been dropped
2023-10-02T16:57:11Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:57:21Z W! [outputs.event_hubs] Metric buffer overflow; 217 metrics have been dropped
2023-10-02T16:57:21Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-02T16:57:31Z W! [outputs.event_hubs] Metric buffer overflow; 217 metrics have been dropped
...
2023-10-03T15:47:22Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-03T15:47:32Z W! [outputs.event_hubs] Metric buffer overflow; 217 metrics have been dropped
2023-10-03T15:47:32Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-03T15:47:42Z I! [agent] Hang on, flushing any cached metrics before shutdown
2023-10-03T15:47:42Z E! [agent] Error writing to outputs.event_hubs: amqp: link closed
2023-10-03T15:47:42Z I! [agent] Stopping running outputs
2023-10-03T15:47:42Z E! [outputs.event_hubs] Error closing output: *Error{Condition: amqp:link:detach-forced, Description: Idle link tracker, link JJjcz1sZ1TkH0PYZadktROEGnizZMpyQ0EY7ckNfCuMs5f5xAnr22g has been idle for 1800000ms TrackingId:c4dbe1ce-69dd-46d4-820c-19b83bbf8f9c_G11, SystemTracker:numove:EventHub:statistics-logs, Timestamp:2023-10-02T22:23:44, Info: map[]}

System info

Telegraf 1.27.0, Windows 11 Pro 22H2, NSSM 2.24

Docker

No response

Steps to reproduce

  1. Start Telegraf with an Event Hubs output.
  2. Produce data for multiple days (the last run lasted 363184 s before it stopped producing).
  3. Data stops being produced because of the AMQP "link closed" error.

Expected behavior

The AMQP link should be re-established automatically when it times out or is forcibly closed. Failing that, Telegraf should exit so that it can be restarted.
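
To make the expected behavior concrete, here is a minimal sketch of "reconnect when the link is closed". It is not Telegraf's code and does not use the Event Hubs SDK. `LinkClosedError`, `ReconnectingSender` and the `connect` factory are hypothetical names introduced only for this illustration, and it is written in Python purely for brevity.

```python
class LinkClosedError(Exception):
    """Hypothetical stand-in for the 'amqp: link closed' error seen in the logs."""


class ReconnectingSender:
    """Wraps a sender factory and opens a fresh sender whenever the link is closed."""

    def __init__(self, connect, max_attempts=3):
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        # `connect` is an assumed factory returning an object with a send(batch) method.
        self._connect = connect
        self._max_attempts = max_attempts
        self._sender = None

    def send(self, batch):
        last_error = None
        for _ in range(self._max_attempts):
            if self._sender is None:
                # No usable link: open a new one before writing.
                self._sender = self._connect()
            try:
                return self._sender.send(batch)
            except LinkClosedError as err:
                # The link is dead; discard it so the next attempt reconnects.
                last_error = err
                self._sender = None
        # Every attempt failed: surface the error so the caller can retry later or exit.
        raise last_error
```

With a wrapper like this, a write that hits a closed link would retry on a new link instead of failing on every flush interval. Raising after the last attempt keeps the second option open: the process can still stop and be restarted by the service manager.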

Actual behavior

Telegraf logs the error on every flush but never reconnects, and the metrics it was supposed to send are dropped once the buffer overflows.
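
A rough estimate of how long the buffer can absorb a failing output, using `metric_buffer_limit = 10000` and `flush_interval = "10s"` from the config above. The ingest rate of 217 metrics per flush interval is an assumption, inferred from the steady "217 metrics have been dropped" warnings. The function name is hypothetical.

```python
def seconds_until_overflow(buffer_limit, metrics_per_flush, flush_interval_s):
    """Estimate how long a failing output can buffer metrics before dropping any."""
    if metrics_per_flush <= 0:
        raise ValueError("metrics_per_flush must be positive")
    # Number of flush intervals needed to fill the buffer, converted to seconds.
    return buffer_limit / metrics_per_flush * flush_interval_s


# Values from the config; 217 metrics per interval is inferred from the logs.
estimate = seconds_until_overflow(10_000, 217, 10)
print(round(estimate))  # 461
```

That is about 461 s, in the same range as the roughly 430 s between the first "link closed" error (16:50:01) and the first overflow warning (16:57:11). In other words, under these assumptions the buffer only covers an outage of a few minutes, and everything after that is lost.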

Additional info

No response

About this issue

  • State: closed
  • Created 9 months ago
  • Comments: 20 (9 by maintainers)

Most upvoted comments

Actually, no. Please file a new issue with the full logs; this isn’t the same issue.