vector: thread 'tokio-runtime-worker' panic

Vector Version

vector 0.11.0 (g50a9c3a x86_64-unknown-linux-musl 2020-09-10)

Vector Configuration File

data_dir = "/vector-data-dir"
[log_schema]
  host_key = "host"
  message_key = "message"
  source_type_key = "source_type"
  timestamp_key = "timestamp"
[sources.internal_metrics]
  type = "internal_metrics"
[sources.kafka]
  type = "kafka"
  bootstrap_servers = "logging-kafka-bootstrap.logging-kafka:9092"
  group_id = "test-vector"
  topics = ["test-vector"]
  
[transforms.message_parser]
  type = "json_parser"
  inputs = ["kafka"]
  field = "message"
  drop_invalid = true
  
[sinks.elasticsearch]
  type = "elasticsearch"
  inputs = ["message_parser"]
  host = "http://logging-es-http.logging:9200"
  index = "logstash-vector-test"
  auth.strategy = "basic"
  auth.password = "${ELASTICSEARCH_PASSWORD}"
  auth.user = "${ELASTICSEARCH_USER}"
  # https://github.com/timberio/vector/pull/3716
  #encoding.except_fields = ["index"]
  
[sinks.prometheus]
  type = "prometheus"
  inputs = ["internal_metrics"]
  address = "0.0.0.0:9598"
  namespace = "vector"

Debug Output

https://gist.github.com/spencergilbert/52f59f015769e19f4723a08da9c29f2f

Expected Behavior

No panic!

Actual Behavior

Panic!

Example Data

Additional Context

References

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 19 (18 by maintainers)

Commits related to this issue

Most upvoted comments

I’ve jinxed myself of course:

Sep 22 16:21:34.385  WARN sink{name=elasticsearch type=elasticsearch}: vector::sinks::util::sink: Linger polled after ready. clues=[Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: false }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: tr
ue }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }]
Sep 22 16:21:34.385  WARN sink{name=elasticsearch type=elasticsearch}: vector::sinks::util::sink: Linger polled after ready. clues=[Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: false }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: tr
ue }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }]
Sep 22 16:21:34.385  WARN sink{name=elasticsearch type=elasticsearch}: vector::sinks::util::sink: Linger polled after ready. clues=[Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: false }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: tr
ue }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }]
Sep 22 16:21:34.385  WARN sink{name=elasticsearch type=elasticsearch}: vector::sinks::util::sink: Linger polled after ready. clues=[Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: false }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: tr
ue }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }]
Sep 22 16:21:34.385  WARN sink{name=elasticsearch type=elasticsearch}: vector::sinks::util::sink: Linger polled after ready. clues=[Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: false }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: tr
ue }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }]
Sep 22 16:21:34.389  WARN sink{name=elasticsearch type=elasticsearch}: vector::sinks::util::sink: Linger polled after ready. clues=[Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: false }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: tr
ue }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }, Clue { caller: "linger_elapsed", closing: false, batch_was_full: false, batch_is_empty: false, service_was_not_ready: true }]

I’ve just merged #4024, which has two parts:

  1. It should catch and avoid the immediate cause of this panic, allowing vector to continue working normally.
  2. It will collect data about the circumstances leading up to this panic and log it out when the panic would have occurred, which will hopefully help us track down the root cause.

Once this going out in the next nightly, it’d be great if either of you were able to deploy to environment where you’ve been seeing this issue, see if it continues to occur, and send us the output of any log lines containing Linger polled after ready.

@spencergilbert We just probably made a fix, can you try it out? It’s on our main branch now, sadly not a nightly yet. (Tomorrow!)

Sure @MOZGIII either hit me up in the discord or details here and I’d be happy to. I’m currently unable to reproduce this with the 0.10.0-debian image

I’ll be running some tests this morning - if there is anything helpful I can do to help debug let me know and I’ll try and make time for it

I checked the history and didn’t find the relevant mentions of a fix too. I just saw a mention of the fix in Discord, so take it with a grain of salt. This definitely needs to be properly verified.