vector: Kafka sink failure with 0.35.0
A note for the community
- Please vote on this issue by adding a ๐ reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Problem
Weโve upgraded our vector pods from v0.30.0 (chart 0.26.0) to 0.35.0 (chart: 0.30.0). Vector fails and the pod restart (crash loop backoff).
It seems related to the kafka sink (see related logs). Iโve tested the 0.34.2 (chart: 0.29.3) successfully.
Configuration
---
log_schema:
source_type: x-source_type
timestamp_key: '@timestamp'
timezone: Etc/UTC
sources:
gateway:
type: logstash
address: 0.0.0.0:5044
transforms:
remap_source_gateway:
type: remap
inputs: [gateway]
source: |
%kafka_topic = "systemlogs"
sinks:
kafka:
type: kafka
inputs: [remap_source_gateway]
bootstrap_servers: ${KAFKA_HOST}
topic: '{{ %kafka_topic }}'
compression: none
encoding:
codec: json
healthcheck:
enabled: false
Version
0.35.0
Debug Output
2024-01-15T15:01:48.241263Z DEBUG source{component_kind="source" component_id=gateway component_type=logstash}:connection{peer_addr=10.89.18.95:56433}: vector::sources::util::net::tcp: Connection closed.
thread 'vector-worker' panicked at 'called `Option::unwrap()` on a `None` value', /cargo/registry/src/index.crates.io-6f17d22bba15001f/lockfree-object-pool-0.1.4/src/linear_page.rs:48:30
stack backtrace:
0: 0x128a36c - std::backtrace_rs::backtrace::libunwind::trace::hc7cb25c924fb6eba
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
1: 0x128a36c - std::backtrace_rs::backtrace::trace_unsynchronized::h5c4ce001d8eb6854
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x128a36c - std::sys_common::backtrace::_print_fmt::h14d9ae8d6479967a
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/sys_common/backtrace.rs:65:5
3: 0x128a36c - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h02c30a315588511a
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/sys_common/backtrace.rs:44:22
4: 0xa13948 - core::fmt::rt::Argument::fmt::hd5ccf3711fc53a4f
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/core/src/fmt/rt.rs:138:9
5: 0xa13948 - core::fmt::write::h2876fd52ca55dbfa
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/core/src/fmt/mod.rs:1094:21
6: 0x12523e4 - std::io::Write::write_fmt::h3391f6b02b293028
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/io/mod.rs:1714:15
7: 0x128cfc4 - std::sys_common::backtrace::_print::h23980dcdfc2fe28a
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/sys_common/backtrace.rs:47:5
8: 0x128cfc4 - std::sys_common::backtrace::print::h49b97a66db4af001
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/sys_common/backtrace.rs:34:9
9: 0x128cbcc - std::panicking::default_hook::{{closure}}::h7138645749f92527
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/panicking.rs:269:22
10: 0x128db04 - std::panicking::default_hook::h691846cb3f034db3
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/panicking.rs:288:9
11: 0x128db04 - std::panicking::rust_panic_with_hook::hb3c04ea23b04721e
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/panicking.rs:705:13
12: 0x128d604 - std::panicking::begin_panic_handler::{{closure}}::hc4edb1e6fd5a6b0f
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/panicking.rs:595:13
13: 0x128d59c - std::sys_common::backtrace::__rust_end_short_backtrace::hb6a6277e20eb73e1
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/sys_common/backtrace.rs:151:18
14: 0x128d590 - rust_begin_unwind
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/panicking.rs:593:5
15: 0x424870 - core::panicking::panic_fmt::hfcaa20ee741d1cc5
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/core/src/panicking.rs:67:14
16: 0x42498c - core::panicking::panic::h31e7a12887ab7d97
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/core/src/panicking.rs:117:5
17: 0x3348b64 - <tracing_subscriber::layer::layered::Layered<L,S> as tracing_core::subscriber::Subscriber>::new_span::h9a00643ebd4f75b8
18: 0x334b124 - <tracing_subscriber::layer::layered::Layered<L,S> as tracing_core::subscriber::Subscriber>::new_span::h2c1f9010dda681f0
19: 0x13e1128 - tracing::span::Span::make_with::h0946854b8046d955
20: 0x1c5eda4 - <vector::sinks::kafka::sink::KafkaSink as vector_core::sink::StreamSink<vector_core::event::Event>>::run::{{closure}}::h21f35518df6934ca
21: 0x1c5c8e4 - <vector_core::sink::EventStream<T> as vector_core::sink::StreamSink<vector_core::event::array::EventArray>>::run::{{closure}}::ha31ca5ad7a9a3402
22: 0x320ecc8 - vector::topology::builder::Builder::build_sinks::{{closure}}::{{closure}}::h03ab451c62056c46
23: 0x31c5948 - <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::heb356ec0acaec9ef
24: 0x31c53bc - tokio::runtime::task::raw::poll::hd3c4a0678998574d
25: 0x12d2128 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::hb43a1ab74e3daf92
26: 0x12d78e4 - tokio::runtime::task::raw::poll::h9fd75a3a647deed6
27: 0x12bba50 - std::sys_common::backtrace::__rust_begin_short_backtrace::h2b66fe48017742c2
28: 0x12bb690 - core::ops::function::FnOnce::call_once{{vtable.shim}}::hedd6b07b50b25d09
29: 0x129069c - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h407d789be6f3d62c
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/alloc/src/boxed.rs:1993:9
30: 0x129069c - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hf45fc529abdd7c00
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/alloc/src/boxed.rs:1993:9
31: 0x129069c - std::sys::unix::thread::Thread::new::thread_start::hcd656e077ad34229
at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/sys/unix/thread.rs:108:17
2024-01-15T15:01:48.333995Z DEBUG source{component_kind="source" component_id=gateway component_type=logstash}:connection{peer_addr=10.89.54.188:36347}: vector::sources::util::net::tcp: Accepted a new connection. peer_addr=10.89.54.188:36347
2024-01-15T15:01:48.340134Z DEBUG source{component_kind="source" component_id=gateway component_type=logstash}:connection{peer_addr=10.89.54.188:36347}: vector::sources::util::net::tcp: Connection closed.
2024-01-15T15:01:48.453506Z ERROR sink{component_kind="sink" component_id=kafka component_type=kafka}: vector::topology: An error occurred that Vector couldn't handle: the task panicked and was aborted.
2024-01-15T15:01:48.453568Z INFO vector: Vector has stopped.
2024-01-15T15:01:48.453640Z DEBUG source{component_kind="source" component_id=gateway component_type=logstash}:connection{peer_addr=10.89.37.245:7823}: vector::sources::util::net::tcp: Start graceful shutdown.
2024-01-15T15:01:48.453652Z DEBUG source{component_kind="source" component_id=gateway component_type=logstash}:connection{peer_addr=10.89.62.209:14033}: vector::sources::util::net::tcp: Start graceful shutdown.
2024-01-15T15:01:48.453739Z DEBUG source{component_kind="source" component_id=gateway component_type=logstash}: vector::topology::builder: Source finished normally.
2024-01-15T15:01:48.454775Z INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="remap_source_gateway, gateway" time_remaining="59 seconds left"
2024-01-15T15:01:49.421134Z ERROR source{component_kind="source" component_id=gateway component_type=logstash}:connection{peer_addr=10.89.37.245:7823}: vector::internal_events::tcp: Error writing acknowledgement, dropping connection. error=Broken pipe (os error 32) error_code="ack_failed" error_type="writer_failed" stage="sending" internal_log_rate_limit=true
2024-01-15T15:01:49.421383Z DEBUG transform{component_kind="transform" component_id=remap_source_gateway component_type=remap}: vector::topology::builder: Synchronous transform finished with an error.
2024-01-15T15:01:49.421399Z ERROR transform{component_kind="transform" component_id=remap_source_gateway component_type=remap}: vector::topology: An error occurred that Vector couldn't handle: receiver disconnected.
2024-01-15T15:01:53.454051Z INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="gateway" time_remaining="54 seconds left"
2024-01-15T15:01:58.454189Z INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="gateway" time_remaining="49 seconds left"
2024-01-15T15:02:03.454918Z INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="gateway" time_remaining="44 seconds left"
2024-01-15T15:02:08.454973Z INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="gateway" time_remaining="39 seconds left"
2024-01-15T15:02:13.454684Z INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="gateway" time_remaining="34 seconds left"
2024-01-15T15:02:18.454653Z INFO vector::topology::running: Shutting down... Waiting on running components. remaining_components="gateway" time_remaining="29 seconds left"
2024-01-15T15:02:18.454660Z INFO source{component_kind="source" component_id=gateway component_type=logstash}:connection{peer_addr=10.89.62.209:14033}: vector::sources::util::net::tcp: Resetting connection (still open after seconds). seconds=30s
2024-01-15T15:02:18.454729Z DEBUG source{component_kind="source" component_id=gateway component_type=logstash}: vector::topology::builder: Source pump finished normally.
2024-01-15T15:02:18.454758Z DEBUG source{component_kind="source" component_id=gateway component_type=logstash}: vector::topology::builder: Source pump supervisor task finished normally.
Example Data
No response
Additional Context
No response
References
No response
About this issue
- Original URL
- State: closed
- Created 5 months ago
- Reactions: 3
- Comments: 24 (9 by maintainers)
Commits related to this issue
- chore(deps): Update lockfree-object-pool to 0.1.5 We think this will fix #19627. Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com> — committed to vectordotdev/vector by jszwedko 4 months ago
- chore(deps): Update lockfree-object-pool to 0.1.5 We think this will fix #19627. Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com> — committed to vectordotdev/vector by jszwedko 4 months ago
- chore(deps): Update lockfree-object-pool to 0.1.5 (#20001) We think this will fix #19627. Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com> — committed to vectordotdev/vector by jszwedko 4 months ago
- chore(deps): Update lockfree-object-pool to 0.1.5 (#20001) We think this will fix #19627. Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com> — committed to vectordotdev/vector by jszwedko 4 months ago
- chore(deps): Update lockfree-object-pool to 0.1.5 (#20001) We think this will fix #19627. Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com> — committed to vectordotdev/vector by jszwedko 4 months ago
- [pull] master from vectordotdev:master (#3) * chore(dev): Update release instructions for deploying vector.dev (#19925) We've switched to having a long-running branch representing what to deploy t... — committed to threatcode/vector by pull[bot] 2 months ago
Thanks for testing! Weโll take a closer look at that commit and see if anything obviously jumps out.
Yes sure. Thx again, I canโt today. but I will do next Monday