istio: Envoy segmentation fault
Bug description
When making HTTP requests to a specific deployment of ours, Envoy sometimes crashes with a segmentation fault. I can only reproduce the behavior when coming through our ingressgateway, not from other pods in the mesh.
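Roughly how I trigger it (a hedged sketch; the external hostname and the helper pod name are illustrative, not from the logs — only the deck.default service name comes from the Envoy command line below):

$ # via the ingressgateway: intermittently crashes the sidecar
$ curl -v http://deck.example.com/
$ # directly from another pod in the mesh: never crashes
$ kubectl exec -it some-other-pod -- curl -v http://deck.default.svc.cluster.local/

When it crashes, the sidecar logs the following: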
[2019-05-23 07:44:01.170][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:81] Caught Segmentation fault, suspect faulting address 0x0
[2019-05-23 07:44:01.170][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:69] Backtrace (use tools/stack_decode.py to get line numbers):
[2019-05-23 07:44:01.170][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #0: __restore_rt [0x7f71d1558390]
[2019-05-23 07:44:01.173][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #1: Envoy::Network::FilterManagerImpl::onRead() [0x8dd85a]
[2019-05-23 07:44:01.175][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #2: Envoy::Network::ConnectionImpl::onReadReady() [0x8da39e]
[2019-05-23 07:44:01.177][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #3: Envoy::Network::ConnectionImpl::onFileEvent() [0x8d9e71]
[2019-05-23 07:44:01.179][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #4: Envoy::Event::FileEventImpl::assignEvents()::$_0::__invoke() [0x8d5035]
[2019-05-23 07:44:01.181][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #5: event_process_active_single_queue [0xc399bd]
[2019-05-23 07:44:01.184][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #6: event_base_loop [0xc37f70]
[2019-05-23 07:44:01.186][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #7: Envoy::Event::DispatcherImpl::run() [0x8d462d]
[2019-05-23 07:44:01.188][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #8: Envoy::Server::WorkerImpl::threadRoutine() [0x8cf052]
[2019-05-23 07:44:01.190][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #9: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::$_0::__invoke() [0xda0205]
[2019-05-23 07:44:01.190][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #10: start_thread [0x7f71d154e6ba]
2019-05-23T07:44:01.193825Z warn Epoch 0 terminated with an error: signal: segmentation fault
2019-05-23T07:44:01.193857Z warn Aborted all epochs
2019-05-23T07:44:01.193889Z info Epoch 0: set retry delay to 12.8s, budget to 3
2019-05-23T07:44:02.847794Z info Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:04.847805Z info Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:06.847917Z info Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:08.847970Z info Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:10.847883Z info Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:12.847867Z info Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:13.993991Z info Reconciling retry (budget 3)
2019-05-23T07:44:13.994065Z info Epoch 0 starting
2019-05-23T07:44:13.994672Z info Envoy command: [-c /etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster deck.default --service-node sidecar~100.118.0.65~deck-696b9cbfd5-tj96t.default~default.svc.cluster.local --max-obj-name-len 189 --allow-unknown-fields -l warning --concurrency 2]
[2019-05-23 07:44:14.010][113][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-05-23 07:44:14.010][113][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-05-23 07:44:14.010][113][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-05-23 07:44:14.013][113][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, no healthy upstream
[2019-05-23 07:44:14.013][113][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:49] Unable to establish new stream
[2019-05-23 07:44:14.477][113][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Listener.use_original_dst' from file lds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
2019-05-23T07:44:14.848938Z info Envoy proxy is ready
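For anyone wanting to symbolize the frames above: a minimal sketch using binutils addr2line against an unstripped envoy binary of the exact same build (the binary path is an assumption; Envoy's tools/stack_decode.py automates essentially this lookup):

$ # -e: binary to look up, -f: print function names, -C: demangle C++ symbols
$ addr2line -e /path/to/envoy-with-symbols -f -C 0x8dd85a 0x8da39e 0x8d9e71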
Expected behavior
I’d expect all requests to go through without crashing Envoy in this deployment, just as they do for all our other deployments.
Steps to reproduce the bug
I am a bit unsure about this one. I did experiment with Mixer rules and handlers, but have removed them all from the namespace this deployment is in, as they caused it to become unstable (see the sketch below for how the objects can be listed and removed). Since removing them, I’ve also recreated the deployment pods and the ingressgateways, to no avail. Other deployments in the namespace work just fine.
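A hedged sketch of the cleanup, assuming the Istio 1.1 Mixer CRDs under config.istio.io/v1alpha2 (the rule name is illustrative):

$ # list Mixer rules, handlers, and instances left in the namespace
$ kubectl get rules.config.istio.io,handlers.config.istio.io,instances.config.istio.io -n default
$ # delete a leftover rule by name
$ kubectl delete rules.config.istio.io my-rule -n default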
Version (include the output of istioctl version --remote and kubectl version)
$ istioctl version --remote
client version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.6-6-geec7a74"}
citadel version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
galley version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
ingressgateway version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.6-6-geec7a74"}
ingressgateway version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.6-6-geec7a74"}
pilot version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
pilot version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
policy version: version.BuildInfo{Version:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", GitRevision:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", User:"root", Host:"0f6eb930-7b59-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.7-3-gf85fb7d"}
policy version: version.BuildInfo{Version:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", GitRevision:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", User:"root", Host:"0f6eb930-7b59-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.7-3-gf85fb7d"}
sidecar-injector version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
telemetry version: version.BuildInfo{Version:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", GitRevision:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", User:"root", Host:"0f6eb930-7b59-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.7-3-gf85fb7d"}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-30T21:39:16Z", GoVersion:"go1.11.1", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.12", GitCommit:"c757b93cf034d49af3a3b8ecee3b9639a7a11df7", GitTreeState:"clean", BuildDate:"2018-12-19T11:04:29Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
How was Istio installed?
Using the 1.1.7 Helm index and charts.
Environment where bug was observed (cloud vendor, OS, etc)
kops cluster on AWS.
Affected product area (please put an X in all that apply)
[ ] Configuration Infrastructure
[ ] Docs
[ ] Installation
[ ] Networking
[ ] Performance and Scalability
[X] Policies and Telemetry
[ ] Security
[ ] Test and Release
[ ] User Experience
I am a bit unsure whether I ticked the correct checkboxes above, but my take on the issue is that some handler/rule/policy is breaking the deployment, as there were no issues before I started experimenting with them.
About this issue
- State: closed
- Created 5 years ago
- Comments: 21 (11 by maintainers)
@Multiply I think this was likely fixed in Istio 1.1.9. Please try it (I recommend trying 1.1.14 actually) and reopen this bug if that doesn’t address your problem. Thanks
@Multiply sorry, I pointed you at the wrong binary, it should be this one instead: https://storage.googleapis.com/istio-build/proxy/envoy-symblol-73fa9b1f29f91029cc2485a685994a0d1dbcde21.tar.gz (symbol, not alpha).
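A hedged sketch of using that symbol build (the path of the binary inside the archive is an assumption):

$ curl -LO https://storage.googleapis.com/istio-build/proxy/envoy-symblol-73fa9b1f29f91029cc2485a685994a0d1dbcde21.tar.gz
$ tar -xzf envoy-symblol-73fa9b1f29f91029cc2485a685994a0d1dbcde21.tar.gz
$ # then decode the crash addresses as in the earlier sketch, e.g.:
$ addr2line -e usr/local/bin/envoy -f -C 0x8dd85a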