istio: Envoy segmentation fault

Bug description When making HTTP requests to a specific deployment of ours, Envoy sometimes crashes with a segmentation fault. I can only reproduce the behavior when coming through our ingressgateway, not from other pods in the mesh.

[2019-05-23 07:44:01.170][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:81] Caught Segmentation fault, suspect faulting address 0x0
[2019-05-23 07:44:01.170][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:69] Backtrace (use tools/stack_decode.py to get line numbers):
[2019-05-23 07:44:01.170][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #0: __restore_rt [0x7f71d1558390]
[2019-05-23 07:44:01.173][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #1: Envoy::Network::FilterManagerImpl::onRead() [0x8dd85a]
[2019-05-23 07:44:01.175][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #2: Envoy::Network::ConnectionImpl::onReadReady() [0x8da39e]
[2019-05-23 07:44:01.177][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #3: Envoy::Network::ConnectionImpl::onFileEvent() [0x8d9e71]
[2019-05-23 07:44:01.179][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #4: Envoy::Event::FileEventImpl::assignEvents()::$_0::__invoke() [0x8d5035]
[2019-05-23 07:44:01.181][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #5: event_process_active_single_queue [0xc399bd]
[2019-05-23 07:44:01.184][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #6: event_base_loop [0xc37f70]
[2019-05-23 07:44:01.186][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #7: Envoy::Event::DispatcherImpl::run() [0x8d462d]
[2019-05-23 07:44:01.188][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #8: Envoy::Server::WorkerImpl::threadRoutine() [0x8cf052]
[2019-05-23 07:44:01.190][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #9: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::$_0::__invoke() [0xda0205]
[2019-05-23 07:44:01.190][107][critical][backtrace] [bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #10: start_thread [0x7f71d154e6ba]
2019-05-23T07:44:01.193825Z	warn	Epoch 0 terminated with an error: signal: segmentation fault
2019-05-23T07:44:01.193857Z	warn	Aborted all epochs
2019-05-23T07:44:01.193889Z	info	Epoch 0: set retry delay to 12.8s, budget to 3
2019-05-23T07:44:02.847794Z	info	Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:04.847805Z	info	Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:06.847917Z	info	Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:08.847970Z	info	Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:10.847883Z	info	Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:12.847867Z	info	Envoy proxy is NOT ready: failed retrieving Envoy stats: Get http://127.0.0.1:15000/stats?usedonly: dial tcp 127.0.0.1:15000: connect: connection refused
2019-05-23T07:44:13.993991Z	info	Reconciling retry (budget 3)
2019-05-23T07:44:13.994065Z	info	Epoch 0 starting
2019-05-23T07:44:13.994672Z	info	Envoy command: [-c /etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster deck.default --service-node sidecar~100.118.0.65~deck-696b9cbfd5-tj96t.default~default.svc.cluster.local --max-obj-name-len 189 --allow-unknown-fields -l warning --concurrency 2]
[2019-05-23 07:44:14.010][113][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-05-23 07:44:14.010][113][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-05-23 07:44:14.010][113][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-05-23 07:44:14.013][113][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, no healthy upstream
[2019-05-23 07:44:14.013][113][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:49] Unable to establish new stream
[2019-05-23 07:44:14.477][113][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Listener.use_original_dst' from file lds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
2019-05-23T07:44:14.848938Z	info	Envoy proxy is ready
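
The backtrace lines above carry raw addresses, and the log itself suggests using tools/stack_decode.py from the Envoy repo to resolve them. As a minimal sketch (the sample line below is copied from the crash log above; in practice you would grep the full pod log saved as, say, proxy.log), the frame symbols and addresses can be pulled out like this:

```shell
# Write one sample backtrace line (taken from the crash log above) to a file;
# with a real crash, save the whole proxy container log here instead.
cat > proxy.log <<'EOF'
[2019-05-23 07:44:01.173][107][critical][backtrace] [backtrace.h:73] #1: Envoy::Network::FilterManagerImpl::onRead() [0x8dd85a]
EOF

# Extract just the "#N: symbol [0xaddr]" frames, ready to feed to
# tools/stack_decode.py (which needs a matching binary with symbols).
grep -o '#[0-9]*: [^[]*\[0x[0-9a-f]*\]' proxy.log
```

This only isolates the frames; actual line-number resolution still requires a symbolized Envoy binary matching the crashing build.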

Expected behavior I’d expect all requests to go through without crashing Envoy in the deployment, just as they do for all our other deployments.

Steps to reproduce the bug I am a bit unsure on this one. I did play around with rules and handlers, but have since removed them all from the namespace this deployment is in, as they caused it to be unstable. Since removing them, I’ve also recreated the deployment pods and the ingressgateways, to no avail. Other deployments in the namespace work just fine.
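
To double-check that no Mixer config is left behind in the namespace, one could list the relevant custom resources; a sketch, assuming the Istio 1.1 Mixer CRD names (rules/handlers under config.istio.io) and a placeholder namespace name:

```shell
# List any leftover Mixer rules/handlers in the namespace ("my-namespace" is a
# placeholder). The fallback echo keeps this runnable without a cluster.
kubectl get rules.config.istio.io,handlers.config.istio.io -n my-namespace 2>/dev/null \
  || echo "kubectl not available or namespace not found"
```

An empty result ("No resources found") would confirm the rules and handlers really were removed, which makes the lingering crash more surprising.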

Version (include the output of istioctl version --remote and kubectl version)

$ istioctl version --remote
client version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.6-6-geec7a74"}
citadel version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
galley version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
ingressgateway version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.6-6-geec7a74"}
ingressgateway version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.6-6-geec7a74"}
pilot version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
pilot version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
policy version: version.BuildInfo{Version:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", GitRevision:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", User:"root", Host:"0f6eb930-7b59-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.7-3-gf85fb7d"}
policy version: version.BuildInfo{Version:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", GitRevision:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", User:"root", Host:"0f6eb930-7b59-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.7-3-gf85fb7d"}
sidecar-injector version: version.BuildInfo{Version:"1.1.7", GitRevision:"eec7a74473deee98cad0a996f41a32a47dd453c2-dirty", User:"root", Host:"341b3bf0-76ac-11e9-b644-0a580a2c0404", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.6-6-geec7a74"}
telemetry version: version.BuildInfo{Version:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", GitRevision:"f85fb7d434d62f4f286b9b5de975549056b4c8b8", User:"root", Host:"0f6eb930-7b59-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.7-3-gf85fb7d"}

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-30T21:39:16Z", GoVersion:"go1.11.1", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.12", GitCommit:"c757b93cf034d49af3a3b8ecee3b9639a7a11df7", GitTreeState:"clean", BuildDate:"2018-12-19T11:04:29Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

How was Istio installed? Using the 1.1.7 Helm index and charts.

Environment where bug was observed (cloud vendor, OS, etc.) A kops cluster on AWS.

Affected product area (please put an X in all that apply)

[ ] Configuration Infrastructure
[ ] Docs
[ ] Installation
[ ] Networking
[ ] Performance and Scalability
[X] Policies and Telemetry
[ ] Security
[ ] Test and Release
[ ] User Experience

I am a bit unsure if I ticked the correct checkboxes above, but my take on the issue is that some handler/rule/policy is breaking the deployment, as there were no issues prior to toying around with them.

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 21 (11 by maintainers)

Most upvoted comments

@Multiply I think this was likely fixed in Istio 1.1.9. Please try it (I recommend trying 1.1.14 actually) and reopen this bug if that doesn’t address your problem. Thanks

@Multiply sorry, I pointed you at the wrong binary, it should be this one instead: https://storage.googleapis.com/istio-build/proxy/envoy-symblol-73fa9b1f29f91029cc2485a685994a0d1dbcde21.tar.gz (symbol, not alpha).