istio: Telemetry resulting from injected faults is wrong.
Describe the bug
Telemetry resulting from injected faults is wrong.
I have Istio 1.1-snapshot.4 installed, with bookinfo. I then added my own fault injection rule using this yaml:
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: details-abort-vs
spec:
  hosts:
  - details
  http:
  - fault:
      abort:
        httpStatus: 555
        percent: 100
    route:
    - destination:
        host: details
        subset: details-abort-dr-subset-v1
  - route:
    - destination:
        host: details
        subset: details-abort-dr-subset-v1
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: details-abort-dr
spec:
  host: details
  subsets:
  - name: details-abort-dr-subset-v1
    labels:
      version: v1
When I view this traffic in Kiali, I see some really odd connectivity:
The red edge from the “productpage” node to the “details” node is correct: that is the client-side injected fault. However, there should not be red edges from the “details” node to the “reviews” node. For one thing, bookinfo’s “details” never calls reviews in the first place. For another, these are client-side injected faults, so the request should never have reached any workload or app node.
Looking at the telemetry in Prometheus, I see that Kiali is indeed graphing the telemetry correctly. However, the telemetry itself is incorrect. Here are the Prometheus query results:
Look at the timeseries - how is it possible that “productpage” (source_app) is sending to a destination service of “details” but that request is going to a destination_workload of “reviews-v[123]”? Those three timeseries with destination_workload of “reviews-v[123]” should not be there - they are wrong.
Here is the full telemetry for those four timeseries - using prom query istio_requests_total{source_app="productpage",response_code="555"}
istio_requests_total{connection_security_policy="unknown",destination_app="reviews",destination_principal="unknown",destination_service="details.bookinfo.svc.cluster.local",destination_service_name="details",destination_service_namespace="bookinfo",destination_version="v1",destination_workload="reviews-v1",destination_workload_namespace="bookinfo",instance="172.17.0.9:42422",job="istio-mesh",permissive_response_code="none",permissive_response_policyid="none",reporter="source",request_protocol="http",response_code="555",source_app="productpage",source_principal="unknown",source_version="v1",source_workload="productpage-v1",source_workload_namespace="bookinfo"}
istio_requests_total{connection_security_policy="unknown",destination_app="reviews",destination_principal="unknown",destination_service="details.bookinfo.svc.cluster.local",destination_service_name="details",destination_service_namespace="bookinfo",destination_version="v2",destination_workload="reviews-v2",destination_workload_namespace="bookinfo",instance="172.17.0.9:42422",job="istio-mesh",permissive_response_code="none",permissive_response_policyid="none",reporter="source",request_protocol="http",response_code="555",source_app="productpage",source_principal="unknown",source_version="v1",source_workload="productpage-v1",source_workload_namespace="bookinfo"}
istio_requests_total{connection_security_policy="unknown",destination_app="reviews",destination_principal="unknown",destination_service="details.bookinfo.svc.cluster.local",destination_service_name="details",destination_service_namespace="bookinfo",destination_version="v3",destination_workload="reviews-v3",destination_workload_namespace="bookinfo",instance="172.17.0.9:42422",job="istio-mesh",permissive_response_code="none",permissive_response_policyid="none",reporter="source",request_protocol="http",response_code="555",source_app="productpage",source_principal="unknown",source_version="v1",source_workload="productpage-v1",source_workload_namespace="bookinfo"}
istio_requests_total{connection_security_policy="unknown",destination_app="unknown",destination_principal="unknown",destination_service="details.bookinfo.svc.cluster.local",destination_service_name="details",destination_service_namespace="bookinfo",destination_version="unknown",destination_workload="unknown",destination_workload_namespace="unknown",instance="172.17.0.9:42422",job="istio-mesh",permissive_response_code="none",permissive_response_policyid="none",reporter="source",request_protocol="http",response_code="555",source_app="productpage",source_principal="unknown",source_version="v1",source_workload="productpage-v1",source_workload_namespace="bookinfo"}
Expected behavior
The telemetry should only have the “unknown” timeseries that you see above in the Prometheus screenshot:
{destination_app="unknown",destination_service_name="details",destination_workload="unknown",reporter="source"}
The other three that show a destination_service of “details” but destination_workload of “reviews-v[123]” are incorrect and should not be there at all.
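The expectation above can be turned into a quick consistency check. The following is a minimal Python sketch, not part of Istio or Kiali: it parses series lines in the text form shown above and flags any series whose destination_workload is known but does not belong to the destination_service_name. The helper names (parse_labels, is_inconsistent) are hypothetical, and the prefix heuristic assumes bookinfo-style naming where workloads are named <service>-<version>.

```python
import re

# Matches label="value" pairs inside the braces of a Prometheus series line.
# Escaped quotes inside values are not handled; the data above has none.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_labels(series: str) -> dict:
    """Return the label set of a series line such as metric{a="1",b="2"}."""
    return dict(LABEL_RE.findall(series))


def is_inconsistent(labels: dict) -> bool:
    """Flag a series whose known destination workload does not belong to the
    destination service (assumes workloads are named <service>-<version>)."""
    service = labels.get("destination_service_name", "unknown")
    workload = labels.get("destination_workload", "unknown")
    if "unknown" in (service, workload):
        # Client-side aborts never reach a workload, so "unknown" is expected.
        return False
    return not workload.startswith(service + "-")


if __name__ == "__main__":
    samples = [
        'istio_requests_total{destination_service_name="details",'
        'destination_workload="reviews-v1",response_code="555"}',
        'istio_requests_total{destination_service_name="details",'
        'destination_workload="unknown",response_code="555"}',
    ]
    for line in samples:
        labels = parse_labels(line)
        print(labels["destination_workload"], is_inconsistent(labels))
```

Run against the four timeseries listed earlier, a check like this would flag the three reviews-v[123] series and accept the “unknown” one.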
Steps to reproduce the bug
- Install Istio 1.1-snapshot.4
- Install bookinfo demo
- kubectl create -f the yaml file I show above. It is called bookinfo-details-abort.yaml
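Once the fault rule is applied and some traffic has been sent, the Prometheus query used in this report can also be issued over the Prometheus HTTP API instead of the UI. The sketch below only builds the instant-query URL for the /api/v1/query endpoint. The base URL is an assumption (for example, a local port-forward to the Prometheus service), and build_query_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumption: Prometheus is reachable here, e.g. through a local port-forward.
PROMETHEUS_BASE_URL = "http://localhost:9090"


def build_query_url(base_url: str, source_app: str, response_code: str) -> str:
    """Build an instant-query URL for the Prometheus /api/v1/query endpoint."""
    promql = (
        f'istio_requests_total{{source_app="{source_app}",'
        f'response_code="{response_code}"}}'
    )
    # urlencode percent-escapes the braces and quotes in the PromQL selector.
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


if __name__ == "__main__":
    print(build_query_url(PROMETHEUS_BASE_URL, "productpage", "555"))
```

The resulting URL can be fetched with any HTTP client; the response is JSON whose result entries carry the same label sets shown in the timeseries above.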
Version
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:46:06Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2019-01-16T18:55:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
$ istioctl version
version.BuildInfo{Version:"1.1.0-snapshot.4", GitRevision:"e661de08e04e78e60f7fbec067e4357974fa27c3", User:"root", Host:"b44b0137-027e-11e9-a4d9-0a580a2c0c84", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean"}
Installation
Using Helm
Environment
RHEL 7, OpenShift 3.11
About this issue
- State: closed
- Created 5 years ago
- Comments: 23 (22 by maintainers)
Commits related to this issue
- Add an e2e test to validate fault injection telemetry. This attempts to provide validation of telemetry for FI to guard against recurrence of issues such as: https://github.com/istio/istio/issues/111... — committed to douglas-reid/istio by douglas-reid 5 years ago
- Add an e2e test to validate fault injection telemetry. (#11773) * Add an e2e test to validate fault injection telemetry. This attempts to provide validation of telemetry for FI to guard against r... — committed to istio/istio by douglas-reid 5 years ago
- Sync with 1.1 (#12201) * Fix routing when DNS is resolved (#11522) The DNSDomain variable needs to be enhanced to include more then one DNS entry. Change DNSDomain to DNSDomains as a meta and add... — committed to istio/istio by pitlv2109 5 years ago
- Add an e2e test to validate fault injection telemetry. (#11773) * Add an e2e test to validate fault injection telemetry. This attempts to provide validation of telemetry for FI to guard against r... — committed to angelokurtis/istio-bookinfo by douglas-reid 5 years ago
@kcatstack you have to turn on debug logging for the right scopes in Mixer to see that log. I believe the scope is api, but it may be attributes.
The source.ip encoding issue was fixed. I don’t know which release that fix was in, but it should be fixed.
Looks good running on Istio 1.1 Snapshot 6