istio: Investigate authorization policy blocking prometheus scraping metrics at port 15090

Bug description: Copied from a customer report on Slack (Chad Wilson from ThoughtWorks):

We just moved to the new AuthorizationPolicy from the old ClusterRbacConfig/ServiceRole etc. We have mTLS enforced everywhere and a deny-all type of policy for both. One weird thing we have found is that under the new policy, Prometheus scrapes of our pods on a non-service port (configured by prometheus.io annotations) and scrapes of the Envoy metrics port 15090 are now blocked by the AuthorizationPolicy, where they were not before. However, we don’t quite understand how this happened (why it even worked before under the v1alpha1 API) or what the recommended approach is. Anyone got any pointers? We worked around it for now with:

  - to:
    - operation:
        ports:
        - "9080"
        - "15090"

…but this is not entirely desirable, since it basically allows access from anywhere to those ports. Trying to restrict it by service identity, paths, etc. doesn’t seem to work; I guess because the source workload (Prometheus) is non-mesh?
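
For context, the workaround fragment above would sit inside a complete policy roughly like the following. This is a sketch only: the policy name, namespace, and app label are hypothetical placeholders, and only the two port numbers come from the report.

```yaml
# Sketch only: name, namespace and selector label are hypothetical placeholders.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-metrics-ports
  namespace: my-namespace
spec:
  selector:
    matchLabels:
      app: my-app
  rules:
  # No "from" clause, so any source (mesh or non-mesh) may reach these ports.
  - to:
    - operation:
        ports:
        - "9080"   # application metrics port from the report
        - "15090"  # Envoy metrics port
```

Because the rule has no source restriction, it shows the trade-off described in the report: the ports are open to every caller that can reach the pod.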

Thanks for the clarifications (and all the great work on the new authz - feedback from myself and my team is that the new policy is easier to follow and work with, which is great). The change in behaviour makes sense with the switch from service-based to workload-based policy, as we do not declare our monitoring ports in services, only the main app traffic port.

Am I correct in inferring this puts it out of step with the mTLS config? We enforce mTLS everywhere else in our cluster, but the proxy isn’t enforcing this on the non-service monitoring ports (which is OK for us, but now seems a bit inconsistent with authZ). As far as I’m aware, we’re not mounting the required Istio certs into Prometheus, and aren’t using the Prometheus scrape config for ‘secure’ pods currently.

I understand that restricting the traffic based on source namespace or principal may not be possible without an Envoy around Prometheus itself. However, is there a way to restrict the target by HTTP-level attributes? Only port seems to work, i.e. it seems the workload proxy doesn’t know the traffic is HTTP, so it blocks traffic if I add path or method restrictions. I’m pretty sure 15090 was also being blocked, because all of the ‘envoy-stats’ scrapes were failing until I added the port as allowed. I’ll try to export a proxy-config to see what’s going on a bit later.

I believe the only ‘special’ setup we have is deploying Prometheus with the stable operator chart and our own scrape configs in the operator values file into its own namespace, rather than with the Istio Helm chart.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 19 (8 by maintainers)

Most upvoted comments

@shadjac I’m not aware of any direct option to resolve this; however, I am guessing that you have at least two possible options now:

  1. Move the Spring Boot Actuator endpoints to a different port that you are comfortable exposing without RBAC or mTLS, via management.server.port=9080, which lets you keep RBAC+mTLS on your main app port.
  2. Try out Istio 1.6+'s Prometheus metrics merging (never tried this personally), which I believe will allow you to keep scraping without TLS via the proxy’s status port, as long as you use prometheus.io annotations rather than (Pod|Service)Monitors.
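
For option 2, the prometheus.io annotations in question look roughly like this on a pod. This is a sketch under assumptions: the pod name, image, port, and path are placeholders (the path assumes Spring Boot Actuator's Prometheus endpoint), and whether merging is on by default depends on the Istio version and mesh configuration.

```yaml
# Sketch only: name, image, port and path are hypothetical placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9080"                 # app metrics port (assumed)
    prometheus.io/path: "/actuator/prometheus" # assumed Actuator endpoint
spec:
  containers:
  - name: app
    image: example.com/my-app:latest
```

With merging enabled, Istio's documentation describes the sidecar injector rewriting these annotations so that Prometheus scrapes the merged application and Envoy metrics from the sidecar agent on port 15020 instead of the app port.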

and in our case it has no sidecar proxy; it’s not part of the Mesh - and we haven’t attempted to make our Prometheus be able to do mTLS with Mesh-proxied-services

That’s a good call. So there is only one possibility: the iptables capture.

@chadlwilson

  1. Could you paste your iptables rule?
  2. Are you using istio-cni? I realized that 15090 is not in the exclude list when istio-cni is used, which would lead to your case. This was fixed a few days ago.

I can reproduce this with a fresh install of 1.4.7. The following are the iptables rules for httpbin:

$ k logs -f httpbin-654c6cbbb9-48rqv -c istio-init 
Environment:
------------
ENVOY_PORT=
INBOUND_CAPTURE_PORT=
ISTIO_INBOUND_INTERCEPTION_MODE=
ISTIO_INBOUND_TPROXY_MARK=
ISTIO_INBOUND_TPROXY_ROUTE_TABLE=
ISTIO_INBOUND_PORTS=
ISTIO_LOCAL_EXCLUDE_PORTS=
ISTIO_SERVICE_CIDR=
ISTIO_SERVICE_EXCLUDE_CIDR=

Variables:
----------
PROXY_PORT=15001
PROXY_INBOUND_CAPTURE_PORT=15006
PROXY_UID=1337
PROXY_GID=1337
INBOUND_INTERCEPTION_MODE=REDIRECT
INBOUND_TPROXY_MARK=1337
INBOUND_TPROXY_ROUTE_TABLE=133
INBOUND_PORTS_INCLUDE=*
INBOUND_PORTS_EXCLUDE=15020
OUTBOUND_IP_RANGES_INCLUDE=*
OUTBOUND_IP_RANGES_EXCLUDE=
OUTBOUND_PORTS_EXCLUDE=
KUBEVIRT_INTERFACES=
ENABLE_INBOUND_IPV6=

+ iptables -t nat -N ISTIO_REDIRECT
+ iptables -t nat -A ISTIO_REDIRECT -p tcp -j REDIRECT --to-port 15001
+ iptables -t nat -N ISTIO_IN_REDIRECT
+ '[' '*' == '*' ']'
+ iptables -t nat -A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-port 15006
+ '[' -n '*' ']'
+ '[' REDIRECT = TPROXY ']'
+ table=nat
+ iptables -t nat -N ISTIO_INBOUND
+ iptables -t nat -A PREROUTING -p tcp -j ISTIO_INBOUND
+ '[' '*' == '*' ']'
+ iptables -t nat -A ISTIO_INBOUND -p tcp --dport 22 -j RETURN
+ '[' -n 15020 ']'
+ for port in ${INBOUND_PORTS_EXCLUDE}
+ iptables -t nat -A ISTIO_INBOUND -p tcp --dport 15020 -j RETURN
+ '[' REDIRECT = TPROXY ']'
+ iptables -t nat -A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
+ iptables -t nat -N ISTIO_OUTPUT
+ iptables -t nat -A OUTPUT -p tcp -j ISTIO_OUTPUT
+ '[' -n '' ']'
+ iptables -t nat -A ISTIO_OUTPUT -o lo -s 127.0.0.6/32 -j RETURN
+ '[' -z '' ']'
+ iptables -t nat -A ISTIO_OUTPUT -o lo '!' -d 127.0.0.1/32 -j ISTIO_IN_REDIRECT
+ for uid in ${PROXY_UID}
+ iptables -t nat -A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
+ for gid in ${PROXY_GID}
+ iptables -t nat -A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
+ iptables -t nat -A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
+ '[' 0 -gt 0 ']'
+ '[' 1 -gt 0 ']'
+ '[' '*' == '*' ']'
+ iptables -t nat -A ISTIO_OUTPUT -j ISTIO_REDIRECT
+ set +o nounset
+ '[' -n '' ']'
+ ip6tables -F INPUT
+ ip6tables -A INPUT -m state --state ESTABLISHED -j ACCEPT
+ ip6tables -A INPUT -i lo -d ::1 -j ACCEPT
+ ip6tables -A INPUT -j REJECT
+ dump
+ iptables-save
# Generated by iptables-save v1.6.1 on Thu Apr 23 01:06:14 2020
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:ISTIO_INBOUND - [0:0]
:ISTIO_IN_REDIRECT - [0:0]
:ISTIO_OUTPUT - [0:0]
:ISTIO_REDIRECT - [0:0]
-A PREROUTING -p tcp -j ISTIO_INBOUND
-A OUTPUT -p tcp -j ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp -m tcp --dport 22 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15020 -j RETURN
-A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
-A ISTIO_OUTPUT -s 127.0.0.6/32 -o lo -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -j ISTIO_REDIRECT
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
COMMIT
# Completed on Thu Apr 23 01:06:14 2020
+ ip6tables-save
# Generated by ip6tables-save v1.6.1 on Thu Apr 23 01:06:14 2020
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED -j ACCEPT
-A INPUT -d ::1/128 -i lo -j ACCEPT
-A INPUT -j REJECT --reject-with icmp6-port-unreachable
COMMIT
# Completed on Thu Apr 23 01:06:14 2020
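
In the output above, INBOUND_PORTS_EXCLUDE contains only 15020, so inbound traffic to 15090 falls through to ISTIO_IN_REDIRECT and is sent to the inbound listener on 15006, where the authorization policy applies. One possible workaround on affected versions, not confirmed in this thread, is to exclude the port from inbound capture with the traffic.sidecar.istio.io/excludeInboundPorts pod annotation. A sketch, with a hypothetical pod name and image:

```yaml
# Sketch only: pod name and image are hypothetical placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  annotations:
    # Ask the sidecar injector to skip iptables redirection for this inbound port.
    traffic.sidecar.istio.io/excludeInboundPorts: "15090"
spec:
  containers:
  - name: app
    image: example.com/my-app:latest
```

Ports excluded this way bypass the sidecar's inbound listener, so neither mTLS nor AuthorizationPolicy is enforced on them.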

I can confirm that this is an issue. For us, the only remedy was:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: my-namespace
spec:
  {}

---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: my-app
  namespace: my-namespace
spec:
  selector:
    matchLabels:
      app: my-app
  rules:
  - from:
    - source:
        namespaces: ["prometheus-namespace"]
    to:
    - operation:
        methods: ["GET"]
        paths: ["/metrics"]
  - to:
    - operation:
        ports: ["15090"]

Notice the difference between the app-specific Prometheus rule (which works with a namespaces source plus methods and paths in its operation) and the Envoy-proxy-specific rule (which must carry only a single to rule with a port in its operation; otherwise it is discarded).

This is a sample access log entry for a request from the operator:

{"bytes_sent":"31863","upstream_cluster":"InboundPassthroughClusterIpv4","downstream_remote_address":"10.0.129.14:43418","authority":"-","path":"-","protocol":"-","upstream_service_time":"-","upstream_local_address":"127.0.0.6:55781","duration":"21","downstream_local_address":"10.0.152.1:15090","upstream_transport_failure_reason":"-","route_name":"-","response_code":"0","user_agent":"-","response_flags":"-","start_time":"2020-02-04T14:21:56.091Z","method":"-","request_id":"-","upstream_host":"10.0.152.1:15090","x_forwarded_for":"-","requested_server_name":"-","bytes_received":"95","istio_policy_status":"-"}

The Prometheus operator is running outside of the Istio mesh in this case.