istio: Investigate authorization policy blocking prometheus scraping metrics at port 15090
Bug description: Copied from a customer report (Chad Wilson from ThoughtWorks) on Slack:
We just moved to the new AuthorizationPolicy from the old ClusterRbacConfig/ServiceRole etc. We have mTLS enforced everywhere and a deny-all type of policy for both. One weird thing we have found is that under the new policy, Prometheus scrapes of our pods on a non-service port (configured by `prometheus.io` annotations) and scrapes of the Envoy metrics port 15090 are now blocked by the AuthorizationPolicy where they were not before. However, we don't quite understand how this happened (why it even worked before under the v1alpha1 API), or what the recommended approach is. Anyone got any pointers? We worked around it for now with
```yaml
- to:
  - operation:
      ports:
      - "9080"
      - "15090"
```
…but that is not entirely desirable, since it basically allows all access from anywhere to those ports. Trying to restrict it by service identity, paths etc. doesn't seem to work; I guess because the source workload (Prometheus) is non-mesh?
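For context, the workaround fragment above would sit inside a complete AuthorizationPolicy roughly like this (the policy name, namespace, and selector label are illustrative, not taken from the report):

```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-metrics-scrape   # illustrative name
  namespace: my-app            # illustrative namespace
spec:
  selector:
    matchLabels:
      app: my-app              # illustrative workload label
  action: ALLOW
  rules:
  # Allows any client, mesh or not, to reach these ports on the workload
  - to:
    - operation:
        ports:
        - "9080"
        - "15090"
```

Because the rule has no `from` clause, it matches traffic from any source, which is exactly the over-broad access the reporter is uneasy about.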
Thanks for the clarifications (and all the great work on the new authz). Feedback from myself and my team is that the new policy is easier to follow and work with, which is great. The change in behaviour makes sense with the switch from service-based to workload-based matching, as we do not declare our monitoring ports in Services, only the main app traffic port. Am I correct in inferring this puts it out of step with the mTLS config? We enforce mTLS everywhere else in our cluster, but the proxy isn't enforcing this on the non-service monitoring ports (which is OK for us, but now seems a bit inconsistent with authz). As far as I'm aware we're not mounting the required Istio certs into Prometheus, and aren't using the Prometheus scrape config for 'secure' pods currently.

I understand that restricting the traffic based on source namespace or principal may not be possible without an Envoy sidecar around Prometheus itself. However, is there a way to restrict the target by HTTP-level attributes? Only port seems to work, i.e. it seems the workload proxy doesn't know the traffic is HTTP, so it blocks traffic if I add path or method restrictions. I'm pretty sure 15090 was also being blocked, because all of the 'envoy-stats' scrapes were failing until I added the port as allowed. I'll try to export a proxy-config to see what's going on a bit later.
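To make the failure mode concrete, the kind of HTTP-attribute restriction that did not work for the reporter would look something like this sketch (the path is Envoy's well-known Prometheus stats path; the rest mirrors the earlier workaround):

```yaml
- to:
  - operation:
      ports:
      - "15090"
      # HTTP-level attribute: per the report, adding paths (or methods)
      # causes the traffic to be blocked, presumably because the proxy
      # treats this non-service port as opaque TCP rather than HTTP
      paths:
      - "/stats/prometheus"
```

On a port the proxy parses as plain TCP, only connection-level attributes such as `ports` can match, so adding `paths` or `methods` turns the rule into one that never matches and the deny-all policy wins.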
I believe the only 'special' setup we have is deploying Prometheus with the stable operator chart, with our own scrape configs in the operator values file, into its own namespace, rather than with the Istio Helm chart.
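For readers unfamiliar with the annotation-driven scraping mentioned above, a typical scrape job keyed off `prometheus.io` pod annotations looks roughly like this. This is the common community relabeling pattern, not a copy of the reporter's actual values file:

```yaml
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Only scrape pods annotated prometheus.io/scrape: "true"
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"
  # Honour a custom metrics path from prometheus.io/path
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  # Rewrite the target address to the prometheus.io/port annotation;
  # this is how scrapes end up on a port not declared in any Service
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
```

The last rule is the crux of the issue: the scrape target is whatever port the annotation names, regardless of whether the Service (and therefore the old service-based RBAC) knows about it.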
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 19 (8 by maintainers)
@shadjac I'm not aware of any direct option to resolve this; however, I am guessing that you have at least two possible options now:

- Serve your metrics on your main app port (e.g. `management.server.port=9080`), allowing you to keep RBAC+mTLS on your main app port.
- Use `prometheus.io` annotations rather than `(Pod|Service)Monitor`s.
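If the first option applies to a Spring Boot app (which the `management.server.port` property suggests), the change is a one-line configuration tweak; `9080` here is simply the main app port from the report:

```yaml
# application.yml: serve actuator/metrics endpoints on the main app port,
# which is declared in the Service and therefore covered by RBAC+mTLS
management:
  server:
    port: 9080
```

With the metrics endpoint folded into the service port, the workload-level AuthorizationPolicy can match it with `source`-based rules like any other app traffic.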
I can reproduce this with a fresh install of 1.4.7; the following are the iptables rules for httpbin:
I can confirm that this is an issue; for us the only remedy was:
Notice the difference between the app-specific Prometheus rule (which works with a `source` and a `namespaces` target in its `operation`) and the Envoy-proxy-specific rules (which must carry only a single `to` rule with a `port` in `operation`, otherwise it is discarded). This is a sample request from the operator:
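Read together, the two rule shapes being contrasted would look something like this sketch (the namespace and port values are illustrative, not taken from the commenter's actual policy):

```yaml
rules:
# App-specific Prometheus rule: a source-namespace restriction
# combined with an operation works on the app port
- from:
  - source:
      namespaces: ["monitoring"]   # illustrative namespace
  to:
  - operation:
      ports: ["9080"]
# Envoy proxy metrics rule: reportedly only a bare port in a
# single `to` rule is honoured; extra attributes get it discarded
- to:
  - operation:
      ports: ["15090"]
```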
The Prometheus operator is running outside of the Istio mesh in this case.