envoy: ThriftProxy extension is not generating correct prometheus tags

I have the following config (based on envoy-demo.yaml) which has two listeners. One for http proxying and one for thrift proxying.

Envoy Config
admin:
address:
  socket_address:
    protocol: TCP
    address: 0.0.0.0
    port_value: 9901
static_resources:
listeners:
- name: listener_0
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 10000
  filter_chains:
  - filters:
    - name: envoy.filters.network.http_connection_manager
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
        stat_prefix: this_is_a_stats_prefix
        access_log:
        - name: envoy.access_loggers.stdout
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
        route_config:
          name: local_route
          virtual_hosts:
          - name: local_service
            domains: ["*"]
            routes:
            - match:
                prefix: "/"
              route:
                host_rewrite_literal: www.envoyproxy.io
                cluster: service_envoyproxy_io
        http_filters:
        - name: envoy.filters.http.router
- name: listener_1
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9090
  filter_chains:
  - filters:
      - name: envoy.filters.network.thrift_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.thrift_proxy.v3.ThriftProxy
          stat_prefix: this_is_a_stats_thrift_stats_prefix
          route_config:
            name: thing_route
            routes:
              - match:
                  method_name: "" # Empty string means match all method names
                route:
                  weighted_clusters:
                    clusters:
                      - name: service_envoyproxy_io
                        weight: 100
          max_requests_per_connection: 110
          payload_passthrough: true
          thrift_filters:
            - name: envoy.filters.thrift.router
clusters:
- name: service_envoyproxy_io
  type: LOGICAL_DNS
  # Comment out the following line to test on v6 networks
  dns_lookup_family: V4_ONLY
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: service_envoyproxy_io
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: www.envoyproxy.io
              port_value: 443
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
      sni: www.envoyproxy.io

now looking at http://localhost:9901/stats/prometheus I see the following:

  • Http filter sets the stats_prefix as a tag (expected): envoy_http_downstream_cx_active{envoy_http_conn_manager_prefix="this_is_a_stats_prefix"} 0
  • Thrift filter maintains the stats_prefix in the stat name which pollutes the namespace: envoy_thrift_this_is_a_stats_thrift_stats_prefix_response_error{} 0

Are we open to fixing this? This inconvenient if you want to build generic dashboards. It raises some backwards compatibility questions but I think that it is better if we addressed these long term

A similar thing is also happening with envoy_listener_worker_1_downstream_cx_active

@rgs1 @ggreenway

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 17 (9 by maintainers)

Commits related to this issue

Most upvoted comments

I’ve thought about this some.

On the one hand, I think you could pretty easily talk me into just changing/fixing these. And if someone needs to revert, all the configuration necessary is already present, although it would be a hassle to figure out exactly what one should set it to, via stats_tag.

On the other hand, if we want to do it in a more backward-compatible way, I think we would want to add something like a version, and the user can pin their tag-extraction rules to a specific version, and the default value would be latest. Then we could maintain compatible rule-sets for each version. But I’m pretty sure we want a number/enum, not a bool, for selecting the “new” version because there will always be a newer version as changes happen.

Oops, this is not stale.