vector: CPU increase and memory leak
A note for the community
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Problem
Hello Vector team, we see a CPU increase and a memory leak in only one of the agent pods. We run the agents as a DaemonSet (one Vector pod per node) and the aggregator as a StatefulSet. The agents send the Kubernetes logs to the aggregator. Is this a bug, or did we configure something wrong?
Configuration
affinity: {}
args:
  - -w
  - --config-dir
  - /etc/vector/
  - --log-format
  - json
  - -vv
autoscaling:
  behavior: {}
  customMetric: {}
  enabled: false
  maxReplicas: 10
  minReplicas: 1
  targetCPUUtilizationPercentage: 80
command: []
commonLabels: {}
containerPorts: []
customConfig:
  api:
    enabled: false
  data_dir: /vector-data-dir
  sinks:
    prom_exporter:
      address: 0.0.0.0:9598
      inputs:
        - internal_metrics
      type: prometheus_exporter
    splunk_hec_dev_test:
      buffer:
        type: memory
      compression: gzip
      default_token: cf69e945-a1b2-a1b2-a1b2-be6777bea1b2
      encoding:
        codec: json
      endpoint: https://hec.a1b2.a234.log.cde.net.abc:443
      index: ugw_statistics
      inputs:
        - transform_remap_gateway_proxy_accesslog_test
      type: splunk_hec_logs
    vector_aggregator:
      address: abc-efgh-vector-aggregator.vector-aggregator.svc.cluster.local:7500
      inputs:
        - kubernetes_logs
      type: vector
  sources:
    internal_metrics:
      scrape_interval_secs: 2
      type: internal_metrics
    kubernetes_logs:
      glob_minimum_cooldown_ms: 1000
      max_read_bytes: 8192
      type: kubernetes_logs
  transforms:
    transform_remap_gateway_proxy_accesslog:
      inputs:
        - transform_route_gateway_proxy.access_log
      source: |
        ., err = parse_json(.message)
        if (err != null) {
          log("Remap gateway-proxy access_log, Unable to parse json: " + err, "error")
          abort
        }
        if (!exists(.targetLog) || .targetLog != "ABC-API-INSIGHTS-ACCESS-LOG") {
          abort
        }
        del(.targetLog)
        # verify insightsFields is not empty
        if (!exists(.insightsFields) || .insightsFields == null || .insightsFields == "") {
          log("Remap gateway-proxy access_log, insightsFields field is missing.")
          abort
        }
        result, err = merge(., .insightsFields)
        if (err != null) {
          log("Remap gateway-proxy access_log, Unable to merge insightsFields: " + err, "error")
        } else {
          . = result
        }
        del(.insightsFields)
        if (!exists(.insightsProfile) || .insightsProfile == null || .insightsProfile == "") {
          log("Remap gateway-proxy access_log, insightsProfile field is missing.")
          abort
        }
        path_parts, err = parse_regex(.path, r'(?P<pathname>[^?#]*)(?P<query>.*)')
        if (err != null) {
          log("Remap gateway-proxy access_log, Unable to parse regex: " + err, "error")
          abort
        }
        .path = path_parts.pathname
        if (.serviceTime != null) {
          .serviceTime = to_int!(.serviceTime)
        }
        if (.jwtZid != null) {
          .spcTenantId = .jwtZid
        }
        del(.jwtZid)
        if (.abcaaTenantId != null) {
          .spcTenantId = .abcaaTenantId
        }
        del(.abcaaTenantId)
        if (.jwtValStatus == null || .jwtSkipFailureReporting == true) {
          del(.jwtValStatus)
        }
        del(.jwtSkipFailureReporting)
        if (.rateLimitLimit == null) {
          del(.rateLimitLimit)
        }
        if (.rateLimitRemaining == null) {
          del(.rateLimitRemaining)
        }
        if (.rateLimitReset == null) {
          del(.rateLimitReset)
        }
        if (.rateLimitPolicy == null) {
          del(.rateLimitPolicy)
        }
      type: remap
    transform_remap_gateway_proxy_accesslog_all:
      inputs:
        - transform_remap_gateway_proxy_accesslog
      source: |
        if (.insightsProfile == "vector-e2e-test") {
          abort
        }
      type: remap
    transform_remap_gateway_proxy_accesslog_test:
      inputs:
        - transform_remap_gateway_proxy_accesslog
      source: |
        if (.insightsProfile != "vector-e2e-test") {
          abort
        }
        del(.insightsProfile)
      type: remap
    transform_route_gateway_proxy:
      inputs:
        - kubernetes_logs
      route:
        '*': .kubernetes.container_name == "gateway-proxy"
        access_log: starts_with(string!(.message), "{") && ends_with(string!(.message), "}")
      type: route
dataDir: ""
defaultVolumeMounts:
  - mountPath: /var/log/
    name: var-log
    readOnly: true
  - mountPath: /var/lib
    name: var-lib
    readOnly: true
defaultVolumes:
  - hostPath:
      path: /var/log/
    name: var-log
  - hostPath:
      path: /var/lib/
    name: var-lib
dnsConfig: {}
dnsPolicy: ClusterFirst
env: []
envFrom: []
existingConfigMaps: []
extraContainers: []
extraVolumeMounts: []
extraVolumes:
  - name: secret-volume
    secret:
      secretName: vector-values
fullnameOverride: ""
global: {}
haproxy:
  affinity: {}
  autoscaling:
    customMetric: {}
    enabled: false
    maxReplicas: 10
    minReplicas: 1
    targetCPUUtilizationPercentage: 80
  containerPorts: []
  customConfig: ""
  enabled: false
  existingConfigMap: ""
  extraContainers: []
  extraVolumeMounts: []
  extraVolumes: []
  image:
    pullPolicy: IfNotPresent
    pullSecrets: []
    repository: haproxytech/haproxy-alpine
    tag: 2.6.12
  initContainers: []
  livenessProbe:
    tcpSocket:
      port: 1024
  nodeSelector: {}
  podAnnotations: {}
  podLabels: {}
  podPriorityClassName: ""
  podSecurityContext: {}
  readinessProbe:
    tcpSocket:
      port: 1024
  replicas: 1
  resources: {}
  rollWorkload: true
  securityContext: {}
  service:
    annotations: {}
    externalTrafficPolicy: ""
    ipFamilies: []
    ipFamilyPolicy: ""
    loadBalancerIP: ""
    ports: []
    topologyKeys: []
    type: ClusterIP
  serviceAccount:
    annotations: {}
    automountToken: true
    create: true
  strategy: {}
  terminationGracePeriodSeconds: 60
  tolerations: []
image:
  pullPolicy: IfNotPresent
  pullSecrets:
    - name: vector
  repository: build-releases-external.common.cdn.repositories.cloud.abc/timberio/vector
  tag: 0.32.1-distroless-libc
ingress:
  annotations: {}
  className: ""
  enabled: false
  hosts: []
  tls: []
initContainers: []
lifecycle: {}
livenessProbe: {}
logLevel: info
minReadySeconds: 0
nameOverride: ""
nodeSelector: {}
persistence:
  accessModes:
    - ReadWriteOnce
  enabled: false
  existingClaim: ""
  finalizers:
    - kubernetes.io/pvc-protection
  hostPath:
    enabled: true
    path: /var/lib/vector
  selectors: {}
  size: 10Gi
podAnnotations: {}
podDisruptionBudget:
  enabled: false
  minAvailable: 1
podHostNetwork: false
podLabels:
  sidecar.istio.io/inject: "true"
  vector.dev/exclude: "true"
podManagementPolicy: OrderedReady
podMonitor:
  additionalLabels: {}
  enabled: false
  honorLabels: false
  honorTimestamps: true
  jobLabel: app.kubernetes.io/name
  metricRelabelings: []
  path: /metrics
  port: prom-exporter
  relabelings: []
podPriorityClassName: ""
podSecurityContext: {}
psp:
  create: false
rbac:
  create: true
readinessProbe: {}
replicas: 1
resources:
  limits:
    cpu: 1000m
    memory: 2Gi
  requests:
    cpu: 10m
    memory: 128Mi
role: Agent
rollWorkload: false
secrets:
  generic: {}
securityContext: {}
service:
  annotations: {}
  enabled: true
  externalTrafficPolicy: ""
  ipFamilies: []
  ipFamilyPolicy: ""
  loadBalancerIP: ""
  ports: []
  topologyKeys: []
  type: ClusterIP
serviceAccount:
  annotations: {}
  automountToken: true
  create: true
serviceHeadless:
  enabled: true
terminationGracePeriodSeconds: 60
tolerations:
  - effect: NoSchedule
    key: WorkGroup
    operator: Equal
    value: abproxy
  - effect: NoExecute
    key: WorkGroup
    operator: Equal
    value: abproxy
topologySpreadConstraints: []
ugwSecretConfigEnabled: false
updateStrategy: {}
workloadResourceAnnotations: {}
Version
0.32.1-distroless-libc
Debug Output
No response
Example Data
No response
Additional Context
No response
References
No response
About this issue
- Original URL
- State: open
- Created 8 months ago
- Reactions: 6
- Comments: 21 (9 by maintainers)
Thanks, yes, an upgrade is certainly an option, we’ll try it
No problem! The input of the transform is internal_metrics, and prometheus_exporter consumes it. You are correct that at this point the metric is already collected, and I believe you are correct that dropping it after internal_metrics does not lead to a reduction in Vector's memory. The only reason I did this is to limit the cardinality of these metrics stored in Prometheus, after the metrics have been scraped from the vector pod.

I did some investigation into this, and my findings agree with @jszwedko's comment above. I can also confirm that setting expire_metrics_secs fixed it for me in v0.31 (and presumably other versions below 0.35).

In our case, a few nodes in the cluster had much heavier pod churn than the rest, and the vector agents we noticed with significantly increasing CPU and memory usage were exclusively on those nodes. One thing I noticed with internal_metrics is that the vector_component_received_*_total metrics include pod_name as a label. When the vector agent is on a node with heavy pod churn, the cardinality of this metric grows as every new pod is created.

I expected this might cause extra load on Prometheus, but not on Vector itself. However, it appears Vector's default behavior (at least in v0.31) is to keep emitting metrics for every pod that has ever existed on its node. I could see this reflected in the rate at which events were sent from internal_metrics to the pod's prometheus_exporter server: rather than being constant, this rate was increasing. I believe this is the cause of the memory (and CPU) increasing over time; the amount of data sent from the internal_metrics source increases with every new pod creation, but never decreases with an old pod's deletion.

To fix it, I set the (by default unset) global option expire_metrics_secs, as suggested by @jszwedko. I also created a transform that drops the pod_name tag from the metrics that carry it (and an analogous transform for the metrics which have one file tag per pod). I did this because I am not currently using the pod_name label at all in my dashboards or alerts. This is not necessary for controlling CPU and memory growth once expire_metrics_secs is enabled, but I had no use for that label and it helps reduce the load on Prometheus.

Mem utilization before and after:
CPU utilization before and after:
(In both of these graphs, the very steep line before the fix is the node with very high pod churn)
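The thread does not include the commenter's actual transform, but a minimal sketch of the tag-dropping approach described above could look like this in the agent's customConfig. The transform name is illustrative, not from the thread; in a remap transform, metric tags are addressed under .tags:

transforms:
  transform_drop_pod_name_tag:
    type: remap
    inputs:
      - internal_metrics
    source: |
      # For metric events, tags are exposed under .tags in VRL.
      # Drop the per-pod and per-file labels that are not used downstream.
      del(.tags.pod_name)
      del(.tags.file)
sinks:
  prom_exporter:
    address: 0.0.0.0:9598
    type: prometheus_exporter
    inputs:
      - transform_drop_pod_name_tag

This only limits what the prometheus_exporter sink publishes; as noted above, it does not by itself stop Vector's own memory growth, which is what expire_metrics_secs addresses.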
Do you see the cardinality of the metrics exposed by the prometheus_exporter sink growing over time? That's one hunch I would have: that the sink is receiving ever more metric series. This could be due to some components publishing telemetry with unbounded cardinality (like the file source, which tagged internal metrics with a file tag). In v0.35.0, these "high cardinality" tags were removed or changed to be opt-in. Could you try that version?

There is also https://vector.dev/docs/reference/configuration/global-options/#expire_metrics_secs, which you can use to expire stale metric contexts.
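Since expire_metrics_secs is a global (top-level) Vector option, in the Helm values shown above it would sit directly under customConfig. A minimal sketch; the 300-second value is an arbitrary example, not a recommendation from the thread:

customConfig:
  # Global option (unset by default): drop internal metric series that
  # have not been updated for this many seconds.
  expire_metrics_secs: 300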
Hello @pront, we upgraded to version 0.34.1. We see an improvement, but we can still see memory and CPU growth.
Seems like the upgrade to 0.34.1 resolved the issue with the memory leak & CPU increase.
We upgraded to version 0.34.0, but we still see CPU and memory growth.
Hmm, I keyed on the Prometheus component in the config above and misread it as a source (I'm so used to reading sources first). I see you don't have any metric sources other than internal_metrics, which should have effectively fixed cardinality for its metrics, so that was a red herring. However, there were a couple of buffer-related memory leaks that were addressed between versions 0.32.0 and 0.34.0. Are you able to upgrade to 0.34.0?
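For reference, with the Helm values shown above the upgrade itself is just a change to the image tag, assuming the internal mirror keeps the upstream timberio/vector tag names:

image:
  pullPolicy: IfNotPresent
  pullSecrets:
    - name: vector
  repository: build-releases-external.common.cdn.repositories.cloud.abc/timberio/vector
  tag: 0.34.0-distroless-libc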