opentelemetry-operator: TargetAllocator: error loading configuration
Hi, I’ve got these logs at startup:
{"level":"info","ts":"2023-06-05T20:17:18Z","msg":"Starting the Target Allocator"}
{"level":"info","ts":"2023-06-05T20:17:18Z","logger":"allocator","msg":"Unrecognized filter strategy; filtering disabled"}
{"level":"info","ts":"2023-06-05T20:17:18Z","logger":"allocator","msg":"Starting server..."}
{"level":"info","ts":"2023-06-05T20:17:18Z","msg":"Waiting for caches to sync for servicemonitors\n"}
{"level":"info","ts":"2023-06-05T20:17:20Z","msg":"Caches are synced for servicemonitors\n"}
{"level":"info","ts":"2023-06-05T20:17:20Z","msg":"Waiting for caches to sync for podmonitors\n"}
{"level":"info","ts":"2023-06-05T20:17:20Z","msg":"Caches are synced for podmonitors\n"}
{"level":"info","ts":"2023-06-05T20:17:20Z","logger":"allocator","msg":"Successfully started a collector pod watcher","component":"opentelemetry-targetallocator"}
{"level":"error","ts":"2023-06-05T20:17:21Z","logger":"setup","msg":"Unable to load configuration","error":"empty duration string","stacktrace":"main.main.func13\n\t/app/main.go:198\ngithub.com/oklog/run.(*Group).Run.func1\n\t/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38"}
{"level":"error","ts":"2023-06-05T20:17:21Z","logger":"setup","msg":"Unable to load configuration","error":"empty duration string","stacktrace":"main.main.func13\n\t/app/main.go:198\ngithub.com/oklog/run.(*Group).Run.func1\n\t/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38"}
{"level":"error","ts":"2023-06-05T20:17:22Z","logger":"setup","msg":"Unable to load configuration","error":"empty duration string","stacktrace":"main.main.func13\n\t/app/main.go:198\ngithub.com/oklog/run.(*Group).Run.func1\n\t/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38"}
with this configuration:
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  labels:
    app.kubernetes.io/component: opentelemetry-collector
    app.kubernetes.io/instance: opentelemetry-collector
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: opentelemetry-collector
    app.kubernetes.io/part-of: opentelemetry-collector
    app.kubernetes.io/version: 1.0.0
    argocd.argoproj.io/instance: opentelemetry-collector
    helm.sh/chart: opentelemetry-collector-1.0.0
  name: metrics
  namespace: opentelemetry
spec:
  config: |
    exporters:
      logging:
        verbosity: normal
      prometheus:
        endpoint: 0.0.0.0:9090
        metric_expiration: 180m
        resource_to_telemetry_conversion:
          enabled: true
      prometheusremotewrite/mimir:
        endpoint: http://mimir-nginx.monitoring.svc.cluster.local:80/api/v1/push
    extensions:
      basicauth/grafanacloud:
        client_auth:
          password: ${GRAFANA_CLOUD_METRICS_APIKEY}
          username: ${GRAFANA_CLOUD_METRICS_ID}
      health_check: null
      memory_ballast:
        size_in_percentage: 20
      pprof:
        endpoint: :1888
      zpages:
        endpoint: :55679
    processors:
      batch:
        send_batch_max_size: 1500
        send_batch_size: 1500
        timeout: 15s
      k8sattributes:
        extract:
          metadata:
            - k8s.namespace.name
            - k8s.pod.name
            - k8s.pod.uid
            - k8s.node.name
            - k8s.pod.start_time
            - k8s.deployment.name
            - k8s.replicaset.name
            - k8s.replicaset.uid
            - k8s.daemonset.name
            - k8s.daemonset.uid
            - k8s.job.name
            - k8s.job.uid
            - k8s.cronjob.name
            - k8s.statefulset.name
            - k8s.statefulset.uid
            - container.image.tag
            - container.image.name
        passthrough: false
        pod_association:
          - sources:
              - from: resource_attribute
                name: k8s.pod.name
      memory_limiter:
        check_interval: 5s
        limit_percentage: 90
        spike_limit_percentage: 30
      resource:
        attributes:
          - action: insert
            key: collector.name
            value: ${KUBE_POD_NAME}
    receivers:
      hostmetrics:
        collection_interval: 60s
        scrapers:
          cpu: null
          disk: null
          filesystem: null
          load: null
          memory: null
          network: null
          processes: null
      prometheus:
        config:
          global:
            evaluation_interval: 60s
            scrape_interval: 60s
            scrape_timeout: 60s
        target_allocator:
          collector_id: ${POD_NAME}
          endpoint: http://metrics-targetallocator:80
          http_sd_config:
            refresh_interval: 60s
          interval: 30s
    service:
      extensions:
        - health_check
        - memory_ballast
        - pprof
        - zpages
      pipelines:
        metrics:
          exporters:
            - logging
            - prometheus
          processors:
            - batch
            - memory_limiter
            - k8sattributes
          receivers:
            - hostmetrics
            - prometheus
      telemetry:
        logs:
          encoding: json
          level: info
        metrics:
          address: 0.0.0.0:8888
          level: detailed
  env:
    - name: K8S_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: K8S_POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: K8S_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
  envFrom:
    - secretRef:
        name: opentelemetry-datadog-credentials
    - secretRef:
        name: opentelemetry-lightstep-credentials
    - secretRef:
        name: opentelemetry-grafanacloud-credentials
  image: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.78.0
  ingress:
    route: {}
  mode: statefulset
  ports:
    - name: metrics
      port: 8888
      protocol: TCP
      targetPort: 8888
  replicas: 1
  resources:
    limits:
      memory: 3Gi
    requests:
      cpu: "1"
      memory: 2Gi
  serviceAccount: opentelemetry-collector-metrics
  targetAllocator:
    allocationStrategy: consistent-hashing
    enabled: true
    filterStrategy: relabel-config
    image: ghcr.io/open-telemetry/opentelemetry-operator/target-allocator:0.78.0
    prometheusCR:
      enabled: true
    replicas: 1
    serviceAccount: opentelemetry-collector-metrics-targetallocator
  upgradeStrategy: automatic
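For context on the error in the logs above: "empty duration string" is the message Prometheus' duration parser (model.ParseDuration in github.com/prometheus/common/model, which the Prometheus config types use for interval and timeout fields) returns when it is handed an empty value, so some duration in the scrape configuration the target allocator builds is ending up empty. A minimal sketch that reproduces just the message, for illustration only (this is not the operator's code):

```go
package main

import (
	"fmt"

	"github.com/prometheus/common/model"
)

func main() {
	// A scrape interval or timeout that was never populated reaches the
	// parser as "", which it rejects with the message seen in the logs.
	if _, err := model.ParseDuration(""); err != nil {
		fmt.Println(err) // prints: empty duration string
	}
}
```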
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 17 (5 by maintainers)
Thanks, I will try that…
@matej-g Please go ahead, I’m busy with something else at the moment.
Seeing the same after updating to the latest TA (0.78.0). I think it’s coming from the scrape time validation logic, in which we now have empty durations, since these are now coming from the Prometheus object passed to the config generator and are empty.

I think this is right @eplightning, I was about to open a PR, would you like to do it instead?
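A hedged sketch of the direction such a fix could take, i.e. defaulting any empty duration before it is parsed. The struct, helper, and the 30s/10s fallback values below are assumptions for illustration, not the actual patch:

```go
package main

import (
	"fmt"

	"github.com/prometheus/common/model"
)

// scrapeDurations stands in for the duration fields taken from the Prometheus
// object; the struct and field names here are hypothetical.
type scrapeDurations struct {
	ScrapeInterval string
	ScrapeTimeout  string
}

// applyDefaults fills empty durations with fallbacks so model.ParseDuration
// never sees an empty string. The 30s/10s values are illustrative only.
func applyDefaults(d *scrapeDurations) {
	if d.ScrapeInterval == "" {
		d.ScrapeInterval = "30s"
	}
	if d.ScrapeTimeout == "" {
		d.ScrapeTimeout = "10s"
	}
}

func main() {
	d := &scrapeDurations{} // both fields empty, as described above
	applyDefaults(d)
	interval, err := model.ParseDuration(d.ScrapeInterval)
	fmt.Println(interval, err) // 30s <nil>, instead of "empty duration string"
}
```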