argo-events: too many open files - Error
Describe the bug
When a Sensor starts, it always throws the following error and then exits with ExitCode 1:
{"level":"info","ts":1648741781.4416764,"logger":"argo-events.sensor","caller":"cmd/start.go:73","msg":"starting sensor server","sensorName":"kafka","version":"v1.6.0"}
{"level":"info","ts":1648741781.4422603,"logger":"argo-events.sensor","caller":"metrics/metrics.go:172","msg":"starting metrics server","sensorName":"kafka"}
2022/03/31 15:49:41 too many open files
Unfortunately there is no additional information. I already looked at the nodes it runs on for file descriptor exhaustion, but everything looks good there.
When the Sensor runs on a fresh node, however, it works fine. But we can't always start fresh nodes, and the affected nodes look fine in terms of overall resource utilization.
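For anyone debugging the same symptom: besides raw fd counts, it is worth checking the per-user inotify limits on the node, since exhausting them also surfaces as "too many open files" (that turned out to be the cause here, see the end of this thread). Below is a minimal, hypothetical Go sketch, not part of argo-events, that dumps the relevant limits on a Linux node; the paths are standard procfs sysctl files:

```go
// Hypothetical diagnostic: dump the per-user inotify limits next to the
// system-wide fd ceiling on a Linux node.
package main

import (
	"fmt"
	"os"
	"strings"
)

// readSysctl returns the trimmed contents of a procfs sysctl file.
func readSysctl(path string) string {
	b, err := os.ReadFile(path)
	if err != nil {
		return fmt.Sprintf("unreadable (%v)", err)
	}
	return strings.TrimSpace(string(b))
}

func main() {
	// Per-user cap on inotify instances; each instance also consumes an fd.
	fmt.Println("fs.inotify.max_user_instances =", readSysctl("/proc/sys/fs/inotify/max_user_instances"))
	// Per-user cap on watches across all inotify instances.
	fmt.Println("fs.inotify.max_user_watches =", readSysctl("/proc/sys/fs/inotify/max_user_watches"))
	// System-wide fd ceiling, for comparison with actual fd usage.
	fmt.Println("fs.file-max =", readSysctl("/proc/sys/fs/file-max"))
}
```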
To Reproduce
Steps to reproduce the behavior:
- Start any Sensor with a kafka EventSource (I did not test whether it also happens with other sources)
Expected behavior
It starts up normally.
Environment:
- Kubernetes: v1.19.15-eks-9c63c4
- Argo: v3.2.9
- Argo Events: 1.6.0
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.
About this issue
- State: closed
- Created 2 years ago
- Reactions: 7
- Comments: 23 (9 by maintainers)
The issue still exists for us. We're running argo-events 1.8.0, argo-cd 2.6.7, and Promtail (helm chart 6.11.3, app version 2.8.2). The nodes look healthy and the issues appear randomly in Sensors and/or EventSources. From the node I get:
The issue appeared some weeks ago out of the blue and now randomly comes and goes.
We found the culprit: the privileged promtail pods (<= v3.0.3) are setting fs.inotify.max_user_instances to 128. We'll upgrade to at least v3.0.4, and then that error should be gone. Closed for now.
Thank you for your patience with us 😃
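To see how an exhausted inotify instance budget produces exactly this error message, here is a minimal Go sketch (assuming a Linux node and the golang.org/x/sys/unix package): once the per-user fs.inotify.max_user_instances limit is reached, inotify_init1 fails with EMFILE, which Go renders as "too many open files" even though overall fd usage on the node looks healthy.

```go
// Sketch: reproduce "too many open files" by exhausting the per-user
// inotify instance budget. Requires Linux and golang.org/x/sys/unix.
package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	var fds []int
	for {
		// inotify_init1 fails with EMFILE ("too many open files") once
		// fs.inotify.max_user_instances is reached for this UID -- long
		// before any system-wide fd limit is in sight.
		fd, err := unix.InotifyInit1(0)
		if err != nil {
			fmt.Printf("failed after %d instances: %v\n", len(fds), err)
			break
		}
		fds = append(fds, fd)
	}
	// Clean up so the node isn't left with the budget exhausted.
	for _, fd := range fds {
		unix.Close(fd)
	}
}
```

On a node limited to 128 instances, this should fail after roughly 128 iterations, minus whatever the same UID (e.g. promtail) already holds; after raising the limit, for example with `sysctl -w fs.inotify.max_user_instances=512`, it runs correspondingly further.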