earlyoom: earlyoom crash: Could not convert number: Numerical result out of range
I just noticed earlyoom did this, next time I’ll try to provide more debug info:
Out of memory! avail: 138 MiB < min: 786 MiB
Killing process 4084 kworker/u17:3
Out of memory! avail: 155 MiB < min: 786 MiB
Killing process 4084 kworker/u17:3
Out of memory! avail: 157 MiB < min: 786 MiB
Killing process 4084 kworker/u17:3
Could not convert number: Numerical result out of range
But I think earlyoom should never kill kernel threads 😉 and should not quit on that conversion error.
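The message "Numerical result out of range" is the standard text for errno ERANGE, which strtol sets when the parsed value does not fit in a long. As a minimal sketch of handling this without quitting, the helper below returns an error code to the caller instead of exiting. The name parse_meminfo_entry and its signature are hypothetical and chosen for illustration; this is not earlyoom's actual code.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not earlyoom's actual implementation.
 * Finds `name` (e.g. "MemAvailable:") in a /proc/meminfo-style buffer and
 * parses the number that follows it. On success returns 0 and stores the
 * value in *out. On failure returns a negative errno-style code so the
 * caller can decide whether to log, retry, or exit. */
int parse_meminfo_entry(const char *buf, const char *name, long *out)
{
    const char *hit = strstr(buf, name);
    if (hit == NULL) {
        return -ENOENT; /* entry not present in the buffer */
    }

    const char *start = hit + strlen(name);
    char *end = NULL;

    errno = 0; /* strtol reports overflow only through errno, so clear it first */
    long val = strtol(start, &end, 10);
    if (errno == ERANGE) {
        return -ERANGE; /* "Numerical result out of range" */
    }
    if (end == start) {
        return -EINVAL; /* no digits followed the name */
    }

    *out = val;
    return 0;
}
```

A caller could log the failing entry name together with strerror(-rc) and skip that polling cycle, rather than terminating the daemon on one bad read.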
About this issue
- State: closed
- Created 6 years ago
- Comments: 25 (18 by maintainers)
Commits related to this issue
- meminfo: print more specific error message if strtol fails. This condition has actually been triggered by a user: https://github.com/rfjakob/earlyoom/issues/49 (committed to rfjakob/earlyoom by rfjakob 6 years ago)
- get_entry: push strtol error up the stack. We get more useful logging from get_entry_fatal, so let it handle the error. https://github.com/rfjakob/earlyoom/issues/49 (committed to rfjakob/earlyoom by rfjakob 6 years ago)
- userspace_kill: skip kernel threads. Kernel threads have a VmRss of zero, and we cannot kill them anyway, so skip them early. https://github.com/rfjakob/earlyoom/issues/49 (committed to rfjakob/earlyoom by rfjakob 6 years ago)
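The kernel-thread check described in the last commit can be sketched roughly as follows. This assumes the VmRSS field of /proc/<pid>/status is zero or absent for kernel threads, as the commit message suggests. The function names vmrss_kib_from_status and should_skip_candidate are hypothetical, and the sketch works on an already-read status buffer rather than reproducing earlyoom's real code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract VmRSS (in kiB) from the text of
 * /proc/<pid>/status. Returns 0 when no VmRSS line is found, which is
 * the assumed signature of a kernel thread. */
long vmrss_kib_from_status(const char *status)
{
    const char *line = status;

    while (line != NULL && *line != '\0') {
        long kib = 0;
        /* Whitespace in the format matches the tab and spaces after "VmRSS:" */
        if (sscanf(line, "VmRSS: %ld kB", &kib) == 1) {
            return kib;
        }
        /* Advance to the start of the next line, if there is one */
        line = strchr(line, '\n');
        if (line != NULL) {
            line++;
        }
    }
    return 0;
}

/* A process with no resident memory frees nothing when killed,
 * so it is skipped early as a victim candidate. */
bool should_skip_candidate(const char *status)
{
    return vmrss_kib_from_status(status) == 0;
}
```

With a check like this in the victim-selection loop, an entry such as kworker/u17:3 would be skipped before any kill signal is attempted.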
I am afraid the cosmic ray hit us too… 😕
From our logs:
I was just able to reproduce it with debug output:
Debug output suggests that pid 43037/python should be the victim, but pid 43037/kworker/u256:0-btrfs-worker was killed (same pid!). I think what we actually did was start a write process from this Python process, so it does not surprise me too much that some btrfs thingy is triggered, but I don’t understand why it has the same pid.
Meanwhile I’m not sure any longer whether we are seeing the same issue or a different one. I don’t see the error message "Could not convert number: Numerical result out of range", and we don’t have errors like "fopen 11495/oom_score failed: No such file or directory" either.
We have been using v1.0 until now. I will try to reproduce with the latest version and provide more information, but I am not sure the problem is easily reproducible.
Anyway, earlyoom now explicitly skips kernel threads ( https://github.com/rfjakob/earlyoom/commit/58b66a392755b43d256dcaa1c39e548832b0307d ), provides a bit more info when things go wrong ( https://github.com/rfjakob/earlyoom/commit/0f422f5c3db87dae840d1d36a34edafe9a862128 ), and enables the gcc stack protector ( https://github.com/rfjakob/earlyoom/commit/2a9e3b9b66dac0bfe68b5da7b7eac646b20f7324 ) in case we are seeing memory corruption. Maybe you can recompile and see if you get anything like that again?