test-infra: [Kettle] Process hangs when generating json.gz for 'all' table
What would you like to be added: Add more compute to Kettle instances, or determine whether CPU contention is responsible for the slowdown/lockup.
Why is this needed:
```
top - 10:34:45 up 2 days, 5:18,  0 users,  load average: 1.16, 1.19, 1.18
Tasks:   8 total,   2 running,   6 sleeping,   0 stopped,   0 zombie
%Cpu(s): 13.4 us,  0.3 sy,  0.0 ni, 86.2 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 53588024 total,  2290884 free,  7649364 used, 43647776 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 45492108 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  619 root      20   0 5952940 5.110g  42432 R 100.0 10.0   2897:56 pypy3
    1 root      20   0   18384   3096   2820 S   0.0  0.0   0:00.02 runner.sh
   40 root      20   0   26768   9056   5204 S   0.0  0.0   0:00.03 python3
  618 root      20   0    4636    836    768 S   0.0  0.0   0:00.00 sh
  620 root      20   0    4700    828    768 S   0.0  0.0   1:05.94 pv
  621 root      20   0    4792   1528   1252 S   0.0  0.0   0:01.64 gzip
  652 root      20   0   36628   3092   2644 R   0.0  0.0   0:00.03 top
  643 root      20   0   18512   3404   3028 S   0.0  0.0   0:00.00 bash
```
It seems that Kettle Prod is hitting CPU limits when trying to build the JSON. Update cycles take extremely long to complete, and the process now appears to freeze entirely rather than update, "catching" on specific builds; the logs seem to end there:
```
Error while reading data, error message: JSON parsing error in row starting at position 605377307: Parser terminated before end of string
Error while reading data, error message: JSON parsing error in row starting at position 752722019: Parser terminated before end of string
Error while reading data, error message: JSON parsing error in row starting at position 1126381032: Parser terminated before end of string
```
```
ERROR:root:error on gs://pivotal-e2e-results/kubo-windows-2019/1553782223
Traceback (most recent call last):
  File "make_json.py", line 281, in make_rows
    yield rowid, row_for_build(path, started, finished, results)
  File "make_json.py", line 254, in row_for_build
    build = Build.generate(path, tests, started, finished, metadata, repos)
  File "make_json.py", line 94, in generate
    build = cls(path, tests)
  File "make_json.py", line 90, in __init__
    self.populate_path_to_job_and_number()
  File "make_json.py", line 112, in populate_path_to_job_and_number
    raise ValueError(f'unknown build path for {self.path} in known bucket paths')
ValueError: unknown build path for gs://pivotal-e2e-results/kubo-windows-2019/1553782223 in known bucket paths
```
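One way to keep a single unknown bucket path from aborting the whole update cycle would be to filter such builds out before row generation. This is a hypothetical sketch, not the actual fix: the prefix list and the tuple shape of each build record are assumed for illustration, loosely mirroring the `make_rows` loop in `make_json.py`.

```python
# Hypothetical sketch: skip builds whose GCS path is not under a known
# bucket prefix, instead of letting the ValueError abort the cycle.
# KNOWN_BUCKET_PREFIXES and the build-record shape are assumptions.
import logging

KNOWN_BUCKET_PREFIXES = ('gs://kubernetes-jenkins/',)  # illustrative only

def make_rows(builds):
    """Yield (rowid, build) pairs, logging and skipping unknown paths."""
    for rowid, build in builds:
        path = build[0]
        if not path.startswith(KNOWN_BUCKET_PREFIXES):
            logging.warning('skipping unknown build path %s', path)
            continue
        yield rowid, build
```

With this in place, a stray path like `gs://pivotal-e2e-results/...` would be logged and skipped rather than raising.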
/area kettle
/assign
About this issue
- State: closed
- Created 4 years ago
- Comments: 15 (13 by maintainers)
I think I will try something like this to avoid the STDOUT issues https://stackoverflow.com/questions/49534901/is-there-a-way-to-use-json-dump-with-gzip
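The approach in that Stack Overflow answer can be sketched roughly as follows: write JSON records directly into a gzip stream instead of piping stdout through `pv | gzip`. The function name and the one-record-per-line (JSON Lines) layout are assumptions for illustration, not Kettle's actual implementation.

```python
# Minimal sketch, assuming a JSON Lines layout: json.dump writes each
# record straight into a gzip text stream, avoiding the stdout pipe.
import gzip
import json

def dump_json_gz(rows, out_path):
    # 'wt' opens the gzip stream in text mode so json.dump can write
    # to it directly; each record goes on its own line.
    with gzip.open(out_path, 'wt', encoding='utf-8') as f:
        for row in rows:
            json.dump(row, f)
            f.write('\n')
```

Reading it back is the mirror image: `gzip.open(path, 'rt')` and `json.loads` per line.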