lpms: Transcoding hangs under load
While doing throughput testing, I hit a situation where transcoding hangs (reproduced three times).
The test was transcoding one generated 10-minute video to seven 720p renditions in ten streams simultaneously. Judging by the output files, most were transcoded completely, but some were not. nvidia-smi shows that the Livepeer process is still using the GPU, but the incomplete output files are not growing in size.
I saw this on a GCE instance, which is now stopped; I've left it intact.
To reproduce:
- Start the Ivan-gpu-p100 instance
- go to /home/dark/go-livepeer
- run bench.sh
- wait 12 minutes
- look at the /disk-1-temp directory: there will be 70 output files, most of them complete (same size) and some smaller, and those smaller files are not growing in size.
About this issue
- State: open
- Created 5 years ago
- Comments: 15 (15 by maintainers)
Traced it down to this line:
https://github.com/FFmpeg/FFmpeg/blob/f7f4691f9f745f6a087879ab855dd65f9f16879d/libavcodec/nvdec.c#L162
It enters cuvidDestroyDecoder (a function in the Nvidia library) and never exits. So, this is either:
As the next step, I'll try Nvidia's beta driver.
hangstack.txt