youtube-dl: Format selection should not prefer unreliable 'tbr' metadata on YouTube
The “tbr” value in the YouTube JSON information, also shown in the output of the YouTubeDL -F option, contains the boilerplate values 1155 (for format 135) or 2200 and 2310 (for format 136) when the bitrate is actually unknown. These special sentinel values are thus unrelated to the actual bitrate of the video and should not be interpreted as such, especially when comparing to other formats with valid bitrate values.
This pattern can be seen by scrolling through the following files, which aggregate f135/f136 reports for 60,134 YouTube items. The lines are sorted by reported bitrate, and you can note the very large gaps in the rightmost “filesize” column; these gaps are the sections corresponding to the singular suspicious values mentioned above. The number of files in these gap areas is too large for the corresponding value to always be exactly “1155”, “2200”, or “2310” by coincidence.
http://www.blobule.com/webshare/ytf-135.txt http://www.blobule.com/webshare/ytf-136.txt
Furthermore, many of the items advertising the aforementioned values have been spot-checked after download, and their bitrates were found to be unrelated to the values shown.
YouTubeDL currently does not detect and ignore these special-case values. This often causes the -f bestvideo selection, which prioritizes higher bitrate, to choose the wrong format.
Please adjust the bestvideo format selection heuristic so that it ignores the bitrate when these special numeric values are seen, and defers instead to a secondary mechanism, such as pixel dimension, for the bestvideo determination.
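The requested fallback could look roughly like the following. This is only a sketch, not youtube-dl's actual selection code: `SUSPECT_TBR`, `trusted_tbr`, and `pick_bestvideo` are hypothetical names, and the placeholder values are the ones listed in this report. It assumes each format is a dict with `format_id`, `tbr`, `width`, and `height` keys, as in youtube-dl's JSON output.

```python
# Hypothetical sketch; none of these names exist in youtube-dl itself.
# Placeholder tbr values from this report, keyed by format ID.
SUSPECT_TBR = {"135": {1155}, "136": {2200, 2310}}


def trusted_tbr(fmt):
    """Return the format's tbr, or None if it is missing or a known placeholder."""
    tbr = fmt.get("tbr")
    if tbr is None:
        return None
    # Assumption: the placeholder is matched after rounding to whole kbit/s.
    if round(tbr) in SUSPECT_TBR.get(str(fmt.get("format_id")), set()):
        return None
    return tbr


def pixel_count(fmt):
    return (fmt.get("width") or 0) * (fmt.get("height") or 0)


def pick_bestvideo(formats):
    """Prefer the highest tbr, but only if every candidate's tbr is trustworthy."""
    if all(trusted_tbr(f) is not None for f in formats):
        return max(formats, key=lambda f: f["tbr"])
    # At least one bitrate is unknown or a placeholder: defer to pixel dimensions.
    return max(formats, key=pixel_count)


formats = [
    {"format_id": "136", "width": 1280, "height": 720, "tbr": 365},
    {"format_id": "135", "width": 854, "height": 480, "tbr": 1155},
]
print(pick_bestvideo(formats)["format_id"])
```

Falling back for the whole candidate set, rather than per format, avoids comparing a real bitrate against a placeholder at all; other designs are certainly possible.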
At a minimum, bestvideo should corroborate that its final selection is credible vis-a-vis the other candidates. For example, consider YouTube video Ot5jszj_ny8, for which youtube-dl reports the following formats:
format code  extension  resolution  note
139          m4a        audio only  DASH audio   49k , m4a_dash container, mp4a.40.5@ 48k (22050Hz)
140          m4a        audio only  DASH audio  128k , m4a_dash container, mp4a.40.2@128k (44100Hz)
160          mp4        256x144     DASH video  108k , avc1.4d400b, 30fps, video only
134          mp4        640x360     DASH video  117k , avc1.4d401e, 30fps, video only
133          mp4        426x240     DASH video  242k , avc1.4d400c, 30fps, video only
136          mp4        1280x720    DASH video  365k , avc1.4d401f, 30fps, video only
135          mp4        854x480     DASH video 1155k , avc1.4d4014, 30fps, video only
17           3gp        176x144     small , mp4v.20.3, mp4a.40.2@ 24k
36           3gp        320x180     small , mp4v.20.3, mp4a.40.2
43           webm       640x360     medium , vp8.0, vorbis@128k
18           mp4        640x360     medium , avc1.42001E, mp4a.40.2@ 96k
22           mp4        1280x720    hd720 , avc1.64001F, mp4a.40.2@192k (best)
Here -f bestvideo will download format 135, which, at 854×480, has an actual video bitrate of only 145 kb/s (according to MediaInfo), instead of format 136, which is indeed 1280×720 with a considerably better rate of 169 kb/s. Even with different compression levels, it is simply not credible that a format with 55% fewer pixels could have a bitrate 3x higher for the exact same source content.
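The credibility argument can be checked mechanically. The sketch below is an illustration only: `suspicious_pairs` and the `max_ratio` threshold of 3.0 are assumptions introduced here, not anything youtube-dl implements. It uses the five video-only formats from the listing above and flags any pair where the lower-resolution format advertises more than `max_ratio` times the bitrate of the higher-resolution one.

```python
from itertools import combinations

# Video-only formats from the listing above: (format_id, width, height, tbr).
FORMATS = [
    ("160", 256, 144, 108),
    ("134", 640, 360, 117),
    ("133", 426, 240, 242),
    ("136", 1280, 720, 365),
    ("135", 854, 480, 1155),
]


def pixels(fmt):
    _, width, height, _ = fmt
    return width * height


def suspicious_pairs(formats, max_ratio=3.0):
    """Return (smaller_id, larger_id) pairs whose advertised bitrates look inverted."""
    flagged = []
    for a, b in combinations(formats, 2):
        small, large = sorted((a, b), key=pixels)
        # Hypothetical rule: fewer pixels but more than max_ratio times the tbr.
        if pixels(small) < pixels(large) and small[3] > max_ratio * large[3]:
            flagged.append((small[0], large[0]))
    return flagged


# Share of pixels that 854x480 lacks relative to 1280x720, in percent.
fewer = round((1 - (854 * 480) / (1280 * 720)) * 100, 1)
print(fewer)                      # 55.5
print(suspicious_pairs(FORMATS))  # [('135', '136')]
```

A fixed ratio like this would not catch the milder cases, so it is at best a sanity check on the final selection rather than a replacement for detecting the placeholder values.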
Another case, YouTube MUMlwUe-BCo, was reported in #14010:
136 mp4 1280x720 DASH video 1114k , avc1.4d401f, 30fps, video only
135 mp4 854x480 DASH video 1155k , avc1.4d4014, 30fps, video only
Again here, format 136 was dis-preferred by -f bestvideo, but after downloading, the true bitrates were found to be:
- 124 kb/s (as opposed to “1155”) for -f 135
- 128 kb/s (as opposed to “1114”) for -f 136
About this issue
- Original URL
- State: open
- Created 7 years ago
- Reactions: 2
- Comments: 17 (9 by maintainers)
I agree 100%.
I think it’s more likely that YouTube computes the reported tbr value by using the data size field (which appears in the rightmost column) as a numerator. When the data size is not available for some reason, the tbr computation falls back to some kind of extremely crude approximation, possibly using values from a lookup table.
This would explain the clustering of false/invalid tbr values that I noted (i.e., “1155” or “2200”/“2310”). It would also suggest that whenever filesize is not reported, the ‘tbr’ value should be ignored, since in those cases we would possess just as much information as YouTube did, and so we can make our own approximations ‘thank you very much.’
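A minimal sketch of that rule, assuming the hypothesis above is right (it is a guess about YouTube's behaviour, not a confirmed fact). The helper names are hypothetical, the `filesize` and `tbr` keys follow youtube-dl's JSON output, and the sample numbers at the bottom are made up purely to show the arithmetic.

```python
def reported_tbr_if_backed(fmt):
    """Return tbr only when a filesize is also reported; otherwise treat it as unknown."""
    if fmt.get("filesize") is None:
        return None
    return fmt.get("tbr")


def tbr_from_filesize(filesize_bytes, duration_s):
    """Average bitrate in kbit/s (1 kbit = 1000 bits), or None if inputs are missing."""
    if not filesize_bytes or not duration_s:
        return None
    return filesize_bytes * 8 / 1000.0 / duration_s


# Illustrative values only, not taken from any real video.
print(reported_tbr_if_backed({"format_id": "135", "tbr": 1155}))  # None
print(tbr_from_filesize(1500000, 100))                            # 120.0
```

When the size is known, the second helper gives an independent estimate that can be compared against the advertised value after download, much like the MediaInfo spot checks described earlier.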