youtube-dl: --prefer-free-formats does not prefer webm to mp4

$ youtube-dl 'https://www.youtube.com/watch?v=r9rGX91rq5I' -v -F
[debug] System config: [u'--prefer-free-formats']
[debug] User config: []
[debug] Command-line args: [u'https://www.youtube.com/watch?v=r9rGX91rq5I', u'-v', u'-F']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2015.06.04.1
[debug] Python version 2.7.10 - Linux-4.0.5-300.1.kparal.fc22.x86_64-x86_64-with-fedora-22-Twenty_Two
[debug] exe versions: ffmpeg 2.6.3, ffprobe 2.6.3
[debug] Proxy map: {}
[youtube] r9rGX91rq5I: Downloading webpage
[youtube] r9rGX91rq5I: Extracting video information
[youtube] r9rGX91rq5I: Downloading DASH manifest
[info] Available formats for r9rGX91rq5I:
format code  extension  resolution note
171          webm       audio only DASH audio  108k , audio@128k (44100Hz), 3.84MiB
140          m4a        audio only DASH audio  127k , m4a_dash container, aac  @128k (44100Hz), 4.76MiB
278          webm       256x144    DASH video   62k , webm container, VP9, 15fps, video only, 1.23MiB
242          webm       426x240    DASH video  112k , 30fps, video only, 1.88MiB
160          mp4        256x144    DASH video  122k , 15fps, video only, 3.37MiB
243          webm       640x360    DASH video  205k , 30fps, video only, 3.59MiB
244          webm       854x480    DASH video  281k , 30fps, video only, 6.15MiB
134          mp4        640x360    DASH video  297k , 30fps, video only, 3.44MiB
133          mp4        426x240    DASH video  307k , 30fps, video only, 7.54MiB
135          mp4        854x480    DASH video  626k , 30fps, video only, 7.05MiB
247          webm       1280x720   DASH video  645k , 30fps, video only, 14.85MiB
248          webm       1920x1080  DASH video 1204k , 30fps, video only, 27.42MiB
136          mp4        1280x720   DASH video 1286k , 30fps, video only, 14.79MiB
137          mp4        1920x1080  DASH video 2521k , 30fps, video only, 29.69MiB
271          webm       2560x1440  DASH video 3741k , 30fps, video only, 66.79MiB
264          mp4        2560x1440  DASH video 5990k , 30fps, video only, 96.12MiB
313          webm       3840x2160  DASH video 9743k , VP9, 30fps, video only, 157.88MiB
266          mp4        3840x2160  DASH video 10775k , h264, 30fps, video only, 152.25MiB
17           3gp        176x144    
36           3gp        320x240    
5            flv        400x240    
18           mp4        640x360    
43           webm       640x360    
22           mp4        1280x720   (best)

$ youtube-dl 'https://www.youtube.com/watch?v=r9rGX91rq5I' -v --get-format
[debug] System config: [u'--prefer-free-formats']
[debug] User config: []
[debug] Command-line args: [u'https://www.youtube.com/watch?v=r9rGX91rq5I', u'-v', u'--get-format']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2015.06.04.1
[debug] Python version 2.7.10 - Linux-4.0.5-300.1.kparal.fc22.x86_64-x86_64-with-fedora-22-Twenty_Two
[debug] exe versions: ffmpeg 2.6.3, ffprobe 2.6.3
[debug] Proxy map: {}
266 - 3840x2160 (DASH video)+140 - audio only (DASH audio)

Please note that the line [debug] System config: [u'--prefer-free-formats'] shows that I have the preference for free formats enabled, but youtube-dl still wants to download mp4 (266+140).

About this issue

  • State: open
  • Created 9 years ago
  • Reactions: 3
  • Comments: 58 (12 by maintainers)

Most upvoted comments

This should work: ytdl-format="((bestvideo[vcodec^=vp9]/bestvideo)+(bestaudio[acodec=opus]/bestaudio))/best"

This seems like a conflict of interests. If you make this behavior a separate flag, then --prefer-free-formats no longer has a solid reason to exist.

There’s no inherent reason to prefer free formats over proprietary formats other than ideological ones.

The way I see it, the gigantic format string you are describing is born from a need to work-around a different youtube-dl limitation: The fact that it doesn’t understand that VP9 is higher quality than AVC even at lower bitrates.

But why are we spending so much time focusing on a work-around instead of improving the core problem here?

Note: I use a similar such work-around, namely:

(bestvideo[vcodec=vp9]/bestvideo[height>720]/bestvideo[fps>30])[tbr<13000]+(bestaudio[acodec=opus]/bestaudio)/best

But I consider this a pure work-around simply because I really don’t care if my video stream is VP9 or AVC. I’m just special-casing VP9 because of aforementioned youtube-dl bug.

I agree with the comments made by @fireattack above that resolution should be used instead of bitrate for the -f bestvideo determination.

Rather than continue to hijack the --prefer-free- debate, I created #14143 (albeit before discovering the comments I cited here in this unrelated issue).

One unqualified demerit of using bitrate (or, for that matter, any rate) during the bestvideo determination is that the time denominator is irrelevant when comparing alternatives of the same length. Since the length is exactly the same for all the entries in a given youtube-dl -F format listing, you can multiply every rate value in the table by the same duration (the video length) to “simplify,” that is, to cancel mathematically the division which created that “rate.”
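The cancellation argument above can be checked with a small sketch. The bitrates are copied from the -F listing in the report; the duration is a made-up value, since the thread does not state the video length, and `order_by` is just an illustrative helper name.

```python
# Video bitrates (kbit/s) for a few entries of the -F listing above.
tbr_kbps = {"313": 9743, "266": 10775, "271": 3741, "264": 5990}

def order_by(values):
    # Format codes sorted from lowest to highest value.
    return sorted(values, key=values.get)

duration_s = 123.0  # hypothetical video length; any positive value works

# Multiplying every rate by the same duration yields total kilobits.
total_kbit = {code: rate * duration_s for code, rate in tbr_kbps.items()}

same_order = order_by(tbr_kbps) == order_by(total_kbit)
print(same_order)  # True
```

Ranking by bitrate and ranking by total data therefore give the same order for one video, which is the sense in which the time denominator carries no extra information here.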

Given a “most pristine” master copy which must exist for every video, entropy can be dissipated in several dimensions, but duration time is not one of them. Either the compression algorithms are more/less efficient, or the frame dimensions or frame rate are different. And it seems to me that given the special constraint discussed above, the latter two will swamp the former in all cases, making some sensible combination of dimension and fps a much more stable indicator of “best.”

I think that, possibly, people have gotten in the mindset of elevating bitrate because it is quite likely a better indicator of quality when comparing dissimilar source material, but this habit may be obscuring better judgment when it comes to the specialized task of comparing all-alike material.

Irrefutably, really, bitrate is just a needlessly complex proxy for the total entropy each available format ultimately makes available. Add in the fact that the YouTube reported bitrate values are often wildly wrong (in fact far more so than this thread has reported; see #14143) and it’s clear that by the nature of this type of selection task, bitrate is not the way to go.

So, as I’ve noted, the -f bestvideo determination, like tasks in general, can and should capitalize on the special properties that are particular to it. In this case, that means recognizing time-denominated metrics as mathematically vacuous and therefore an unnecessary complication: at best a harmless obfuscation, and at worst a tangibly larger error surface.

One further argument - some Linux distributions like Fedora add --prefer-free-formats to the default configuration file, see here: http://pkgs.fedoraproject.org/cgit/youtube-dl.git/tree/youtube-dl.conf

The reason is that h264/aac/other codecs are neither installed by default nor offered in the repositories, because of patent issues. So the wish of the youtube-dl package maintainers is to make as many videos as possible playable by default for all Fedora/other distro users. That means they want to configure youtube-dl to download free formats every time it is possible to do so. That way it is really useful for all users, and not just for those who live outside the USA and other countries where software patents apply and who have installed proprietary codecs. They don’t want to require free formats, because downloading something is better than downloading nothing, but they want to prefer them as much as possible.

The bitrate will never be exactly the same; it will always differ slightly. But a higher bitrate doesn’t imply higher quality, because that depends on the codec. So that assumption is somewhat off.

But as @rg3 says, that option is not about quality, at least as it is currently documented. It says “pick the free format, if it is available”. It does not speak about quality at all. So maybe the current implementation is expected by the developers, but certainly not by the users.

And, if only quality is concerned, what would be the point of such an option? If mp4 has a slightly better bitrate, it is picked. If webm has a slightly better bitrate, it is picked. Such an option would have no effect at all, because it is already the default behavior!

So really, this option should mean “I always want free formats, if they are available, otherwise fall back to proprietary formats”. That’s how I want to configure youtube-dl. Please make it possible. Thank you.

This is expected behavior. --prefer-free-formats is only respected when non-free and free formats are of the same quality. 266 (mp4) has higher bitrate than 313 (webm) therefore is selected. You may want to use -f bestvideo[ext=webm]+bestaudio[ext=webm]/best[ext=webm].
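A minimal sketch of the behavior described in this comment, assuming that formats are compared by bitrate first and that the free-format preference only breaks exact ties. This is an illustration, not youtube-dl’s actual code; `Fmt`, `FREE_EXTS` and `rank_key` are hypothetical names, and the bitrates come from the -F listing in the report.

```python
from collections import namedtuple

# Hypothetical record type; field names are illustrative only.
Fmt = namedtuple("Fmt", "code ext tbr")

FREE_EXTS = {"webm"}  # assumption: only webm counts as free here

def rank_key(fmt, prefer_free):
    # Bitrate dominates; the free-format flag only breaks exact ties.
    is_free = 1 if (prefer_free and fmt.ext in FREE_EXTS) else 0
    return (fmt.tbr, is_free)

formats = [
    Fmt("313", "webm", 9743),
    Fmt("266", "mp4", 10775),
]

best = max(formats, key=lambda f: rank_key(f, prefer_free=True))
print(best.code)  # 266: the higher bitrate wins despite prefer_free
```

Under this assumption the preference would only matter when two formats report exactly the same bitrate.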

After reading all these old issues, I assume this is the right place to discuss the “bitrate vs resolution” debate.

My take on this, at least for YouTube, is that resolution is a better indicator, because:

1) The bitrate reported by YouTube is very unreliable and often quite far off, so we shouldn’t rely on it no matter what.

One example is what I reported in #12198. Video https://www.youtube.com/watch?v=iMhwshjSauo (warning: video is not available in the US and 4 other countries) reports:

135          mp4        720x480    DASH video  198k , avc1.4d401e, 30fps, video only
133          mp4        360x240    DASH video  242k , avc1.4d400c, 30fps, video only

However, the actual bitrate is 112 kbps (480p) and 32.2 kbps (240p), respectively. As you can see, not only are the numbers off, but the order between the formats is reversed as well.
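The comment does not say how the actual bitrates were measured. One common way to estimate an average bitrate is to divide the stream size by its duration, as sketched below. The function name and the numbers in the example are hypothetical and are not taken from the video in question.

```python
def average_kbps(size_bytes, duration_s):
    # Average bitrate in kbit/s (1 kbit = 1000 bits).
    return size_bytes * 8 / duration_s / 1000

# Hypothetical example: a 1,400,000-byte stream lasting 100 seconds.
print(average_kbps(1_400_000, 100))  # 112.0
```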

Another example is in #11451.

2) Since, in my experience, YouTube never upscales videos (correct me if I’m wrong), the higher-resolution one should always have better quality (if the encoding didn’t go wrong), since it literally has more raw information.
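A sketch of what a resolution-first ranking could look like, using four rows from the -F listing in the report. The ordering of the key (pixel count, then frame rate, then the free-format preference, then bitrate) is an assumption made for illustration and is not how youtube-dl ranks formats; `resolution_key` is a hypothetical name.

```python
# (code, ext, width, height, fps, tbr) taken from the -F listing above.
formats = [
    ("313", "webm", 3840, 2160, 30, 9743),
    ("266", "mp4", 3840, 2160, 30, 10775),
    ("271", "webm", 2560, 1440, 30, 3741),
    ("264", "mp4", 2560, 1440, 30, 5990),
]

def resolution_key(fmt, prefer_free=True):
    code, ext, width, height, fps, tbr = fmt
    is_free = prefer_free and ext == "webm"
    # Pixel count and frame rate first; bitrate is only the last resort.
    return (width * height, fps, is_free, tbr)

best = max(formats, key=resolution_key)
print(best[0])  # 313
```

With bitrate demoted to the last position, the two 2160p entries tie on resolution and frame rate, so a free-format preference gets a chance to decide.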

I always thought that --prefer-free-formats would prefer free formats over proprietary formats at all costs. It is in this issue that I learned it does not. (Not from the documentation!)

If I’m specifying this option, odds are that I either:

  1. Want to use a free format for ideological reasons
  2. Don’t have proprietary codecs supported on my system

So I think it’s fair to choose a lower-quality free format over a higher-quality proprietary format when this option is specified.

I believe (so basically I’m inventing stuff as I write, haha) people using this option are probably interested in downloading free formats if available, just like the current option documentation suggests.

So if a given website (YouTube or other) offers videos in either MP4 or Webm, using --prefer-free-formats should pick a Webm one, and should only download MP4 if no free formats are available.

Once that “format filter” has been applied, it should choose the highest quality one, or obey other format restrictions the user may have specified.
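The two-stage idea described above (restrict to free formats when any exist, then pick the best of what is left) might be sketched as follows. This is only an illustration under stated assumptions: `pick`, `FREE_EXTS` and the `max_height` restriction are hypothetical, and the quality ordering by height and bitrate is a placeholder for whatever ranking is in use.

```python
FREE_EXTS = {"webm", "ogg"}  # assumption: which containers count as free

def pick(formats, prefer_free=True, max_height=None):
    # Apply any user restriction first (here: an optional height cap).
    pool = [f for f in formats
            if max_height is None or f["height"] <= max_height]
    if prefer_free:
        free = [f for f in pool if f["ext"] in FREE_EXTS]
        # Only fall back to non-free formats when no free one exists.
        pool = free or pool
    return max(pool, key=lambda f: (f["height"], f["tbr"]), default=None)

formats = [
    {"code": "313", "ext": "webm", "height": 2160, "tbr": 9743},
    {"code": "266", "ext": "mp4", "height": 2160, "tbr": 10775},
    {"code": "137", "ext": "mp4", "height": 1080, "tbr": 2521},
]
print(pick(formats)["code"])                     # 313
print(pick(formats, prefer_free=False)["code"])  # 266
print(pick(formats, max_height=1080)["code"])    # 137
```

The `prefer_free=False` call plays the role of a per-invocation override of a config-file setting.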

But then we can raise other questions. For example, someone may want to have --prefer-free-formats in the config file, but in a specific case (s)he may want to specify a format filter that wouldn’t work with that, so they need a second option to override the config file setting in a given program call (as in --no-prefer-free-format).

Just my two cents. You know I don’t want to have any voting weight in the development process and I don’t even use that option, so it’s irrelevant to me. 😛

@aufkrawall Yeah, it gets identified as vp9.2

I use this:

ytdl-format=(bestvideo[vcodec=vp9.2]/bestvideo[vcodec=vp9][fps>30]/bestvideo[vcodec=vp9][height>=1080]/bestvideo[fps>30]/bestvideo[height>720])+(bestaudio[acodec=opus]/bestaudio)/best

I should mention that in the case I posted above, it will only select opus when a video is found; for pure audio streaming sites it falls back to “best”. But you can add more and more terms at any time: (bestaudio[acodec=opus]/bestaudio[acodec=whateverelse]/bestaudio[acodec=yetanothercodec]/bestaudio)
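Since these fallback chains all follow the same pattern, they can be assembled by a small helper instead of being typed by hand. The sketch below only builds the string; `alternatives` is a hypothetical helper, and the filters shown are the ones already used in this thread.

```python
def alternatives(selector, filters):
    # Join filtered selectors with "/" and end with the unfiltered one,
    # so the chain falls through from most to least specific.
    parts = [f"{selector}[{flt}]" for flt in filters]
    parts.append(selector)
    return "(" + "/".join(parts) + ")"

video = alternatives("bestvideo", ["vcodec=vp9"])
audio = alternatives("bestaudio", ["acodec=opus"])
print(f"{video}+{audio}/best")
# (bestvideo[vcodec=vp9]/bestvideo)+(bestaudio[acodec=opus]/bestaudio)/best
```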

Which is why the only sane thing to do (IMO) is remove this flag entirely and provide an --only-free-formats flag.

No, and you are an asshole. You know that a bunch of people want to be able to have the highest quality video, and have the free format of that quality, and that this flag is very useful to them. There is no way to replace this tag without a massively convoluted set of preferences, whereas it’s trivial to only select the free formats with a comparatively simple “f=bestaudio[ext=webm]+bestvideo[ext=webm]/best[ext=webm]”.

You are purposefully ignoring that people actually exist who want this flag, and belittling their preferences in favor of your own, simply because you think your preferences are magically better because you have them.

I am done wasting my time on you, because you adamantly refuse to accept that anyone else has legitimate preferences of their own.

That should probably be a separate flag, such as “--only-free-formats”.

The main issue was that while getting only free formats was relatively easy (“-f bestvideo[ext=webm]+bestaudio[ext=webm]/best[ext=webm]”), it got really complicated if you wanted to get the highest-resolution video and then prefer the free format if there was one, which would lead to such horrors as: -f bestvideo[ext=webm,height=1080]+bestaudio[ext=webm]/bestvideo[ext=webm,height=720]+bestaudio[ext=webm]/bestvideo[ext=webm,height=640]+bestaudio[ext=webm]/bestvideo[ext=webm,height=480]+bestaudio[ext=webm]/bestvideo[ext=webm,height=240]+bestaudio[ext=webm]/best[ext=webm]

with entries for every possible video size you could expect. This was further complicated by the fact that youtube periodically introduces higher resolutions (such as 1440), and things like 3D videos, which mean you have to continually retune the script to add new resolutions to a massive list. (Edit: Or by making a custom script that runs -F and parses the text looking for the highest res free format, which is another undertaking that’s much more effort than most people can or should be bothered with.)
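For illustration, the custom-script approach mentioned in the edit could look roughly like this. It is a sketch that assumes the -F output keeps the column layout shown in the report, which is not a stable interface; `ROW` and `best_free_video` are hypothetical names, and the sample rows are copied from the listing above.

```python
import re

# Matches rows such as "313  webm  3840x2160  DASH video 9743k , ...".
ROW = re.compile(r"^(\d+)\s+(\w+)\s+(\d+)x(\d+)\s+DASH video")

def best_free_video(listing, free_exts=("webm",)):
    best = None
    for line in listing.splitlines():
        m = ROW.match(line.strip())
        if not m or m.group(2) not in free_exts:
            continue
        code, height = m.group(1), int(m.group(4))
        if best is None or height > best[1]:
            best = (code, height)
    return best[0] if best else None

sample = """\
271          webm       2560x1440  DASH video 3741k , 30fps, video only, 66.79MiB
313          webm       3840x2160  DASH video 9743k , VP9, 30fps, video only, 157.88MiB
266          mp4        3840x2160  DASH video 10775k , h264, 30fps, video only, 152.25MiB
22           mp4        1280x720   (best)
"""
print(best_free_video(sample))  # 313
```

The returned code could then be passed to -f together with an audio format, which is exactly the kind of extra plumbing the comment argues users should not need.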

Thanks @jaimeMF my interpretation of --prefer-free-formats is the same. However, I can’t speak for others because I don’t use it.

Prioritizing resolution over bitrate is reasonable. Technically, free formats often have lower bitrates than non-free formats. Take VP9 for example: it aims to “Reduce video bitrate by 50% with image quality comparable to VP” (VP9 requirements). In other words, bitrate is not a synonym for quality. In real cases, it’s uncommon for VP9 to have a higher bitrate than H264. As a result, relying on bitrate is problematic.

I’ve already stated the purpose of it - downloading free format when both formats are equal according to metadata describing quality. youtube-dl supports not only YouTube so the probability is much higher especially if some metadata is not available.

So IIUIC --prefer-free-formats has no effect on YouTube videos, because full metadata is available, and the videos are always at least slightly different (the bitrate is never the same), so it never happens that this option is taken into account. It can happen on other video sites, which don’t provide full metadata, but not on YouTube. That’s… suboptimal.

And it doesn’t help at all those users who don’t have proprietary codecs installed and want to download free formats without hand-picking them every time.

@dstftw For you, quality might be the most important factor. But this belief is not universal and other groups of users might have different priorities. Like open formats. I’m one of them, I believe that patented formats need to be purged from this world and I’ll gladly sacrifice some quality because of it. It would be great to have an option which would allow such groups of users to use youtube-dl easily. It can be a new option, if needed (even though in my personal opinion it would make most sense to just use the current one, especially since it currently doesn’t seem to have any effect on the largest and most widely used video site).

@kparal I’ve already stated the purpose of it - downloading free format when both formats are equal according to metadata describing quality. youtube-dl supports not only YouTube so the probability is much higher especially if some metadata is not available.

What is the logic when that option is not given - a random format is picked, or always a proprietary one?

Huh? Not a random but best quality format according to metadata. @jaimeMF for YouTube giving less priority to filesize and bitrate is likely to be reasonable, but in general probably not.

What I’m saying is that there’s no easy way to prefer free formats (regardless of quality, literally prefer) without writing an explosive set of command line arguments. That’s also the case of aforementioned #2196 - it gives a workaround, but it’s neither simple nor clean, if you want to cover all the cases. (And I don’t think anything was rejected - “you are welcome to contribute a patch that adds one”).

@dstftw Can you please clarify what is the purpose of --prefer-free-formats as you see it? I still don’t understand it. Does it work only for formats with exactly the same resolution/fps/bitrate? Doesn’t that have like 0.00001% chance of occurring? Does it make sense to have an option that almost never applies, because bitrate is never exactly the same? What is the logic when that option is not given - a random format is picked, or always a proprietary one?

In my understanding “require” has different meaning than “prefer”. When you require a free format, you pick the best free format, and if it is not available, you fail with an error (“free format required, but none available”). When you prefer a free format, you pick the best free format, and if it is not available, you pick the best non-free format.
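The distinction drawn here can be written down directly. The sketch below is hypothetical (neither mode exists in this form in youtube-dl according to the thread); `select`, `quality` and `FREE_EXTS` are illustrative names, and the single mp4 entry is format 137 from the listing in the report.

```python
FREE_EXTS = {"webm"}  # assumption for this sketch

def quality(fmt):
    return (fmt["height"], fmt["tbr"])

def select(formats, mode):
    free = [f for f in formats if f["ext"] in FREE_EXTS]
    if free:
        return max(free, key=quality)
    if mode == "require":
        raise LookupError("free format required, but none available")
    # mode == "prefer": fall back to the best non-free format.
    return max(formats, key=quality)

only_mp4 = [{"code": "137", "ext": "mp4", "height": 1080, "tbr": 2521}]
print(select(only_mp4, "prefer")["code"])  # 137
```

Calling `select(only_mp4, "require")` raises the error instead of falling back, which is the whole difference between the two modes.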