youtube-dl: [Reddit] 429 Too Many Requests

Checklist

  • I’m reporting a broken site support
  • I’ve verified that I’m running youtube-dl version 2021.06.06
  • I’ve checked that all provided URLs are alive and playable in a browser
  • I’ve checked that all URLs and arguments with special characters are properly quoted or escaped
  • I’ve searched the bugtracker for similar issues including closed ones

Verbose log

[debug] System config: []
[debug] User config: ['-o', '%(title)s.%(ext)s']
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.reddit.com/r/memes/comments/ppbsg1/skyrim_be_like/']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Python version 3.9.7 (CPython) - macOS-10.15.7-x86_64-i386-64bit
[debug] exe versions: ffmpeg 4.4, ffprobe 4.4, rtmpdump 2.4
[debug] Proxy map: {}
[RedditR] ppbsg1: Downloading JSON metadata
ERROR: Unable to download JSON metadata: HTTP Error 429: Too Many Requests (caused by <HTTPError 429: 'Too Many Requests'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "/usr/local/Cellar/youtube-dl/2021.6.6/libexec/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 634, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/local/Cellar/youtube-dl/2021.6.6/libexec/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 2288, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)

Description

  • Reddit now returns 429 Too Many Requests for every video download attempted with youtube-dl. This does not look like ordinary rate limiting, since even the first download attempt fails immediately.
  • The video plays completely fine in browser both while logged in and on fresh private tabs, so this is likely related to Reddit detecting youtube-dl.
  • Using various headers (user agent etc.) to imitate a browser has had no effect.
  • Other generic downloading programs like wget return 200 OK.

About this issue

  • State: open
  • Created 3 years ago
  • Reactions: 15
  • Comments: 30 (4 by maintainers)

Most upvoted comments

@89z your choice of words is inappropriate and hurtful. You’re free to disagree with others, but please consider your language when making comments like that.

@89z I’m not sure why you’re talking about a filter here; I’m just following this thread. You’re right that the original message could have been worded a bit better, but I do agree with the gist of the post. I’m sure your tool is useful as well, but “use another tool” is not a constructive solution to this particular issue.

Cookies aren’t needed; not sure why people are saying that they are:

PS C:\> reddit ppbsg1
GET https://api.reddit.com/by_id/t3_ppbsg1
GET https://v.redd.it/bz6q8olysun71/DASHPlaylist.mpd
GET https://v.redd.it/bz6q8olysun71/DASH_720.mp4

https://github.com/89z/mech

What? Nobody wants to go visit another Github page to decipher your post.

Another easy workaround is to use the old.reddit.com URI; while

youtube-dl -F "https://www.reddit.com/r/Justrolledintotheshop/comments/k1ov8f/customer_installed_the_wrong_battery_and_slammed/" -v => 

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-F', 'https://www.reddit.com/r/Justrolledintotheshop/comments/k1ov8f/customer_installed_the_wrong_battery_and_slammed/', '-v']
[debug] Encodings: locale cp1253, fs utf-8, out utf-8, pref cp1253
[debug] youtube-dl version 2021.06.06+18-git-20210701-ga803582+PR#28955v2
[debug] Python version 3.7.12 (CPython) - Windows-Vista-6.0.6003-SP2
[debug] exe versions: ffmpeg 4.4, ffprobe 4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[RedditR] k1ov8f: Downloading JSON metadata
ERROR: Unable to download JSON metadata: HTTP Error 429: Too Many Requests (caused by <HTTPError 429: 'Too Many Requests'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "youtube_dl\extractor\common.py", line 634, in _request_webpage
  File "youtube_dl\YoutubeDL.py", line 2288, in urlopen
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 641, in http_response
  File "urllib\request.py", line 569, in error
  File "urllib\request.py", line 503, in _call_chain
  File "urllib\request.py", line 649, in http_error_default

fails, the conversion to https://old.reddit.com/r/Justrolledintotheshop/comments/k1ov8f/customer_installed_the_wrong_battery_and_slammed/ succeeds:

youtube-dl -F "https://old.reddit.com/r/Justrolledintotheshop/comments/k1ov8f/customer_installed_the_wrong_battery_and_slammed/" -v => 

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-F', 'https://old.reddit.com/r/Justrolledintotheshop/comments/k1ov8f/customer_installed_the_wrong_battery_and_slammed/', '-v']
[debug] Encodings: locale cp1253, fs utf-8, out utf-8, pref cp1253
[debug] youtube-dl version 2021.06.06+18-git-20210701-ga803582+PR#28955v2
[debug] Python version 3.7.12 (CPython) - Windows-Vista-6.0.6003-SP2
[debug] exe versions: ffmpeg 4.4, ffprobe 4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[RedditR] k1ov8f: Downloading JSON metadata
[Reddit] 13rd33mnun161: Downloading m3u8 information
[Reddit] 13rd33mnun161: Downloading MPD manifest
[info] Available formats for 13rd33mnun161:
format code          extension  resolution note
hls-0-audio_0        mp4        audio only
hls-1-audio_1        mp4        audio only
dash-audio_0_133518  m4a        audio only DASH audio  133k , m4a_dash container, mp4a.40.2 (48000Hz)
hls-347              mp4        146x288     347k , avc1.42001e, video only
hls-440              mp4        146x288     440k , avc1.42001e, video only
dash-video_539885    mp4        122x240    DASH video  539k , mp4_dash container, avc1.4D401E, 30fps, video only
hls-554              mp4        162x320     554k , avc1.42001e, video only
hls-647              mp4        162x320     647k , avc1.42001e, video only
dash-video_748622    mp4        182x360    DASH video  748k , mp4_dash container, avc1.4D401E, 30fps, video only
hls-958              mp4        218x432     958k , avc1.4d001e, video only
hls-1051             mp4        218x432    1051k , avc1.4d001e, video only
dash-video_1131225   mp4        244x480    DASH video 1131k , mp4_dash container, avc1.4D401F, 30fps, video only (best)
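
For anyone scripting this, the host swap shown in the two runs above could be sketched roughly as follows. The helper name to_old_reddit is made up for illustration and is not part of youtube-dl; it only rewrites the URL and assumes old.reddit.com keeps accepting the same path.

```python
from urllib.parse import urlsplit, urlunsplit


def to_old_reddit(url):
    """Return url with a reddit.com / www.reddit.com host swapped for old.reddit.com.

    Hypothetical helper for illustration; URLs on other hosts are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() in ("reddit.com", "www.reddit.com"):
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)
```

The rewritten URL can then be passed to youtube-dl in place of the original one.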

Block 89z right now; don’t listen to that.

@89z I’m not a maintainer of this repo, but I’m free to comment when I believe something is against good taste and respectful manners. Seeing that the comment in question has been removed (either by you or somebody else), I suppose I’m not the only one.

Working on a PR that’ll just generate random text and set it as the cookie. Seems to work.

Also, sites may be fronted through Cloudflare-like services that may apply different opaque policies to police traffic, depending on location, etc.

Regarding the Reddit extractor, as I pointed out before, it’s easy to identify since it gets a URL not used in a browser session (at least, not now), and that’s the one that gets 429-ed.

wow i am stupid, here I thought I was at the github page for yt-dlp lol. thanks for the heads up.

i’ve been getting this error again non stop. Using latest version of yt-dlp with gallery-dl […] My yt-dlp config is this:

#30839 :

If you were actually running yt-dlp, go there. Similarly with other forks,

😜

The extractor tries to load JSON that the browser doesn’t appear to fetch, and yet it still works:

$ youtube-dl -F -v 'https://www.reddit.com/r/196/comments/qfemzy/snail_rule/'
[debug] System config: ['--prefer-ffmpeg']
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-F', '-v', 'https://www.reddit.com/r/196/comments/qfemzy/snail_rule/']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Python version 3.5.2 (CPython) - Linux-4.4.0-210-generic-i686-with-Ubuntu-16.04-xenial
[debug] exe versions: avconv 4.3, avprobe 4.3, ffmpeg 4.3, ffprobe 4.3
[debug] Proxy map: {}
[RedditR] qfemzy: Downloading JSON metadata
[Reddit] wepmtwrd0lv71: Downloading m3u8 information
[Reddit] wepmtwrd0lv71: Downloading MPD manifest
[info] Available formats for wepmtwrd0lv71:
format code          extension  resolution note
hls-0-audio_0        mp4        audio only 
hls-1-audio_0        mp4        audio only 
dash-audio_0_132411  m4a        audio only DASH audio  132k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-video_186902    mp4        240x240    DASH video  186k , mp4_dash container, avc1.4D401E, 30fps, video only
dash-video_314054    mp4        360x360    DASH video  314k , mp4_dash container, avc1.4D401E, 30fps, video only
hls-323              mp4        288x288     323k , avc1.42001e, video only
hls-419              mp4        288x288     419k , avc1.42001e, video only
hls-435              mp4        320x320     435k , avc1.42001e, video only
dash-video_456772    mp4        480x480    DASH video  456k , mp4_dash container, avc1.4D401F, 30fps, video only
hls-531              mp4        320x320     531k , avc1.42001e, video only
hls-541              mp4        432x432     541k , avc1.4d001e, video only
hls-637              mp4        432x432     637k , avc1.4d001e, video only
hls-826              mp4        640x640     826k , avc1.4d001f, video only
dash-video_827970    mp4        720x720    DASH video  827k , mp4_dash container, avc1.4D401F, 30fps, video only
hls-922              mp4        640x640     922k , avc1.4d001f, video only (best)
$

Presumably the failures seen by reporters above would be bypassed if the extractor instead used the JSON sent in the main page in this element: <script id="data">window.___r = {"pages": ...};</script>

This is the .pages.models object in that JSON:

{
      't3_qfemzy': {
        'id': 't3_qfemzy',
        'numComments': 157,
        'created': 1635161292000,
        'score': 7226,
        'distinguishType': null,
        'isLocked': false,
        'isStickied': false,
        'thumbnail': {
          'url': 'https://b.thumbs.redditmedia.com/Al1KpPDlwWnjH9p4ojQeu0_XISR_LUK-NBtdM6F81ps.jpg',
          'width': 140,
          'height': 140
        },
        'title': 'Snail Rule',
        'author': 'numerodezenove',
        'authorId': 't2_dc6e2ide',
        'authorIsBlocked': false,
        'domain': 'v.redd.it',
        'postId': 't3_qfemzy',
        ...
        'media': {
          'obfuscated': null,
          'hlsUrl': 'https://v.redd.it/wepmtwrd0lv71/HLSPlaylist.m3u8?a=1637813642%2CM2UxOThlZDlkNDc2NjkxMjk0ZDkzN2Q3NmM5OTMwNmU1NDY0YjhjZmU3ODdmMGEzZWE5ZGEwZjJmYTZjODVmZQ%3D%3D&v=1&f=sd',
          'dashUrl': 'https://v.redd.it/wepmtwrd0lv71/DASHPlaylist.mpd?a=1637813642%2CYzU3MjljNDMyZTdjNTJkODQ0N2E3NjIwYzY2YTI5MmViNTg3OGJjNGI5MmVlZjY2ZmZkMDcxYzQ1ZjMyZTk2NA%3D%3D&v=1&f=sd',
          'isGif': false,
          'scrubberThumbSource': 'https://v.redd.it/wepmtwrd0lv71/DASH_96.mp4',
          'posterUrl': 'https://external-preview.redd.it/ofgKs5FGw9MkNMVQIZxKAfRiHq0VNFjVpaBJ7_X5XAc.png?width=640&crop=smart&format=pjpg&auto=webp&s=0d527b4dc141e3dae14dbf10af6fee57d957844b',
          'width': 1080,
          'height': 1080,
          'type': 'video'
        },
        'preview': {
          'url': 'https://external-preview.redd.it/ofgKs5FGw9MkNMVQIZxKAfRiHq0VNFjVpaBJ7_X5XAc.png?format=pjpg&auto=webp&s=8ff220d7a1d7ee29cdf12ad85920d757d25869f0',
          'width': 1080,
          'height': 1080
        },
        ...
}
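
A rough sketch of that idea, not the extractor's actual code: pull the assignment out of the script element with a regular expression, parse it as JSON, and read the media URLs. It assumes the page embeds strict JSON (the listing above is a pretty-printed view), that the element looks exactly like the one quoted, and that the key path is pages.models as stated above; the function names are invented for illustration.

```python
import json
import re

# Matches the <script id="data"> element quoted above and captures the object literal.
_DATA_RE = re.compile(
    r'<script id="data">\s*window\.___r\s*=\s*(\{.*?\})\s*;\s*</script>',
    re.DOTALL,
)


def extract_page_data(html):
    """Return the parsed window.___r object, or None if the element is absent."""
    match = _DATA_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))


def find_media_urls(data, post_id):
    """Return (hlsUrl, dashUrl) for post_id, e.g. 't3_qfemzy'; either may be None."""
    models = data.get("pages", {}).get("models", {})
    media = models.get(post_id, {}).get("media") or {}
    return media.get("hlsUrl"), media.get("dashUrl")
```

The regular expression is deliberately simple and would break if the string "};</script>" ever appeared inside the JSON itself.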

The only workaround that works for me right now is to extract the .m3u8 playlist URL directly from the browser: use Inspect Element on the video player, find the data-hls-url attribute of the video-* div, and pass that URL to youtube-dl after stripping the query parameters. Reddit does not seem to be filtering requests for direct resources.
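
Stripping the parameters by hand is error-prone, so here is a minimal sketch of that one step using the standard library. The function name strip_query is hypothetical, and the sketch assumes that everything after the "?" can safely be dropped, as the comment above suggests.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return url without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query="", fragment=""))
```

For example, applying it to a value copied from data-hls-url leaves only the https://v.redd.it/<id>/HLSPlaylist.m3u8 part, which is what gets passed to youtube-dl.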

By removing lines from cookies.txt until the command failed, I have determined that the only necessary cookie is reddit_session. With that cookie present, the download works no matter what value you set it to, as far as I can tell.
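
If that observation holds, a cookies.txt with a single made-up reddit_session entry might be enough. The sketch below writes such a file in the Netscape cookies.txt format and reads it back with the standard library to confirm it parses. The domain .reddit.com, the expiry of 0 and the random hex value are assumptions for illustration, not values confirmed in this thread.

```python
import secrets
from http.cookiejar import MozillaCookieJar


def write_minimal_cookies(path, value=None):
    """Write a cookies.txt containing only reddit_session; return the value used."""
    value = value or secrets.token_hex(16)  # arbitrary random text (assumption)
    # Fields: domain, include-subdomains flag, path, secure flag, expiry, name, value
    fields = [".reddit.com", "TRUE", "/", "TRUE", "0", "reddit_session", value]
    with open(path, "w", encoding="utf-8") as f:
        f.write("# Netscape HTTP Cookie File\n")
        f.write("\t".join(fields) + "\n")
    return value


def load_cookie_names(path):
    """Return the set of cookie names found in a cookies.txt file."""
    jar = MozillaCookieJar(path)
    jar.load(ignore_discard=True, ignore_expires=True)
    return {cookie.name for cookie in jar}
```

The resulting file would be passed to youtube-dl with --cookies; whether Reddit keeps accepting an arbitrary value is untested here.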

Temporary workaround:

  1. Export your logged-in cookies with a browser extension (e.g. Get cookies.txt for Chrome)
  2. Add the cookies to your youtube-dl command as documented in the README (--cookies /path/to/cookies.txt)