yt-dlp: Crunchyroll error 403: Forbidden

Region

Australia

Description

Whenever I try to download from Crunchyroll, yt-dlp always fails with 403: Forbidden errors, even when I pass a user agent and a cookie file. This did not happen until recently, so I think it may be caused by a change on the site. Before, I could download without a user agent or cookie file.

However, I can still download the site's webpages with wget just by setting a user agent, so it could also be an issue within yt-dlp itself.

Verbose log

$ yt-dlp -Uv --user-agent 'Mozilla/5.0 (X11; Linux x86_64; rv:94.0) Gecko/20100101 Firefox/94.0' --cookies '~/cookies.txt' 'https://www.crunchyroll.com/jojos-bizarre-adventure/episode-2-a-letter-from-the-past-652083'
[debug] Command-line config: ['-Uv', '--user-agent', 'Mozilla/5.0 (X11; Linux x86_64; rv:94.0) Gecko/20100101 Firefox/94.0', '--cookies', '~/cookies.txt', 'https://www.crunchyroll.com/jojos-bizarre-adventure/episode-2-a-letter-from-the-past-652083']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, err utf-8, pref UTF-8
[debug] yt-dlp version 2021.12.01 [91f071af6]
[debug] Python version 3.9.9 (CPython 64bit) - Linux-5.15.6-arch2-1-x86_64-with-glibc2.33
[debug] exe versions: ffmpeg 4.4.1 (setts), ffprobe 4.4.1
[debug] Optional libraries: Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
Latest version: 2021.12.01, Current version: 2021.12.01
yt-dlp is up to date (2021.12.01)
[debug] [crunchyroll] Extracting URL: https://www.crunchyroll.com/jojos-bizarre-adventure/episode-2-a-letter-from-the-past-652083
[crunchyroll] 652083: Downloading webpage
ERROR: [crunchyroll] 652083: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp . Make sure you are using the latest version; see  https://github.com/yt-dlp/yt-dlp  on how to update. Be sure to call yt-dlp with the --verbose flag and include its complete output. (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp . Make sure you are using the latest version; see  https://github.com/yt-dlp/yt-dlp  on how to update. Be sure to call yt-dlp with the --verbose flag and include its complete output.
  File "/usr/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 715, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 3385, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.9/urllib/request.py", line 555, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 747, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 38 (15 by maintainers)

Most upvoted comments

So, it seems I know where the problem is. I still need to figure out how to submit the fix here properly, but first, in plain text:

In crunchyroll.py in the extractors, on lines 731-736 (where we get the params for the auth):

        bucket = policy_response['cms']['bucket']
        params = {
            'Policy': policy_response['cms']['policy'],
            'Signature': policy_response['cms']['signature'],
            'Key-Pair-Id': policy_response['cms']['key_pair_id']
        }

the reply JSON changed (or moved) to a beta state (they have 2 versions there now). If those lines are changed to:

        bucket = policy_response['cms_beta']['bucket']
        params = {
            'Policy': policy_response['cms_beta']['policy'],
            'Signature': policy_response['cms_beta']['signature'],
            'Key-Pair-Id': policy_response['cms_beta']['key_pair_id']
        }

everything works fine, as far as I could tell from a few tests.

I got a 403 the first time, and so far my experiments have gone this way:

  • I tried with my credentials (using -u -p) over a VPN (different IP) and got the same 403 error
  • I tried on my regular IP without my credentials, with success

So far, my conclusion is that either premium/higher quality data access is bugged or there is some account lock, which I hope is not permanent.

EDIT.

Further investigation showed something interesting: when you try getting the JSON without connecting, yt-dlp is redirected to the non-beta site (https://www.crunchyroll.com/). So it is also possible that the issue depends on whether the data comes from the beta (with error) or non-beta (fine) site/backend.

I exposed loads of useful information but the developer just closed my thread for “Duplicate” (with quotes

I closed it as duplicate, because it is a duplicate. In fact, it was supposed to be YOUR responsibility to search for duplicates before opening the issue. Be glad I pointed you to the issue it is a duplicate of! The “useful information” you presented is still there, it’s not like I deleted it… Anyone can reference it anytime.

PS: The cant-reproduce label is there because it is not reliably reproducible. I cannot check the specific URL you gave because of geo-blocking, but I have never managed to reproduce this with URLs I can access. That ofc doesn’t mean the issue doesn’t exist. Just that any developers need to go the extra mile to first reproduce this before even attempting to fix it.

because maybe we are not suffering the same problem, in fact I’m not using any kind of VPN and the problem is still not identified for he and for me) without even offering an answer.

It is the exact same issue. The only difference is that the OP of this issue is “blocked” due to VPN and you are blocked due to excessive downloading.

I don’t think we can expect the developer to try to help us. But the developer is the only one that can implement a fix at all…

This is a FOSS project. Anyone can try to figure out a fix and open a PR. If you are expecting me to personally fix every issue in each of the 1500+ websites supported by yt-dlp in addition to maintaining the core project/reviewing PRs, prepare to be disappointed.

I’m getting this HTTP Error 403: Forbidden for the first time ever today. I had no problems yesterday, and I was only trying to download a single episode today, definitely more than 24 hours after my last download yesterday. I hope this is a fixable issue.

EDIT

I also get the same error when simply listing formats.

F:\YT>yt-dlp.exe --verbose --cookies cookies.txt "https://beta.crunchyroll.com/watch/GJWU25P4E/the-beginning-of-winter" -F
[debug] Command-line config: ['--verbose', '--cookies', 'cookies.txt', 'https://beta.crunchyroll.com/watch/GJWU25P4E/the-beginning-of-winter', '-F']
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, err utf-8, pref cp1252
[debug] yt-dlp version 2022.04.08 [7884ade] (win_exe)
[debug] Python version 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg 2022-02-14-git-59c647bcf3-full_build-www.gyan.dev (setts), ffprobe 2022-02-14-git-59c647bcf3-full_build-www.gyan.dev
[debug] Optional libraries: brotli, certifi, Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
[debug] [crunchyroll:beta] Extracting URL: https://beta.crunchyroll.com/watch/GJWU25P4E/the-beginning-of-winter
[crunchyroll:beta] Retrieving main page
[crunchyroll:beta] Authenticating with cookie
[crunchyroll:beta] Retrieving signed policy
[crunchyroll:beta] the-beginning-of-winter: Retrieving episode metadata
ERROR: [crunchyroll:beta] GJWU25P4E: Unable to download JSON metadata: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
  File "yt_dlp\extractor\common.py", line 641, in extract
  File "yt_dlp\extractor\crunchyroll.py", line 818, in _real_extract
  File "yt_dlp\extractor\common.py", line 1029, in _download_json
  File "yt_dlp\extractor\common.py", line 1008, in _download_json_handle
  File "yt_dlp\extractor\common.py", line 800, in _download_webpage_handle
  File "yt_dlp\extractor\common.py", line 785, in _request_webpage

  File "yt_dlp\extractor\common.py", line 767, in _request_webpage
  File "yt_dlp\YoutubeDL.py", line 3601, in urlopen
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 569, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 649, in http_error_default
urllib.error.HTTPError: HTTP Error 403: Forbidden

OP had reported this issue since December and even ElektroStudios’s issue was only temporary. Unless Crunchyroll was A/B testing the change, there may be 2 separate issues at play here.

@Burve For the fix, you’ll want to make the extractor look at both keys using traverse_obj
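To illustrate what "look at both keys" could mean, here is a minimal plain-Python sketch that prefers the 'cms_beta' block and falls back to 'cms'. It does not use yt-dlp's traverse_obj, so it can run on its own. The helper names pick_cms and build_auth are hypothetical, and the response shape is assumed from the snippets posted in this thread, not taken from the actual extractor.

```python
def pick_cms(policy_response, keys=('cms_beta', 'cms')):
    """Return the first non-empty block found under any of `keys`.

    `keys` is an assumption based on this thread: the newer replies
    carry 'cms_beta', the older ones carry 'cms'.
    """
    for key in keys:
        block = policy_response.get(key)
        if block:
            return block
    raise KeyError(f'none of {keys} found in policy response')


def build_auth(policy_response):
    """Build the bucket and signed-policy query params from either block."""
    cms = pick_cms(policy_response)
    bucket = cms['bucket']
    params = {
        'Policy': cms['policy'],
        'Signature': cms['signature'],
        'Key-Pair-Id': cms['key_pair_id'],
    }
    return bucket, params
```

With this shape, a reply that only has 'cms' and a reply that only has 'cms_beta' both produce the same kind of (bucket, params) pair, so the extractor would not depend on which variant the site returns.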

I believe this most recent issue to pop up may not be the same as the initial one. According to another project I follow, Crunchyroll has made changes to their API and added a new subdomain. Hopefully this information is helpful.

EDIT.

Further investigation showed something interesting: when you try getting the JSON without connecting, yt-dlp is redirected to the non-beta site (https://www.crunchyroll.com/). So it is also possible that the issue depends on whether the data comes from the beta (with error) or non-beta (fine) site/backend.

I’ve been able to recreate the same issue. Also, when I use cookies or credentials against the non-beta URL, yt-dlp gets redirected to the beta site and I get the 403 error.

[debug] Command-line config: ['--verbose', '-F', '--username', 'PRIVATE', '--password', 'PRIVATE', 'https://www.crunchyroll.com/spy-x-family/operation-strix-842454']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, err utf-8, pref UTF-8
[debug] yt-dlp version 2022.04.08 [7884ade65] (zip)
[debug] Python version 3.8.10 (CPython 64bit) - Linux-5.4.0-109-generic-x86_64-with-glibc2.29
[debug] Checking exe version: ffprobe -bsfs
[debug] Checking exe version: ffmpeg -bsfs
[debug] exe versions: ffmpeg 4.2.4, ffprobe N-103927-g68815d6791
[debug] Optional libraries: certifi-2019.11.28, pycrypto-2.6.1, secretstorage-2.3.1, sqlite3-2.6.0
[debug] Proxy map: {}
[crunchyroll] Getting session id
[crunchyroll] Logging in
[debug] [crunchyroll] Extracting URL: https://www.crunchyroll.com/spy-x-family/operation-strix-842454
[crunchyroll] 842454: Downloading webpage
[crunchyroll] 842454: Redirected to beta site - https://beta.crunchyroll.com/watch/GMKUX838J/operation-strix
[debug] [crunchyroll:beta] Extracting URL: https://beta.crunchyroll.com/watch/GMKUX838J/operation-strix
[crunchyroll:beta] Retrieving main page
[crunchyroll:beta] Authenticating with cookie
[crunchyroll:beta] Retrieving signed policy
[crunchyroll:beta] operation-strix: Retrieving episode metadata
ERROR: [crunchyroll:beta] GMKUX838J: Unable to download JSON metadata: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
  File "/usr/local/bin/youtube-dl/yt_dlp/extractor/common.py", line 642, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/yt_dlp/extractor/crunchyroll.py", line 815, in _real_extract
    episode_response = self._download_json(
  File "/usr/local/bin/youtube-dl/yt_dlp/extractor/common.py", line 1030, in _download_json
    res = self._download_json_handle(
  File "/usr/local/bin/youtube-dl/yt_dlp/extractor/common.py", line 1009, in _download_json_handle
    res = self._download_webpage_handle(
  File "/usr/local/bin/youtube-dl/yt_dlp/extractor/common.py", line 801, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query, expected_status=expected_status)
  File "/usr/local/bin/youtube-dl/yt_dlp/extractor/common.py", line 786, in _request_webpage
    raise ExtractorError(errmsg, cause=err)

  File "/usr/local/bin/youtube-dl/yt_dlp/extractor/common.py", line 768, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/local/bin/youtube-dl/yt_dlp/YoutubeDL.py", line 3600, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

The only solution I found for my specific problem is: wait 24 hours

I shared some suggestions to affected users and a batch-script code example to try to avoid this block in the future, here.

Thanks for reading.

I think there has to be more to this than a simple rate limit. I’m pretty sure my ISP-provided IP is not shared, or if it is, it’s a very small pool. Yet I am seeing this 403 problem consistently on the first attempt, for example for the Crunchyroll URL provided in the opening message of this issue. I have tried waiting several days between requests. I don’t do bulk downloads at all.

@vikkstar12 If they blocked my ip and the vpn’s ip, then why can I still connect to the site in a browser?

Hmm, this does not look exactly like the original code, but I assume it is more correct:

        bucket = traverse_obj(policy_response, ('cms_beta', 'bucket'))
        params = {
            'Policy': traverse_obj(policy_response, ('cms_beta', 'policy')),
            'Signature': traverse_obj(policy_response, ('cms_beta', 'signature')),
            'Key-Pair-Id': traverse_obj(policy_response, ('cms_beta', 'key_pair_id'))
        }

I don’t think we can expect the developer to try to help us.

Well, no. If you look at the major sites task list, you’ll see that the issue is recorded for future work. Speaking from the viewpoint of yt-dl, it’s just that it’s not widely reproducible, depends on black magic known only to a third party, and no easy and globally applicable solution has been found.

        bucket = traverse_obj(policy_response, ('cms_beta', 'bucket'))
        params = {
            'Policy': traverse_obj(policy_response, ('cms_beta', 'policy')),
            'Signature': traverse_obj(policy_response, ('cms_beta', 'signature')),
            'Key-Pair-Id': traverse_obj(policy_response, ('cms_beta', 'key_pair_id'))
        }

I applied this manually and it appears to be working. Hopefully the PR will be in soon 😃

@jvschiavo I think your problem is unrelated, because my error is it fails to even connect to the site AT ALL, even to scrape the webpage. It is very odd how I am the only one getting this error, even when using a vpn.

Sorry but it still doesn’t work.

$ yt-dlp -v https://www.crunchyroll.com/jojos-bizarre-adventure/episode-2-a-letter-from-the-past-652083
[debug] Command-line config: ['-v', 'https://www.crunchyroll.com/jojos-bizarre-adventure/episode-2-a-letter-from-the-past-652083']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, err utf-8, pref UTF-8
[debug] yt-dlp version 2021.12.01 [91f071af6]
[debug] Python version 3.9.9 (CPython 64bit) - Linux-5.15.6-arch2-1-x86_64-with-glibc2.33
[debug] exe versions: ffmpeg 4.4.1 (setts), ffprobe 4.4.1
[debug] Optional libraries: Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
[debug] [crunchyroll] Extracting URL: https://www.crunchyroll.com/jojos-bizarre-adventure/episode-2-a-letter-from-the-past-652083
[crunchyroll] 652083: Downloading webpage
ERROR: [crunchyroll] 652083: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp . Make sure you are using the latest version; see  https://github.com/yt-dlp/yt-dlp  on how to update. Be sure to call yt-dlp with the --verbose flag and include its complete output. (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp . Make sure you are using the latest version; see  https://github.com/yt-dlp/yt-dlp  on how to update. Be sure to call yt-dlp with the --verbose flag and include its complete output.
  File "/usr/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 715, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 3385, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.9/urllib/request.py", line 555, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 747, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)

I even tried using a vpn, and tried on a different machine, still the same output.