yt-dlp: Cannot download video from udemy

Hi there, I get a 403 error when downloading a course from Udemy Business. I used this command: yt-dlp -u udemy@abc.com -p password -P '~/Downloads' -o '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://abc.udemy.com/course/ielts-vocab-builder-002/ I'm sure that the username and password in the command are correct. Please see the log:

[debug] Command-line config: ['-u', 'PRIVATE', '-p', 'PRIVATE', '-P', "'~/Downloads'", '-o', "'%(playlist)s/%(chapter_number)s", '-', "%(chapter)s/%(title)s.%(ext)s'", 'https://abc.udemy.com/course/ielts-vocab-builder-002/', '-v']
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, pref cp1252
[debug] yt-dlp version 2021.09.25 (exe)
[debug] Python version 3.8.10 (CPython 64bit) - Windows-10-10.0.18363-SP0
[debug] exe versions: none
[debug] Optional libraries: Crypto, mutagen, sqlite, websockets
[debug] Proxy map: {}
[debug] [generic] Extracting URL: -
ERROR: [generic] '-' is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:-" ) to search YouTube
Traceback (most recent call last):
  File "yt_dlp\extractor\common.py", line 585, in extract
  File "yt_dlp\extractor\generic.py", line 2490, in _real_extract
yt_dlp.utils.ExtractorError: '-' is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:-" ) to search YouTube
Traceback (most recent call last):
  File "yt_dlp\extractor\common.py", line 585, in extract
  File "yt_dlp\extractor\generic.py", line 2490, in _real_extract
yt_dlp.utils.ExtractorError: '-' is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:-" ) to search YouTube

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 1227, in wrapper
  File "yt_dlp\YoutubeDL.py", line 1252, in __extract_info
  File "yt_dlp\extractor\common.py", line 601, in extract
yt_dlp.utils.ExtractorError: [generic] '-' is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:-" ) to search YouTube

[debug] [generic] Extracting URL: %(chapter)s/%(title)s.%(ext)s'
ERROR: [generic] "%(chapter)s/%(title)s.%(ext)s'" is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:%(chapter)s/%(title)s.%(ext)s'" ) to search YouTube
Traceback (most recent call last):
  File "yt_dlp\extractor\common.py", line 585, in extract
  File "yt_dlp\extractor\generic.py", line 2490, in _real_extract
yt_dlp.utils.ExtractorError: "%(chapter)s/%(title)s.%(ext)s'" is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:%(chapter)s/%(title)s.%(ext)s'" ) to search YouTube
Traceback (most recent call last):
  File "yt_dlp\extractor\common.py", line 585, in extract
  File "yt_dlp\extractor\generic.py", line 2490, in _real_extract
yt_dlp.utils.ExtractorError: "%(chapter)s/%(title)s.%(ext)s'" is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:%(chapter)s/%(title)s.%(ext)s'" ) to search YouTube

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 1227, in wrapper
  File "yt_dlp\YoutubeDL.py", line 1252, in __extract_info
  File "yt_dlp\extractor\common.py", line 601, in extract
yt_dlp.utils.ExtractorError: [generic] "%(chapter)s/%(title)s.%(ext)s'" is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:%(chapter)s/%(title)s.%(ext)s'" ) to search YouTube

[udemy:course] Downloading login popup
ERROR: [udemy:course] course: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp . Make sure you are using the latest version; see  https://github.com/yt-dlp/yt-dlp  on how to update. Be sure to call yt-dlp with the --verbose flag and include its complete output.
  File "yt_dlp\extractor\common.py", line 694, in _request_webpage
  File "yt_dlp\YoutubeDL.py", line 3256, in urlopen
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 569, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 649, in http_error_default

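A side note on the first two errors in this log: the "Command-line config" line shows the single-quoted output template arriving as three separate arguments, with the quote characters still attached. That is consistent with a Windows shell that does not treat single quotes as quoting. The sketch below only illustrates the difference; splitting on whitespace is an approximation of what cmd.exe does, not its real parsing rules, and the variable names are made up for the example.

```python
import shlex

# The output-template option exactly as typed in the command above.
typed = "-o '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s'"

# A POSIX shell treats single quotes as quoting, so the template
# stays a single argument.
posix_args = shlex.split(typed, posix=True)

# Approximation of a shell that ignores single quotes: the text is
# split at every space and the quote characters are kept.
windows_like_args = typed.split()

print(posix_args)
print(windows_like_args)
```

In the second list, '-' and the tail of the template are standalone arguments, which yt-dlp then tries to treat as URLs. Using double quotes around the template on Windows avoids this, but it does not address the 403 itself.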
About this issue

  • State: open
  • Created 3 years ago
  • Reactions: 16
  • Comments: 28 (3 by maintainers)

Most upvoted comments

I was just playing around with this and found a simple work-around, namely --cookies-from-browser firefox

I don't know whether firefox specifically is a necessary argument, but I think the Cloudflare issue must be triggered by some difference between --cookies-from-browser and --cookies. It seems plausible that Udemy may be directly parsing the cookies file for the header lines:

# Netscape HTTP Cookie File
# This file is generated by yt-dlp.  Do not edit.

yt-dlp  --user-agent 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:105.0) Gecko/20100101 Firefox/105.0' --cookies-from-browser firefox -P . -o "%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s" "https://www.udemy.com/course/docker-mastery"
[Cookies] Extracting cookies from firefox
[Cookies] Extracted 1909 cookies from firefox
[udemy:course] Extracting URL: https://www.udemy.com/course/docker-mastery
[udemy:course] course: Downloading webpage
[udemy:course] 1035000: Downloading course curriculum
[download] Downloading playlist: 1035000
[udemy:course] Playlist 1035000: Downloading 170 items of 170
[download] Downloading item 1 of 170
[udemy] Extracting URL: https://www.udemy.com/course/learn/v4/t/lecture/32367182#__youtubedl_smuggle=%7B%22course_id%22%3A+%221035000%22%7D
[udemy] 32367182: Downloading lecture JSON
[udemy] 41522200: Downloading m3u8 information
[info] 41522200: Downloading 1 format(s): hls-3215
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 99
[download] Destination: 1035000\1 - Quick Start!\What is Docker in 2022? The Three Innovations.mp4
[download] 100% of  227.58MiB in 00:01:07 at 3.35MiB/s
[FixupM3u8] Fixing MPEG-TS in MP4 container of "1035000\1 - Quick Start!\What is Docker in 2022? The Three Innovations.mp4"
[download] Downloading item 2 of 170
[udemy] Extracting URL: https://www.udemy.com/course/learn/v4/t/lecture/32367184#__youtubedl_smuggle=%7B%22course_id%22%3A+%221035000%22%7D
[udemy] 32367184: Downloading lecture JSON
[udemy] 41522954: Downloading m3u8 information
[info] 41522954: Downloading 1 format(s): hls-2580
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 109
[...]

I have not tried to modify the netscape-formatted cookies file to see if it changes behavior, just a shot in the dark.
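For anyone scripting this instead of using the command line, here is a rough sketch of the same work-around through yt-dlp's Python API. The option keys (cookiesfrombrowser, http_headers, paths, outtmpl) are my understanding of the YoutubeDL parameters that correspond to the flags above; build_opts and download are hypothetical helper names, and none of this has been run against Udemy.

```python
USER_AGENT = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:105.0) '
              'Gecko/20100101 Firefox/105.0')


def build_opts(browser='firefox'):
    """Return YoutubeDL options mirroring the command line above."""
    return {
        # --cookies-from-browser firefox
        'cookiesfrombrowser': (browser,),
        # --user-agent '...'
        'http_headers': {'User-Agent': USER_AGENT},
        # -P .
        'paths': {'home': '.'},
        # -o "..."
        'outtmpl': '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s',
    }


def download(url, browser='firefox'):
    """Download url with cookies taken from the given browser."""
    # Imported here so build_opts() can be inspected without yt-dlp installed.
    import yt_dlp
    with yt_dlp.YoutubeDL(build_opts(browser)) as ydl:
        return ydl.download([url])
```

Calling download("https://www.udemy.com/course/docker-mastery") would be the equivalent of the command above, assuming the browser profile is already logged in to Udemy.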

Account shared. Answer from pukkandan: Cloudflare captcha is causing the issue. Will need to investigate further to find a solution

The workaround that @gamedazed provided a few days ago does not work for me. Also, it seems that Udemy Business is using AWS CloudFront now. It stopped working all of a sudden around May 14th / May 15th (relative to sun).

I want to watch Udemy videos in mpv for studying and note-taking. I tested 2022.06.29 and found some issues and workarounds.

  1. Login check logic needs to be updated in udemy.py:L168-L171
        def is_logged(webpage):
            return any(re.search(p, webpage) for p in (
                r'href=["\'](?:https://www\.udemy\.com)?/user/logout/',
                r'>Logout<',
                r'"is_authenticated":true', # added
                fr'"email":"{username}"')) # added

Udemy's login popup has changed, so this check needs to be updated. But after a few successful logins, I couldn't log in anymore: Udemy just returned the login popup HTML. I don't know why. Maybe a Udemy restriction?

  2. Cloudflare's CAPTCHA page has changed. yt-dlp needs to be updated in udemy.py:L132-L139
        if any(p in webpage for p in (
                '>Please verify you are a human',
                'Access to this page has been denied because we believe you are using automation tools to browse the website',
                '"_pxCaptcha"',
                'cf-captcha-container')): # added
            raise ExtractorError(
                'Udemy asks you to solve a CAPTCHA. Login with browser, '
                'solve CAPTCHA, then export cookies and pass cookie file to '
                'yt-dlp with --cookies.', expected=True)
  3. The course id is wrong in udemy.py:L205
    def _real_extract(self, url):
        lecture_id = self._match_id(url)

        webpage = self._download_webpage(url, lecture_id)

        course_id, _ = self._extract_course_info(webpage, lecture_id)
        #course_id = "3833504" # If I pass correct course id, it would get correct lecture info.

        try:
            lecture = self._download_lecture(course_id, lecture_id)

_real_extract passes the wrong course_id to _download_lecture, so it always gets 403 Forbidden. If course_id is correct, yt-dlp downloads the video properly. I wanted to make a PR, but I couldn't fix issue 3 because I couldn't pass the playlist URL, which has the real course_id, to _real_extract.
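The first two checks above can be tried outside yt-dlp with a small standalone sketch. The marker strings are copied from the snippets in this comment; the function looks_like_captcha, the two-argument form of is_logged and the re.escape call are additions for illustration, not what udemy.py contains. Escaping the username matters because characters such as "." or "+" in an e-mail address would otherwise be read as regex syntax.

```python
import re

# Patterns from the proposed login check above.
LOGGED_IN_PATTERNS = (
    r'href=["\'](?:https://www\.udemy\.com)?/user/logout/',
    r'>Logout<',
    r'"is_authenticated":true',
)

# Marker strings from the proposed CAPTCHA check above.
CAPTCHA_MARKERS = (
    '>Please verify you are a human',
    'Access to this page has been denied because we believe you are '
    'using automation tools to browse the website',
    '"_pxCaptcha"',
    'cf-captcha-container',
)


def is_logged(webpage, username):
    """Return True if the page looks like a logged-in session."""
    # Escape the username so it is matched literally.
    email_pattern = r'"email":"%s"' % re.escape(username)
    patterns = LOGGED_IN_PATTERNS + (email_pattern,)
    return any(re.search(p, webpage) for p in patterns)


def looks_like_captcha(webpage):
    """Return True if the page contains any known CAPTCHA marker."""
    return any(marker in webpage for marker in CAPTCHA_MARKERS)
```

Feeding a saved copy of the login popup HTML to these two functions is a quick way to see which branch a given response would hit, without going through a full extraction.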

Same issue using version 2022.05.18:

$ yt-dlp --verbose -u 'xxx@gmail.com' -p 'secret' -P './' -o'%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' 'https://www.udemy.com/course/docker-kubernetes-the-practical-guide/'
[debug] Command-line config: ['--verbose', '-u', 'PRIVATE', '-p', 'PRIVATE', '-P', './', '-o', '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s', 'https://www.udemy.com/course/docker-kubernetes-the-practical-guide/']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.05.18 [b14d52355] (source)
[debug] Lazy loading extractors is disabled
[debug] Plugins: ['SamplePluginIE', 'SamplePluginPP']
[debug] Git HEAD: 926ccc84e
[debug] Python version 3.8.10 (CPython 64bit) - Linux-5.4.0-1058-raspi-aarch64-with-glibc2.29
[debug] Checking exe version: ffprobe -bsfs
[debug] Checking exe version: ffmpeg -bsfs
[debug] exe versions: ffmpeg 4.2.7, ffprobe 4.2.7
[debug] Optional libraries: Cryptodome-3.14.1, brotli-1.0.9, certifi-2019.11.28, mutagen-1.45.1, secretstorage-2.3.1, sqlite3-2.6.0, websockets-10.2
[debug] Proxy map: {}
[udemy:course] Downloading login popup
ERROR: [udemy:course] course: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issuetemplate. Confirm you are on the latest version using  yt-dlp -U
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 640, in extract
    self.initialize()
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 545, in initialize
    self._perform_login(username, password)
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/udemy.py", line 169, in _perform_login
    login_popup = self._download_webpage(
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 933, in _download_webpage
    res = self._download_webpage_handle(
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/udemy.py", line 131, in _download_webpage_handle
    ret = super(UdemyIE, self)._download_webpage_handle(
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 801, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query, expected_status=expected_status)
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 786, in _request_webpage
    raise ExtractorError(errmsg, cause=err)

  File "/home/johnroyer/devel/yt-dlp/yt_dlp/extractor/common.py", line 768, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/home/johnroyer/devel/yt-dlp/yt_dlp/YoutubeDL.py", line 3596, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

With v2021.10.22 I only get the "course: Unable to download webpage" error:

+$ yt-dlp -u user -p password https://companyname.udemy.com/course/learn-docker --verbose
[debug] Command-line config: ['-u', 'PRIVATE', '-p', 'PRIVATE', 'https://companyname.udemy.com/course/learn-docker', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, err utf-8, pref UTF-8
[debug] yt-dlp version 2021.10.22
[debug] Lazy loading extractors is disabled
[debug] Python version 3.9.7 (CPython 64bit) - Linux-5.10.79-1-lts-x86_64-with-glibc2.33
[debug] exe versions: ffmpeg 4.4.1 (setts), ffprobe 4.4.1
[debug] Optional libraries: Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
[udemy:course] Downloading login popup
ERROR: [udemy:course] course: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on  https://github.com/yt-dlp/yt-dlp . Make sure you are using the latest version; see  https://github.com/yt-dlp/yt-dlp  on how to update. Be sure to call yt-dlp with the --verbose flag and include its complete output.
  File "/usr/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 712, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 3287, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)