youtube-dl: [Udemy] Unable to extract course id

Checklist

  • I’m reporting a broken site support issue
  • I’ve verified that I’m running youtube-dl version 2021.12.17
  • I’ve checked that all provided URLs are alive and playable in a browser
  • I’ve checked that all URLs and arguments with special characters are properly quoted or escaped
  • I’ve searched the bugtracker for similar bug reports including closed ones
  • I’ve read bugs section in FAQ

Verbose log

youtube-dl --cookies c:\udemy_cookies.txt -o 'E:/Udemy/%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/typescript-full/ --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--cookies', 'c:\\udemy_cookies.txt', '-o', 'E:/Udemy/%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s', 'https://www.udemy.com/typescript-full/', '--verbose']
[debug] Encodings: locale cp1251, fs utf-8, out utf-8, pref cp1251
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.9.6 (CPython) - Windows-10-10.0.19044-SP0
[debug] exe versions: none
[debug] Proxy map: {}
[udemy:course] typescript-full: Downloading webpage
ERROR: Unable to extract course id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "c:\python39\lib\site-packages\youtube_dl\YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "c:\python39\lib\site-packages\youtube_dl\YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
  File "c:\python39\lib\site-packages\youtube_dl\extractor\common.py", line 534, in extract
    ie_result = self._real_extract(url)
  File "c:\python39\lib\site-packages\youtube_dl\extractor\udemy.py", line 442, in _real_extract
    course_id, title = self._extract_course_info(webpage, course_path)
  File "c:\python39\lib\site-packages\youtube_dl\extractor\udemy.py", line 78, in _extract_course_info
    course_id = course.get('id') or self._search_regex(
  File "c:\python39\lib\site-packages\youtube_dl\extractor\common.py", line 1012, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract course id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

I am trying to download a paid course from udemy.com. When I try to use my login and password, I get this: ERROR: Unable to download webpage: HTTP Error 403: Forbidden

So I tried the --cookies option and got a different error (above). I exported udemy_cookies.txt from the browser with this Chrome extension: https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid
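
Before blaming the extractor, it can help to confirm that the exported file actually parses as a Netscape-format cookie file and contains cookies for udemy.com. The following is only a rough sketch using Python's standard http.cookiejar module; the helper name udemy_cookie_names, the sample file contents, and the cookie name access_token are made up for illustration and are not taken from a real Udemy session.

```python
import http.cookiejar
import os
import tempfile


def udemy_cookie_names(path):
    """Return the sorted names of cookies in a cookies.txt file that belong to udemy.com."""
    jar = http.cookiejar.MozillaCookieJar(path)
    # Raises http.cookiejar.LoadError if the file is not in Netscape format.
    jar.load(ignore_discard=True, ignore_expires=True)
    return sorted(c.name for c in jar if c.domain.endswith('udemy.com'))


# Hypothetical sample file; a real export would hold the browser's own cookies.
sample = (
    '# Netscape HTTP Cookie File\n'
    '.udemy.com\tTRUE\t/\tTRUE\t2147483647\taccess_token\tplaceholder\n'
    '.example.com\tTRUE\t/\tFALSE\t2147483647\tother\tvalue\n'
)

with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write(sample)
    sample_path = f.name

names = udemy_cookie_names(sample_path)
os.remove(sample_path)
print(names)
```

For the real file, the call would be udemy_cookie_names(r'c:\udemy_cookies.txt'). An empty result or a LoadError would point at the export step rather than at youtube-dl.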

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 16 (6 by maintainers)

Most upvoted comments

curl -b udemy_cookies.txt https://www.udemy.com/typescript-full/

Try curl -c udemy_cookies.txt "https://www.udemy.com/typescript-full/".

But your observation may be enough. This patch would find the course ID, though obviously there may be other changes if the course ID is being sent differently:

--- old/youtube-dl/youtube_dl/extractor/udemy.py
+++ new/youtube-dl/youtube_dl/extractor/udemy.py
@@ -77,8 +77,8 @@
             video_id, fatal=False) or {}
         course_id = course.get('id') or self._search_regex(
             [
-                r'data-course-id=["\'](\d+)',
-                r'"courseId"\s*:\s*(\d+)'
+                r'data-(?:clp-)?course-id\s*=\s*["\'](\d+)',
+                r'"courseId"\s*:\s*\[?(\d+)'
             ], webpage, 'course id')
         return course_id, course.get('title')
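
The effect of the two patched patterns can be checked outside youtube-dl with plain re. This is a minimal sketch, not the extractor's own code: find_course_id is a hypothetical stand-in for _search_regex, and the test strings are shortened from the markup quoted later in this thread. Note that in that markup the courseId key is HTML-escaped (&quot;) inside data-component-props, so here the second pattern only matches after html.unescape; whether the raw page the extractor receives contains an unescaped form is not established in this thread.

```python
import html
import re

# Patterns before and after the patch above.
OLD_PATTERNS = [
    r'data-course-id=["\'](\d+)',
    r'"courseId"\s*:\s*(\d+)',
]
NEW_PATTERNS = [
    r'data-(?:clp-)?course-id\s*=\s*["\'](\d+)',
    r'"courseId"\s*:\s*\[?(\d+)',
]


def find_course_id(patterns, webpage):
    """Return the first captured group of the first matching pattern, else None."""
    for pattern in patterns:
        match = re.search(pattern, webpage)
        if match:
            return match.group(1)
    return None


# Shortened from the <body> tag reported in this thread.
body_tag = ('<body id="udemy" data-clp-course-id="4412496" '
            'data-module-id="course-landing-page/udlite">')
# Shortened from the data-component-props attribute reported in this thread.
escaped_props = ('&quot;courseId&quot;:[4412496],'
                 '&quot;courseObject&quot;:{&quot;id&quot;:4412496}')

old_on_body = find_course_id(OLD_PATTERNS, body_tag)
new_on_body = find_course_id(NEW_PATTERNS, body_tag)
new_on_escaped = find_course_id(NEW_PATTERNS, escaped_props)
new_on_unescaped = find_course_id(NEW_PATTERNS, html.unescape(escaped_props))

print(old_on_body, new_on_body, new_on_escaped, new_on_unescaped)
```

On these samples the old patterns find nothing, while the new first pattern picks up the ID from data-clp-course-id, which is consistent with the reports that the patch works.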

Thank you @pashakiz and @dirkf. I was able to get it working as a result of your answers.

Great! It works now with this patch! Thank you!

But I looked at the page in the browser and found this (data-clp-course-id):

<body id="udemy" class="
    ud-app-loader ud-component--course-landing-page-udlite
  udemy " data-clp-course-id="4412496" data-module-id="course-landing-page/udlite" data-module-args="...">

and this (courseId):

<div class="clp-component-render"><div class="clp-component-render"><div class="ud-component--course-landing-page-udlite--purchase-body-container" data-component-props="{&quot;componentProps&quot;:{&quot;purchaseSection&quot;:{&quot;is_course_paid&quot;:true,&quot;has_subscription_offerings&quot;:false,&quot;subscription&quot;:null,&quot;style_full_lifetime_access&quot;:&quot;full-lifetime-access&quot;,&quot;style_money_back_guarantee&quot;:&quot;money-back-guarantee&quot;},&quot;purchaseInfo&quot;:{&quot;isValidStudent&quot;:false,&quot;purchaseDate&quot;:null},&quot;moneyBackGuarantee&quot;:{&quot;is_enabled&quot;:true},&quot;addToCart&quot;:{&quot;buyables&quot;:[{&quot;buyable_object_type&quot;:&quot;course&quot;,&quot;id&quot;:4412496,&quot;buyableContext&quot;:{&quot;contentLocaleId&quot;:null}}],&quot;onAddRedirectUrl&quot;:&quot;/cart/added/course/4412496/&quot;,&quot;addedButtonBsStyle&quot;:&quot;primary&quot;,&quot;is_enabled&quot;:true}},&quot;courseId&quot;:[4412496],&quot;courseObject&quot;:{&quot;id&quot;:4412496,&quot;is_private&quot;:false}}"><div data-unique-id="450" style="display:none"></div><div><div class="purchase-section-container-skeleton--price--3Xcfk purchase-section-container-skeleton--skeleton--1UsRE skeleton--skeleton--1jc5m"><div class="text-skeleton--text-skeleton--7BlWc skeleton--skeleton--1jc5m"><p><span class="text-skeleton--line--3Pla- block--block--1b0nE"></span><span class="text-skeleton--line--3Pla- block--block--1b0nE"></span></p><div class="skeleton--shine--2nD_V"></div></div><div class="skeleton--shine--2nD_V"></div></div><div class="purchase-section-container-skeleton--cta--jnShg purchase-section-container-skeleton--skeleton--1UsRE skeleton--skeleton--1jc5m"><span class="block--block--1b0nE"></span><div class="skeleton--shine--2nD_V"></div></div><div class="purchase-section-container-skeleton--money-back--3lqS1 purchase-section-container-skeleton--skeleton--1UsRE skeleton--skeleton--1jc5m"><span 
class="block--block--1b0nE"></span><div class="skeleton--shine--2nD_V"></div></div></div></div></div></div>
</div>

Will it help?