yt-dlp: Can't download TikTok video

Region

Russia

Description

Video link is available, but yt-dlp can’t download it: https://www.tiktok.com/@denidil6/video/7065799023130643713

Verbose log

D:\PROGRAMS\TikTok Download>!dl https://www.tiktok.com/@denidil6/video/7065799023130643713 -vU
[debug] Command-line config: ['https://www.tiktok.com/@denidil6/video/7065799023130643713', '-vU']
[debug] Encodings: locale cp1251, fs utf-8, out utf-8, err utf-8, pref cp1251
[debug] yt-dlp version 2022.04.08 [7884ade] (win_exe)
[debug] Python version 3.8.10 (CPython 64bit) - Windows-10-10.0.19044-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg n5.0.1-3-gb655beb025-20220419 (setts), ffprobe n5.0.1-3-gb655beb025-20220419
[debug] Optional libraries: brotli, certifi, Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
Latest version: 2022.04.08, Current version: 2022.04.08
yt-dlp is up to date (2022.04.08)
[debug] [TikTok] Extracting URL: https://www.tiktok.com/@denidil6/video/7065799023130643713
[TikTok] 7065799023130643713: Downloading video details
WARNING: [TikTok] 7065799023130643713: Video not available; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U; Retrying with feed workaround
[TikTok] 7065799023130643713: Downloading video feed
WARNING: [TikTok] 7065799023130643713: Unable to find video in feed; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U; Retrying with webpage
[TikTok] 7065799023130643713: Downloading webpage
ERROR: [TikTok] 7065799023130643713: Unable to download webpage: The read operation timed out (caused by timeout('The read operation timed out')); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
  File "yt_dlp\extractor\common.py", line 641, in extract
  File "yt_dlp\extractor\tiktok.py", line 520, in _real_extract
  File "yt_dlp\extractor\common.py", line 932, in _download_webpage
  File "yt_dlp\extractor\common.py", line 800, in _download_webpage_handle
  File "yt_dlp\extractor\common.py", line 785, in _request_webpage
  File "yt_dlp\extractor\common.py", line 767, in _request_webpage
  File "yt_dlp\YoutubeDL.py", line 3601, in urlopen
  File "urllib\request.py", line 525, in open
  File "urllib\request.py", line 542, in _open
  File "urllib\request.py", line 502, in _call_chain
  File "yt_dlp\utils.py", line 1543, in https_open
  File "urllib\request.py", line 1358, in do_open
  File "http\client.py", line 1344, in getresponse
  File "http\client.py", line 307, in begin
  File "http\client.py", line 268, in _read_status
  File "socket.py", line 669, in readinto
  File "ssl.py", line 1241, in recv_into
  File "ssl.py", line 1099, in read
socket.timeout: The read operation timed out

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

As you can see, the author of the PR is the maintainer of youtube-dl. So it will be merged into ytdl when he believes the code is good enough, and I will pull it into yt-dlp after that. Or, if you want to bypass ytdl altogether, you should make a PR directly to yt-dlp and address any reviews.

merge pls

Messages like this and https://github.com/ytdl-org/youtube-dl/pull/30479#issuecomment-1115007307 are only counter-productive. If you have nothing constructive to add to an issue, the best thing you can do is wait patiently.

Seems like my example video was removed from TikTok, so I can’t try it with the new version of yt-dlp.

@Neurotoxin001 Use a shorter filename

In this case the yt-dl PR had been lingering because of unwanted 403s and timeouts when pulling the video metadata from the page. That is now fixed by forcing every request that does not specify a User-Agent (UA) to send Mozilla/5.0. The yt-dlp extractor tries the API URLs first and then falls back to extraction from the page.
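The order described above (API details, then the feed workaround, then the webpage, which is also the order the verbose log shows) together with the default User-Agent can be sketched roughly as follows. This is a standalone illustration, not the actual yt-dlp or youtube-dl code: `with_default_user_agent`, `extract_with_fallbacks`, and the three stub fetchers are hypothetical names, and the stubs only simulate the kinds of failure seen in this issue.

```python
DEFAULT_UA = 'Mozilla/5.0'


def with_default_user_agent(headers=None):
    """Return a copy of headers, adding a User-Agent only if none was given."""
    merged = dict(headers or {})
    if not any(name.lower() == 'user-agent' for name in merged):
        merged['User-Agent'] = DEFAULT_UA
    return merged


def extract_with_fallbacks(video_id, strategies):
    """Try each (label, fetch) pair in order; return the first non-empty result."""
    errors = []
    for label, fetch in strategies:
        try:
            result = fetch(video_id)
        except Exception as exc:  # a real extractor would catch narrower errors
            errors.append(f'{label}: {exc}')
            continue
        if result:
            return label, result
        errors.append(f'{label}: no data')
    details = '; '.join(errors)
    raise RuntimeError(f'{video_id}: all strategies failed ({details})')


# Hypothetical stand-ins for the three steps visible in the log.
def fetch_api_details(video_id):
    raise LookupError('Video not available')


def fetch_feed(video_id):
    return None  # "Unable to find video in feed"


def fetch_webpage(video_id):
    headers = with_default_user_agent()
    return {'id': video_id, 'source': 'webpage', 'headers': headers}


STRATEGIES = [
    ('api', fetch_api_details),
    ('feed', fetch_feed),
    ('webpage', fetch_webpage),
]
```

Calling `extract_with_fallbacks('7065799023130643713', STRATEGIES)` walks the list in order and only raises once every strategy has failed, which mirrors the sequence of warnings followed by a final error in the log.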

In the yt-dlp version, the problem in this issue should be fixed by replacing the code that extracts the ‘sigi’ hydration JSON with a call to this method:

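    # Assumes `get_element_by_id` is imported from `..utils` at the top of tiktok.py.
    def _get_SIGI_STATE(self, video_id, html):
        # Prefer the element with id SIGI_STATE; otherwise fall back to the
        # older `sigi-persisted-data` script, whose JSON follows an `=`.
        state = self._parse_json(
            get_element_by_id('SIGI_STATE', html)
            or self._search_regex(
                r'''(?s)<script\s[^>]*?\bid\s*=\s*(?P<q>"|'|\b)sigi-persisted-data(?P=q)[^>]*>[^=]*=\s*(?P<json>{.+?})\s*(?:;[^<]+)?</script''',
                html, 'sigi data', default='{}', group='json'), video_id)
        # Guard against valid JSON that is not an object.
        return state if isinstance(state, dict) else {}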

One possible issue is that the yt-dlp get_element_by_id() uses HTML parsing whereas yt-dl uses regex, so different failure modes are possible. If the page breaks the parser, possible work-arounds include sanitising the page before parsing, or using a regex instead.
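As a rough sketch of the regex work-around, the function below extracts the state with only the standard library, outside any extractor class. It is an assumption-laden illustration rather than code from either project: `extract_sigi_state` and the `SIGI_STATE_RE` pattern are hypothetical (the latter adapted from the `sigi-persisted-data` regex above), and it assumes the `SIGI_STATE` script body is plain JSON.

```python
import json
import re

# Hypothetical adaptation: the body of <script id="SIGI_STATE"> is taken as JSON.
SIGI_STATE_RE = re.compile(
    r'''(?s)<script\s[^>]*?\bid\s*=\s*(?P<q>"|'|\b)SIGI_STATE(?P=q)[^>]*>(?P<json>.*?)</script''')

# Same pattern as in the method above, for the older page format.
SIGI_PERSISTED_RE = re.compile(
    r'''(?s)<script\s[^>]*?\bid\s*=\s*(?P<q>"|'|\b)sigi-persisted-data(?P=q)[^>]*>[^=]*=\s*(?P<json>{.+?})\s*(?:;[^<]+)?</script''')


def extract_sigi_state(html):
    """Return the hydration state as a dict, or {} if nothing usable is found."""
    for pattern in (SIGI_STATE_RE, SIGI_PERSISTED_RE):
        match = pattern.search(html)
        if not match:
            continue
        try:
            state = json.loads(match.group('json'))
        except json.JSONDecodeError:
            continue  # malformed payload: try the next page format
        if isinstance(state, dict):
            return state
    return {}
```

Trying both patterns in turn means the same call copes with either page format, at the cost of the usual fragility of matching HTML with regular expressions.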

When the sigi-persisted-data target above was added, tests showed that TikTok was sending both the previous page format and the ‘sigi’ format, perhaps depending on CDN or A/B testing. The same is probably true of the SIGI_STATE target.

@afterdelight Feel free to make a PR