yt-dlp: Can't download tiktok video

Checklist

I’m reporting a broken site
I’ve verified that I’m running yt-dlp version 2022.04.08 (update instructions) or later (specify commit)
I’ve checked that all provided URLs are alive and playable in a browser
I’ve checked that all URLs and arguments with special characters are properly quoted or escaped
I’ve searched the bugtracker for similar issues including closed ones. DO NOT post duplicates
I’ve read the guidelines for opening an issue
I’ve read about sharing account credentials and I’m willing to share it if required

Region

Russia

Description

Video link is available, but yt-dlp can’t download it: https://www.tiktok.com/@denidil6/video/7065799023130643713

Verbose log

D:\PROGRAMS\TikTok Download>!dl https://www.tiktok.com/@denidil6/video/7065799023130643713 -vU
[debug] Command-line config: ['https://www.tiktok.com/@denidil6/video/7065799023130643713', '-vU']
[debug] Encodings: locale cp1251, fs utf-8, out utf-8, err utf-8, pref cp1251
[debug] yt-dlp version 2022.04.08 [7884ade] (win_exe)
[debug] Python version 3.8.10 (CPython 64bit) - Windows-10-10.0.19044-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg n5.0.1-3-gb655beb025-20220419 (setts), ffprobe n5.0.1-3-gb655beb025-20220419
[debug] Optional libraries: brotli, certifi, Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
Latest version: 2022.04.08, Current version: 2022.04.08
yt-dlp is up to date (2022.04.08)
[debug] [TikTok] Extracting URL: https://www.tiktok.com/@denidil6/video/7065799023130643713
[TikTok] 7065799023130643713: Downloading video details
WARNING: [TikTok] 7065799023130643713: Video not available; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U; Retrying with feed workaround
[TikTok] 7065799023130643713: Downloading video feed
WARNING: [TikTok] 7065799023130643713: Unable to find video in feed; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U; Retrying with webpage
[TikTok] 7065799023130643713: Downloading webpage
ERROR: [TikTok] 7065799023130643713: Unable to download webpage: The read operation timed out (caused by timeout('The read operation timed out')); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
  File "yt_dlp\extractor\common.py", line 641, in extract
  File "yt_dlp\extractor\tiktok.py", line 520, in _real_extract
  File "yt_dlp\extractor\common.py", line 932, in _download_webpage
  File "yt_dlp\extractor\common.py", line 800, in _download_webpage_handle
  File "yt_dlp\extractor\common.py", line 785, in _request_webpage

  File "yt_dlp\extractor\common.py", line 767, in _request_webpage
  File "yt_dlp\YoutubeDL.py", line 3601, in urlopen
  File "urllib\request.py", line 525, in open
  File "urllib\request.py", line 542, in _open
  File "urllib\request.py", line 502, in _call_chain
  File "yt_dlp\utils.py", line 1543, in https_open
  File "urllib\request.py", line 1358, in do_open
  File "http\client.py", line 1344, in getresponse
  File "http\client.py", line 307, in begin
  File "http\client.py", line 268, in _read_status
  File "socket.py", line 669, in readinto
  File "ssl.py", line 1241, in recv_into
  File "ssl.py", line 1099, in read
socket.timeout: The read operation timed out

About this issue

State: closed
Created 2 years ago
Comments: 15 (7 by maintainers)

Most upvoted comments

You can see the author of the PR is the maintainer of youtube-dl. So it will be merged in ytdl when he believes the code is good enough to be merged, and I will pull it to yt-dlp after that. Or, if you want to bypass ytdl altogether, you should make a PR directly to yt-dlp and address any reviews.

merge pls

Messages like this and https://github.com/ytdl-org/youtube-dl/pull/30479#issuecomment-1115007307 are only counter-productive. If you have nothing constructive to add to an issue, the best thing you can do is wait patiently.

pukkandan on May 2, 2022

It seems my example video was removed from TikTok, so I can’t test it with the new version of yt-dlp.

Neurotoxin001 on May 27, 2022

@Neurotoxin001 Use a shorter filename

pukkandan on May 11, 2022

In this case the yt-dl PR had been lingering because of unwanted 403s and timeouts when pulling the video metadata from the page, now fixed by forcing all unspecified UAs to Mozilla/5.0. The yt-dlp extractor uses API URLs first and then falls back to extraction from the page.
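The "default to Mozilla/5.0 when no UA is specified" idea can be sketched with the standard library alone. This is only an illustration, not the code from the PR: `build_request` and `DEFAULT_USER_AGENT` are hypothetical names, and the real fix lives inside the extractor framework.

```python
import urllib.request

# Value named in the comment above; the constant name is hypothetical.
DEFAULT_USER_AGENT = 'Mozilla/5.0'


def build_request(url, headers=None):
    """Build a Request, adding a default User-Agent only if the caller set none."""
    merged = dict(headers or {})
    # Header names are case-insensitive, so compare them in lower case.
    if not any(name.lower() == 'user-agent' for name in merged):
        merged['User-Agent'] = DEFAULT_USER_AGENT
    return urllib.request.Request(url, headers=merged)
```

The resulting object can be passed to `urllib.request.urlopen(req, timeout=...)`; an explicit timeout makes a stalled read fail quickly instead of hanging.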

In the yt-dlp version the problem in this issue should be fixed by replacing the code that extracts the ‘sigi’ hydration JSON by a call to this method:

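    # Assumes get_element_by_id is imported from yt-dlp's utils module.
    def _get_SIGI_STATE(self, video_id, html):
        # Prefer the element with id="SIGI_STATE"; otherwise fall back to the
        # older <script id="sigi-persisted-data"> assignment, defaulting to '{}'.
        state = self._parse_json(
            get_element_by_id('SIGI_STATE', html)
            or self._search_regex(
                r'''(?s)<script\s[^>]*?\bid\s*=\s*(?P<q>"|'|\b)sigi-persisted-data(?P=q)[^>]*>[^=]*=\s*(?P<json>{.+?})\s*(?:;[^<]+)?</script''',
                html, 'sigi data', default='{}', group='json'), video_id)
        # Anything other than a JSON object is treated as "no data".
        return state if isinstance(state, dict) else {}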

One possible issue is that the yt-dlp get_element_by_id() uses HTML parsing whereas yt-dl uses regex, so different failure modes are possible. If the page breaks the parser, possible work-arounds include sanitising the page before parsing, or using a regex instead.
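As a rough illustration of the regex-only work-around, the sketch below pulls the same JSON out of a page with `re` and `json` from the standard library. It is not yt-dlp or youtube-dl code: `extract_sigi_state` and both pattern names are hypothetical, and the `SIGI_STATE` pattern assumes the script body is bare JSON.

```python
import json
import re

# Hypothetical patterns for illustration only.
# Assumes <script id="SIGI_STATE" ...> holds bare JSON as its body.
_SIGI_STATE_RE = re.compile(
    r'''(?s)<script\s[^>]*?\bid\s*=\s*(?P<q>"|'|\b)SIGI_STATE(?P=q)[^>]*>(?P<json>.*?)</script''')
# Older format: <script id="sigi-persisted-data">window[...] = {...};...</script>
_SIGI_PERSISTED_RE = re.compile(
    r'''(?s)<script\s[^>]*?\bid\s*=\s*(?P<q>"|'|\b)sigi-persisted-data(?P=q)[^>]*>[^=]*=\s*(?P<json>{.+?})\s*(?:;[^<]+)?</script''')


def extract_sigi_state(html):
    """Return the hydration data as a dict, or {} if it is missing or unparsable."""
    for pattern in (_SIGI_STATE_RE, _SIGI_PERSISTED_RE):
        match = pattern.search(html)
        if not match:
            continue
        try:
            state = json.loads(match.group('json'))
        except ValueError:
            # Malformed JSON: try the next page format instead of failing.
            continue
        if isinstance(state, dict):
            return state
    return {}
```

Because no HTML parser is involved, malformed markup elsewhere in the page cannot break the lookup; the trade-off is that the patterns depend on the exact shape of the script tag.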

When the sigi-persisted-data target above was added, tests showed that TikTok was sending both the previous page format and the ‘sigi’ format, perhaps depending on CDN or A/B testing. Probably the same is true of the SIGI_STATE target.

dirkf on May 2, 2022

@afterdelight Feel free to make a PR

pukkandan on May 2, 2022