yt-dlp: [tiktok:user] Failed to parse JSON
Checklist
- I’m reporting a broken site
- I’ve verified that I’m running yt-dlp version 2022.05.18 (update instructions) or later (specify commit)
- I’ve checked that all provided URLs are playable in a browser with the same IP and same login details
- I’ve checked that all URLs and arguments with special characters are properly quoted or escaped
- I’ve searched the bugtracker for similar issues including closed ones. DO NOT post duplicates
- I’ve read the guidelines for opening an issue
- I’ve read about sharing account credentials and I’m willing to share it if required
Region
USA
Description
Starting earlier today, tiktok user pages started timing out. Downloading an individual video still works, but user pages don’t.
For example (in the log below), yt-dlp.sh "https://www.tiktok.com/@derekbrunsonmma" -vU
times out, but yt-dlp.sh "https://www.tiktok.com/@derekbrunsonmma/video/7098932076711284014" -vU
works fine.
Verbose log
$ ~/yt-dlp/yt-dlp.sh "https://www.tiktok.com/@derekbrunsonmma" -vU
[debug] Command-line config: ['https://www.tiktok.com/@derekbrunsonmma', '-vU']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.05.18 [b14d52355] (source)
[debug] Lazy loading extractors is disabled
[debug] Plugins: ['SamplePluginIE', 'SamplePluginPP']
[debug] Git HEAD: 926ccc84e
[debug] Python version 3.8.10 (CPython 64bit) - Linux-5.4.0-1029-aws-x86_64-with-glibc2.29
[debug] Checking exe version: ffprobe -bsfs
[debug] Checking exe version: ffmpeg -bsfs
[debug] exe versions: ffmpeg 4.2.4, ffprobe 4.2.4
[debug] Optional libraries: Cryptodome-3.13.0, certifi-2019.11.28, mutagen-1.45.1, secretstorage-2.3.1, sqlite3-2.6.0, websockets-10.1
[debug] Proxy map: {}
Latest version: 2022.05.18, Current version: 2022.05.18
yt-dlp is up to date (2022.05.18)
[debug] [tiktok:user] Extracting URL: https://www.tiktok.com/@derekbrunsonmma
[tiktok:user] derekbrunsonmma: Downloading webpage
ERROR: [tiktok:user] derekbrunsonmma: Unable to download webpage: The read operation timed out (caused by timeout('The read operation timed out')); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
  File "/home/ubuntu/yt-dlp/yt_dlp/extractor/common.py", line 642, in extract
    ie_result = self._real_extract(url)
  File "/home/ubuntu/yt-dlp/yt_dlp/extractor/tiktok.py", line 629, in _real_extract
    webpage = self._download_webpage(url, user_name, headers={
  File "/home/ubuntu/yt-dlp/yt_dlp/extractor/common.py", line 933, in _download_webpage
    res = self._download_webpage_handle(
  File "/home/ubuntu/yt-dlp/yt_dlp/extractor/common.py", line 801, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query, expected_status=expected_status)
  File "/home/ubuntu/yt-dlp/yt_dlp/extractor/common.py", line 786, in _request_webpage
    raise ExtractorError(errmsg, cause=err)
  File "/home/ubuntu/yt-dlp/yt_dlp/extractor/common.py", line 768, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/home/ubuntu/yt-dlp/yt_dlp/YoutubeDL.py", line 3596, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/usr/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/home/ubuntu/yt-dlp/yt_dlp/utils.py", line 1419, in https_open
    return self.do_open(
  File "/usr/lib/python3.8/urllib/request.py", line 1358, in do_open
    r = h.getresponse()
  File "/usr/lib/python3.8/http/client.py", line 1348, in getresponse
    response.begin()
  File "/usr/lib/python3.8/http/client.py", line 316, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.8/http/client.py", line 277, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
About this issue
- State: open
- Created 2 years ago
- Reactions: 21
- Comments: 135 (46 by maintainers)
For those looking for a dumb manual workaround to just grab a set of videos from a user page, this JS oneliner should copy all the loaded posts straight to your clipboard, ready to run:
Add your preferred command line opts after yt-dlp (for example, (a => "yt-dlp --verbose " + a.href)).
Edit to avoid further spamming this issue: You can use this code by going to a TikTok user page, scrolling until all the videos you want are visible, pressing F12, going to the Console tab, pasting the above code, and pressing Enter. Tested on Chrome and Firefox.
Can confirm this report. Additionally, vm.tiktok.com URLs also time out.
For me, pasting
copy(Array.from(document.querySelectorAll(".e19c29qe7")).map((x) => x.children[0].children[0].children[0].children[0].href).join("\n"))
into the developer console in Chromium (hit F12 to bring it up) copies all TikToks on someone’s page to the clipboard (scroll down someone’s page to load more). Pasting that into a file, say urls.txt, and then running yt-dlp -a urls.txt works to download them all. The magic e19c29qe7 class name may change as TikTok updates their webpages, but by inspecting the page with ctrl+shift+c and clicking a TikTok to see its HTML/CSS, one can update that.
Go to redraskal’s fork of yt-dlp, and then either git clone it and check out the fix/tiktok-user branch, or select the same branch from the web interface and download the ZIP of that. From there, compile yt-dlp for your OS using the README.md instructions, and you’ll have an executable you can use on the command line like usual, except with redraskal’s patch applied.
I should add that I’ve tried this from multiple machines with different IPs, so it’s not that I’ve personally been blocked.
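For the urls.txt workaround described earlier, the clipboard dump can contain blank lines, duplicates, or links with tracking parameters. The following is a minimal sketch, not part of yt-dlp, that tidies such a dump before it is passed to yt-dlp -a. The function names and the URL pattern are assumptions made for illustration; TikTok may change its URL layout.

```python
import re

# Assumed shape of a TikTok video link, based on the URLs quoted in this issue.
VIDEO_URL = re.compile(r"^https://www\.tiktok\.com/@[^/\s]+/video/\d+")


def clean_urls(clipboard_text):
    """Return unique video URLs in their original order, query strings removed."""
    seen = set()
    urls = []
    for line in clipboard_text.splitlines():
        match = VIDEO_URL.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything that is not a video link
        url = match.group(0)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def write_batch_file(urls, path="urls.txt"):
    """Write one URL per line, the layout that yt-dlp -a reads."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(urls) + "\n")
```

After calling write_batch_file(clean_urls(text)), running yt-dlp -a urls.txt proceeds as in the comment above.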
It’s working again.
Can also confirm this report. Same error.
I created a working user extraction 😃
Looks like tests are not passing, but it works with my own personal tests. Anyone want to test this?
Tests now pass
We know it is broken. Please stop saying “me too”!
The issue seems similar to https://github.com/yt-dlp/yt-dlp/issues/2135. It may be worth looking into more app versions
please fix
yt-dlp https://www.tiktok.com/@t_k7276
[tiktok:user] t_k7276: Downloading webpage
ERROR: [tiktok:user] t_k7276: Unable to download webpage: The read operation timed out (caused by TimeoutError('The read operation timed out')); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
Can confirm, same error.
Users already have a web browser that’s at least 10x too big (cf. Netscape 9 for Windows, <6MB download, 2007) and only yt-dlp users who are dedicated TT/DY fiends are likely to want another one for this specific application. Even Node.js is a small fraction of 150MB. But perhaps this is an olde worlde concern when users may be regularly downloading GBs that are immediately discarded.
I wondered if there was a lower overhead solution using https://chromedevtools.github.io/devtools-protocol/ but apparently that’s what pyppeteer uses anyway.
Accepting that this pyppeteer/chromium solution effectively resolves the problem in the issue and is the only successful approach found despite many other attempts, it seems like a setback for the yt-dl[p] approach as well as for the open Web in general. We can’t back-port such a solution to yt-dl, whose use cases include overcoming deficiencies of, or actual lack of, a platform web browser.
Right, it did for me. To clarify, I meant I think it’d be a good idea to at least give users a heads up that using this particular extractor downloads a 150 MiB binary to your system, since some users’ devices may not have much storage space available, that single binary is roughly 3x the size of the entire yt-dlp repo, some users might have internet/data plans with a quota, etc. Maybe this is just overly cautious on my part, but I’m not sure.
I have same error on vm.tiktok
@snazss Don’t talk about something unrelated to the issue here
Noticed it today also, but I still can download videos. But only with “feed” method:
I think it’s worth looking at, I’m going to try the same with my iPhone
I don’t have a Pixel 4, but I do have a moto g power and access to packet-capturing software, so since ja3 fingerprints are based on certain elements of the client hello packet, I imagine it wouldn’t hurt to run a pcap on my phone while using the TikTok app and see if I can replicate successful sessions based on that data. Since TikTok has a web client in addition to the mobile apps (not quite as featureful, though), that may be another vector for getting at this stuff, even if ja3 doesn’t end up being the specific method they’re using. I recognize this could be a lot of work just to get one feature of one extractor up and running again, but I think it’s still worth taking a crack at. Thoughts?
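To make the ja3 idea above concrete: as publicly documented, a JA3 fingerprint is the MD5 hash of five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats), with fields joined by commas and the values inside each field joined by dashes. The sketch below assumes those fields have already been pulled out of a packet capture. The numbers in the usage note are illustrative, not captured from the TikTok app, and the GREASE-value filtering that real implementations perform is left out.

```python
import hashlib


def ja3_string(tls_version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string: comma-separated fields, dash-separated values."""
    fields = [[tls_version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(value) for value in field) for field in fields)


def ja3_hash(ja3):
    """The fingerprint itself is the MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()
```

For example, ja3_string(769, [47, 53], [0, 10, 11], [23, 24], [0]) yields "769,47-53,0-10-11,23-24,0". Comparing the hash from a phone capture against the hash of yt-dlp's own handshake would show whether the two clients look different at this layer.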
Having the exact same problem; it had been working for the past few weeks
Summary of the situation:
The issues can be separated into a few parts:
#3551 Fixed
Possible solutions:
- https://github.com/ytdl-org/youtube-dl/pull/30479 should fix (1), (5) for youtube-dl. It’s broken now
- #3624 should fix (5), but the PR was abandoned. Probably broken
- https://github.com/yt-dlp/yt-dlp/compare/master...pukkandan:features/tiktok is my attempt at bringing https://github.com/ytdl-org/youtube-dl/pull/30479 to yt-dlp, but I am unable to test it due to (4). If anyone can confirm it fixes (1) and (5), it can be merged as a partial fix. broken
PS: It’s been a while since I looked at this issue, correct me if I’m wrong on something
Unless you have something useful to add to this, please refrain from adding noise to the issue
Update:
https://github.com/ytdl-org/youtube-dl/pull/30479 is completely broken now, so we are back to square 1 except for (5)
There must be some discrimination at TT. Here:
And Py3.9 (same for 3.5):
If the _real_initialize() fails, the later requests will behave as you saw. Fetching the home page should set some state (cookies IIRC) that let the later requests fetch the actual pages instead of being sent to captcha hell.
Same issue, the download works fine for desktop links, but vm.tiktok.com and the new www.tiktok.com/t/some-id links that come from mobile time out completely, even with good headers
In that example, you are downloading an individual video…which OP reported still works. It is profiles and vm links that are not working.
Try --add-header 'user-agent:Mozilla/5.0' (aka --user-agent 'Mozilla/5.0').
Actual code changes are needed for the shortcut (and profile) URLs as the redirect extractor may not see the custom UA. For example, try this yt-dl PR, which currently finds the first 30 videos, and can download at least the first 10kB of at least the first item, with the problem URL, but says
Also had to add pyppeteer as a dependency on pyinst.py line 83 so pyinstaller can include this package in the executable. The updated extractor works perfectly btw, great work!
It’s so much more realistic to use pyppeteer to sign requests. One change to the signing method could set back the extractor weeks if we went ahead with reverse engineering it. It’s only used for API requests that are needed to be signed (not bloated web pages). The requests by pyppeteer are actually faster than the requests for video information.
TikTok is just not wanting their stuff openly accessible.
We can rewrite tls (to emulate browsers), as well as reverse engineer the tiktok signer but that would take a long time and does not seem worth it when we could throw on pyppeteer for this one use case.
Successfully downloaded a whole user account – nice work! If merged, I think it’d be a good idea to make a note of the pyppeteer dependency, and its possible need to grab a 150 MiB Chromium binary to do the extraction, in the README.md.
The one lingering question I have (and please forgive me if I’m missing something obvious here) is how to specify format for the whole-user download – for example, how to get 720p all the time. It looks like the format ID changes for each individual video, but there might be a workaround I’m missing here.
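On the format question: since the format IDs differ from video to video, one option is to select by property rather than by ID. The sketch below assumes yt-dlp's format_sort embedding option (the counterpart of -S on the command line) and asks for the largest resolution not exceeding 720. The function names are made up for illustration, and whether the TikTok extractor actually exposes a 720p variant for a given video is a separate matter.

```python
def user_page_options(max_height=720):
    """Options for yt_dlp.YoutubeDL that sort formats by resolution, not ID."""
    # "res:720" prefers the largest resolution up to 720, falling back to
    # smaller ones when nothing at that size is offered.
    return {"format_sort": [f"res:{max_height}"]}


def download_user(url, max_height=720):
    """Download every entry of a user page with the preference above."""
    import yt_dlp  # third-party package, imported here so the helper above stays standalone

    with yt_dlp.YoutubeDL(user_page_options(max_height)) as ydl:
        return ydl.download([url])
```

The command-line counterpart would be yt-dlp -S "res:720" followed by the user page URL.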
And there is: https://douyin.wtf/
Which utilizes https://github.com/Evil0ctal/Douyin_TikTok_Download_API who seems to have figured out & solved recent API changes/breaks: https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues/77
Looks like 720p videos are not available using the feed method. Using -F
The resolution is wrong, you can tell by the 540p under ID column. The 720p versions do not show up at all when using links to videos now. They can only be downloaded from links to profile when it decides to not completely fail.
Ran a batch file on a loop until it picked up some videos since I noticed it occasionally downloads new videos with the one I use on a schedule. This is what happened, using account from OP. It only sees a limited number of videos.
I can confirm that 1, 2, and 5 are fixed. I have not yet confirmed 3 and 4 are fixed; I will be able to test them in a little under 12 hours, and would be happy to do so if no one else is inclined.
😦
@pukkandan I forked this repo and merged your branch into my copy. Build was successful; it addresses point 5, but unfortunately, point 1 is still broken.
Re: point 5:
As for point 1:
As far as I can tell, the error produced here is novel, which would indicate some progress has been made towards downloading user pages, but not a total fix. I’m testing on Ubuntu 20.04.4 via WSL, using a residential Internet connection in the USA.
Unfortunately, “vm” user profile links are still not working for me here in Canada. I also have “version 2022.05.18” as @D3SOX mentioned - but I get the timeout error.
Just tried it today on May 25th. Not an IP issue, because I can load same profiles on browser or app without trouble.
It works perfectly now.
I see what you are saying about TT blocking based on Connection: close / no persistent connections.
Copying in the updated extractor from master to #3668 and removing the hardcoded header (so we use the yt-dlp generated one), I get the same behaviour as your branch. When I use --no-persistent-connections it gets the same timeout.
Though removing the hardcoded header and using --add-header 'user-agent:Mozilla/5.0' as specified above works too, even with Connection: close.
Related: https://github.com/python/cpython/issues/57058
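To try the User-Agent observation outside yt-dlp, a plain urllib request is enough. This is a minimal sketch using only the standard library; the helper names and the 20-second timeout are arbitrary choices for illustration. As far as I know, urllib's stock HTTP handler sends Connection: close on every request regardless of the headers given here, which is why this setup resembles the Connection: close case discussed above.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Create a request carrying the generic User-Agent suggested in this thread."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=20.0):
    """Fetch the page body; raises on a read timeout like the one in the log."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Calling fetch() once with the default User-Agent and once with a different one should show whether the timeout depends on that header from a given network.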
@pukkandan I uploaded a branch for this https://github.com/kjerk/yt-dlp/tree/ISSUE-3776 and you can see the commit details here https://github.com/kjerk/yt-dlp/commit/de4a6005e70f81f0f88807a368895a6da852a24b . This allows the request to go through and removes the hardcoded user agent (with the UA it still hangs), but I hadn’t proposed it as a real fix because then the self._video_entries_api() still returns empty as people have been saying, likely needing reinvestigation to figure out what happened to that api if that’s still something this fork wants to keep using, as opposed to page scraping 🤷
Workaround which just uses curl to get the redirect target and then passes it down to yt-dlp for the actual download:
Crude, but worked for me.
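The original snippet for this workaround did not survive in this copy of the thread. The following is a guess at its general shape, written as a Python wrapper rather than the poster's actual command: curl follows the redirects and prints the final URL, which is then handed to yt-dlp. The function names are hypothetical, and /dev/null assumes a Unix-like system.

```python
import subprocess


def curl_command(short_url):
    """curl invocation that prints only the URL reached after redirects."""
    # -s: silent, -L: follow redirects, -o /dev/null: discard the body,
    # -w '%{url_effective}': write the final URL to stdout.
    return ["curl", "-sL", "-o", "/dev/null", "-w", "%{url_effective}", short_url]


def resolve_redirect(short_url):
    """Run curl and return the redirect target as a string."""
    result = subprocess.run(
        curl_command(short_url), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def ytdlp_command(resolved_url, extra_opts=()):
    """yt-dlp invocation for the already-resolved URL."""
    return ["yt-dlp", *extra_opts, resolved_url]


def download(short_url, extra_opts=()):
    """Resolve the short link with curl, then let yt-dlp do the download."""
    return subprocess.run(ytdlp_command(resolve_redirect(short_url), extra_opts)).returncode
```

This only helps for short links that redirect to an individual video page, since user pages were failing in yt-dlp for other reasons.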
really, it worked