yt-dlp: [facebook] Cannot parse data
Checklist
- I’m reporting a broken site
- I’ve verified that I’m running yt-dlp version 2022.06.29 (update instructions) or later (specify commit)
- I’ve checked that all provided URLs are playable in a browser with the same IP and same login details
- I’ve checked that all URLs and arguments with special characters are properly quoted or escaped
- I’ve searched the bugtracker for similar issues including closed ones. DO NOT post duplicates
- I’ve read the guidelines for opening an issue
- I’ve read about sharing account credentials and I’m willing to share it if required
Region
Bangladesh
Provide a description that is worded well enough to be understood
Downloading worked perfectly fine just a few days ago, but now any Facebook video link in the “facebook.com/. . ./videos/. . .” format throws this error. I have checked a similar issue, #4289, but that one works fine on my side. The aforementioned link format, however, shows this error.
Provide verbose output that clearly demonstrates the problem
- Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
- Copy the WHOLE output (starting with [debug] Command-line config) and insert it below
Complete Verbose Output
❯ yt-dlp.exe -vU "https://www.facebook.com/100015388953240/videos/424804832900683/"
[debug] Command-line config: ['-vU', 'https://www.facebook.com/100015388953240/videos/424804832900683/']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.06.29 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.19044-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg 5.0-full_build-www.gyan.dev (setts), ffprobe 5.0-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
[debug] Downloading _update_spec from https://github.com/yt-dlp/yt-dlp/releases/download/2022.06.29/_update_spec
Latest version: 2022.06.29, Current version: 2022.06.29
yt-dlp is up to date (2022.06.29)
[debug] [facebook] Extracting URL: https://www.facebook.com/100015388953240/videos/424804832900683/
[facebook] 424804832900683: Downloading webpage
[facebook] 424804832900683: Downloading webpage
ERROR: [facebook] 424804832900683: Cannot parse data; please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
File "yt_dlp\extractor\common.py", line 640, in extract
File "yt_dlp\extractor\facebook.py", line 713, in _real_extract
File "yt_dlp\extractor\facebook.py", line 659, in _extract_from_url
About this issue
- State: open
- Created 2 years ago
- Reactions: 14
- Comments: 96 (31 by maintainers)
Commits related to this issue
- [ie/facebook] Improve extraction Partially addresses #4311 Authored by: jingtra, ringus1 Co-authored-by: Jing Kjeldsen <jingtra@gmail.com> — committed to yt-dlp/yt-dlp by ringus1 5 months ago
This theory is only partially correct in my case. I tried with this public video in the following scenarios:
I’m pretty sure this is a regression in yt-dlp 2021.12.25
Actually, true. Something has changed in Facebook (again) and now not even youtube-dl is able to download it. Moreover, Facebook has been actively locking accounts if you use --cookies. Better to be safe than sorry. 🙂
This does not work with yt-dlp. Though as a test, I downloaded the same private (group) video with youtube-dl (manually running youtube-dl -v --cookies "$HOME\Documents\ytdl-cookies\facebook_cookies.txt" somePrivateVideoUrl) over ten times in less than 5 minutes with no “shadow ban”, so this is definitely a bug with yt-dlp.
I will even go as far as running a script that repeats this over a couple of hours to prove this is a yt-dlp bug; I’m that confident there’s no “shadow ban” for using cookies. As for the other reports saying otherwise, I will say that Facebook is held together with sticks and glue these days, so random errors are likely, unless there are some niche cases I’m not catching.
Per #7839, this error now occurs with nearly every FB URL
I can confirm youtube-dl works while yt-dlp does not. I have downloaded a few random videos with youtube-dl without any issue, while yt-dlp keeps saying Cannot parse data.
Indeed it was! 👍 ; to sum it up:
The Tahoe API endpoint “fallback” in recent yt-dlp versions is broken because of:
1. User-Agents; the cut-off Chrome version in this case is 77 (>=78 gets the door shut on its face 😞 ); “us” engaged in retrocomputing have a saying: “Newer isn’t always better”, which, I think, fits perfectly here 😜 …
2. --add-header "Sec-Fetch-Mode:" (thanks @dirkf 😄 ).
Putting 1+2 into practice 😉, with latest yt-dlp-nightly:
😃
Hi guys, I’m facing the same “cannot parse data” issue. Is there any solution? We are stuck in our product development because of it.
I have the same problem
I am pretty sure it’s an account-related problem, excessive use of yt-dlp can cause some kind of “invisible ban”.
Attempting to use the cookies of an account with this kind of “ban” (even with public videos that are downloaded normally with no cookies on the same machine) will result in the error mentioned in this issue; another attempt within 0-10 minutes will lock the account with a “someone may have tried to access your account” notice.
In the past (until about a few months ago) you could easily know you currently have this kind of “ban” as almost every non-public video you normally have access to would give “content isn’t available at the moment”, and you could easily know when it ends. Currently, everything works as expected on browser but using yt-dlp with cookies causes the aforementioned error, and there is no way to know if the “ban” has ended other than trial and error.
I am 95% sure about this, but the only way to prove this theory is to use a clean account on a clean IP.
… For anyone still interested 😉 , the very last yt-dlp-nightly release without #7890 merged is 2023.09.05.200110 …
Last night I did continue my tests, this time using some experimental yt-dlp_x86.exe builds, compiled with modded CPython 3.8/3.9 (so as to launch even on NT 5.1 😉 ) and modded OpenSSL-3.1.0 lib, and I still found that adding:
--user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36" --add-header "Sec-Fetch-Mode:"
to my commands did make the Tahoe API “happy” 😜 ; as a test, I used the TEST 'url' inside #7890, https://www.facebook.com/radiokicksfm/videos/3676516585958356/
CPython 3.8.13+OpenSSL-3.1.0-dev
but
CPython 3.9.13+OpenSSL-3.1.0-dev
but
Sadly 😢 , I can’t test your theory about FB serving different page sources in different parts of the globe, as any attempt to use a VPN node/HTTPS Proxy (to “move out” of my region 😜 ) has resulted in a Tahoe API block 😠 …
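The workaround reported above (a Chrome 77 User-Agent plus an emptied Sec-Fetch-Mode header) could be scripted roughly as follows. This is a minimal sketch, not a confirmed fix: the helper name build_workaround_command is hypothetical, the flags and the UA string are copied from the comment above, and whether FB accepts the request may still depend on region, the Python/OpenSSL build and the yt-dlp version.

```python
import shlex

# User-Agent quoted in the comment above; Chrome 77 was reported as the cut-off.
CHROME_77_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"
)


def build_workaround_command(url, executable="yt-dlp"):
    """Return the argument list for the reported workaround (hypothetical helper)."""
    return [
        executable,
        "--user-agent", CHROME_77_UA,
        # An empty value after the colon, exactly as in the comment above.
        "--add-header", "Sec-Fetch-Mode:",
        url,
    ]


if __name__ == "__main__":
    cmd = build_workaround_command(
        "https://www.facebook.com/radiokicksfm/videos/3676516585958356/"
    )
    # Print a copy-pasteable command line instead of running it.
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run() as-is; building a list rather than a single string avoids quoting problems with the spaces and parentheses in the User-Agent.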
The results above are repeatable with yt-dl master and Python 3.9 and also with the nightly release. Note that yt-dl currently sends UA versions < FF 90 by default.
However:
yt-dl --add-header 'Sec-Fetch-Mode: same-origin' ... breaks
'Sec-Fetch-Mode: navigate'
'Sec-Fetch-Dest: document'
yt-dlp --add-header 'Sec-Fetch-Mode:' ... succeeds, with UA set to FF 89, with these formats:
Mozilla/5.0, but only these formats:
Far-sighted.
Python 3.10’s SSL ciphers were universally backported regardless of the actual Python version used
And we don’t know exactly what aspect of the fingerprint is triggering a block from FB. It could be any number of things
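To compare two installations, it may help to print the local values that feed into a TLS handshake. The sketch below only inspects Python's default SSL context, which is an assumption: it is not necessarily the context yt-dlp or youtube-dl configures, and it does not show which aspect FB actually checks. The helper name describe_tls_inputs is hypothetical.

```python
import ssl
import sys


def describe_tls_inputs():
    """Collect local values that can influence a client's TLS handshake."""
    context = ssl.create_default_context()
    return {
        "python": sys.version.split()[0],
        "openssl": ssl.OPENSSL_VERSION,
        # The offered cipher list is one ingredient of a TLS fingerprint.
        "ciphers": [cipher["name"] for cipher in context.get_ciphers()],
    }


if __name__ == "__main__":
    info = describe_tls_inputs()
    print(info["python"], "/", info["openssl"])
    for name in info["ciphers"]:
        print(" ", name)
```

Running this under the Python builds mentioned in the thread and diffing the output would show whether the default cipher lists differ between them.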
There was very little difference between the yt-dlp FB extractor code and the youtube-dl FB extractor code prior to the reels patch (which also does not have a meaningful impact on this issue). I’d go so far as to say there was no meaningful difference at all.
As you see in the log, there are 2 requests being made – the first is for the actual FB webpage, and extraction silently fails. The 2nd request is to FB’s “Tahoe” video API endpoint, unhelpfully also labeled as Downloading webpage – yt-dlp does this as well. But the difference is in the response, since this API endpoint seems to employ some sort of TLS fingerprinting protection that blocks yt-dlp’s Py>=3.10 fingerprint and allows youtube-dl’s Py<=3.9 fingerprint. This is what yt-dlp gets back:
The facebook extractor is designed rather poorly IMO, and everything fails silently until there are no more fallbacks, at which point an identical Cannot parse data extractor error is raised. So that’s why #7901 is tracking the newly manifested problem where webpage extraction always fails with all links, while this longstanding issue (#4311) should be tracking the Tahoe API request failure (and whatever else was happening before FB changed their webpage structure around)
no
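The silent-fallback behaviour described above can be illustrated with a toy chain. This is not the actual extractor code: every name here is a hypothetical stand-in, and both steps are hard-wired to fail so that the two error messages can be compared.

```python
class ExtractorError(Exception):
    """Local stand-in for an extractor error (not imported from yt-dlp)."""


def extract_from_webpage(video_id):
    # Stand-in for the first request (hypothetical): webpage parsing finds nothing.
    return None


def extract_from_tahoe(video_id):
    # Stand-in for the second request (hypothetical): the API response is unusable.
    return None


def extract_silently(video_id):
    """Every fallback fails silently, so the final error carries no detail."""
    for step in (extract_from_webpage, extract_from_tahoe):
        result = step(video_id)
        if result:
            return result
    raise ExtractorError("Cannot parse data")


def extract_with_reasons(video_id):
    """Same chain, but the error names each fallback that came back empty."""
    failed = []
    for step in (extract_from_webpage, extract_from_tahoe):
        result = step(video_id)
        if result:
            return result
        failed.append(step.__name__)
    raise ExtractorError(
        "Cannot parse data (no result from: %s)" % ", ".join(failed)
    )
```

With the first variant, a webpage-structure change and a blocked API request produce the same message, which is why two different problems ended up reported under one error.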
Hello, is there any update about the bugfix for this “cannot parse data” error which causes the problem of not being able to download Facebook videos? Sorry for bothering with the question and thanks in advance for your support
Hi!
I have been looking into this, and Facebook would consistently return “RelayPrefetchedStreamCache\u00406fa52080e4dc537033301a17dbec99b5” instead of “RelayPrefetchedStreamCache”. I fixed it by modifying the facebook function a little:
def extract_relay_prefetched_data(_filter):
    return traverse_obj(extract_relay_data(_filter), (
        'require', (None, (..., ..., ..., '__bbox', 'require')),
        lambda _, v: any(key.startswith('RelayPrefetchedStreamCache') for key in v if isinstance(key, str)),
        ..., ..., '__bbox', 'result', 'data', {dict}), get_all=False) or {}
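Why the prefix check helps can be seen in isolation. The sketch below assumes a simplified shape for one "require" entry (a list whose first element is the cache name); the sample payload and the helper names matches_exact and matches_prefix are made up for illustration, and only the suffixed name is taken from the comment above.

```python
PREFIX = 'RelayPrefetchedStreamCache'

# Assumed, simplified shape of one "require" entry; the payload is invented.
entry = [
    'RelayPrefetchedStreamCache@6fa52080e4dc537033301a17dbec99b5',
    'next',
    [],
    [{'__bbox': {'result': {'data': {'video': {'id': '123'}}}}}],
]


def matches_exact(v):
    # Membership test: only true if an element equals the bare name.
    return PREFIX in v


def matches_prefix(v):
    # Prefix test on string elements only; lists and dicts are skipped.
    return any(key.startswith(PREFIX) for key in v if isinstance(key, str))


if __name__ == '__main__':
    print(matches_exact(entry))   # False
    print(matches_prefix(entry))  # True
```

The isinstance guard matters because the entry also contains lists and dicts, which have no startswith method.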
Nightly build suddenly works again!
$ ~/yt-dlp_linux --version
2023.09.05.203540
$ ~/yt-dlp_linux --extract-audio --audio-format mp3 -o xx1.mp3 https://web.facebook.com/watch/?v=2139244499701931
[facebook] Extracting URL: https://web.facebook.com/watch/?v=2139244499701931
[facebook] 2139244499701931: Downloading webpage
[info] 2139244499701931: Downloading 1 format(s): sd
[download] Destination: xx1.mp4
[download] 100% of 4.59MiB in 00:00:00 at 8.64MiB/s
[ExtractAudio] Destination: xx1.mp3
Deleting original file xx1.mp4 (pass -k to keep)
I think there is more to the fingerprinting than just the headers, since I still get the error with the same command line on yt-dlp master. Or it could be a regional thing, maybe
Then it was almost certainly d76d15a6699dc41eea26a96d054a1b7bcb12c69b that broke the FB “Tahoe” fallback
Yet:
Apparently FB is looking at the final version number, not the rv.
With this page yt-dl nightly gets:
But that’s another problem.
yt-dlp --user-agent "Mozilla/5.0 (Windows NT 6.0; rv:89.0) Gecko/20100101 Firefox/89.0" --add-header 'Sec-Fetch-Mode:' --extract-audio --audio-format mp3 https://web.facebook.com/watch/?v=2139244499701931
[facebook] Extracting URL: https://web.facebook.com/watch/?v=2139244499701931
[facebook] 2139244499701931: Downloading webpage
[facebook] 2139244499701931: Downloading webpage
ERROR: [facebook] 2139244499701931: Cannot parse data; please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
yt-dlp is up to date (stable@2023.07.06)
@Vangelis66 I think it’s more likely that b1156c1e59646d450836ce8e61c34641070e8ccb / d14cbdd92d8bbb9deedc77da80085b0280ae52bb was the culprit, since FB’s fingerprinting seems to be very http header-focused
@bashonly
Many thanks for your erudite reply; highly appreciated 😄 👍 … BTW:
… the link has to be changed to bb5d84c9d2f1e978c3eddfb5ccbe138036682a36 😉 …
I had to do a “refresh” on TLS fingerprinting, still I kindly ask for a further clarification on:
From the article I read, it appears that both apps’ TLS fingerprints depend on the combination of CPython+OpenSSL lib in use when the scripts are being invoked; or is there something additional involved in the case of yt-dlp?
I find that even yt-dlp_x86.exe, currently built on CPython 3.7, still fails on FB 😿 , so it isn’t only on Py>=3.10 that the Tahoe API endpoint blocks yt-dlp (unless I misunderstood what you wrote previously) …
The successful yt-dl logs I posted previously were on CPython 3.4+OpenSSL-1.0.2k and CPython 2.7+OpenSSL-1.0.2t, so both using the 1.0.2 OpenSSL branch; AFAIAA, 1.0.2 doesn’t even support TLSv1.3, so why older (less secure) CPython and older (less secure) OpenSSL are being whitelisted by that API is beyond me…
Finally, I dug up my archives and found out that the last yt-dlp version that emulates the current yt-dl behaviour in regards to FB is v2021.12.01 (!):
As the more observant among you might have noticed, avc1 streams are missing, but I guess this is due to this …
Kind regards 😄
The FB extractor in yt-dl hasn’t been updated since release (2021.12.17) and should be expected to be broken. A yt-dl version using a version of the extractor from a PR might work. Also, it might depend on which Python version is used since one failure mode (IIRC, and why no PR was merged) is blockage by FB’s bouncers based on client fingerprinting.
Above, a video that can be downloaded with yt-dl doesn’t match the FB extractor directly but with the generic extractor its embedded videos (that do match the FB extractor) can be downloaded.
The FB extractor in yt-dl hasn’t been updated since release (2021.12.17) and should be expected to be broken. A yt-dl version using a version of the extractor from a PR might work. Also, it might depend on which Python version is used since one failure mode (IIRC, and why the extractor wasn’t updated) is blockage by FB’s bouncers based on client fingerprinting.
yt-dlp has made some networking improvements that might help to fix such blockage.
@UtopianElectronics can you share an example? I can’t find any URLs that work with youtube-dl anymore
Experiencing the same issue. Surprisingly, the latest version of youtube-dl can download Facebook videos, but yt-dlp cannot.
this makes me think the difference could be in the python version / SSL ciphers used rather than the extractor code
@omarmohamedmoustafa facebook does enforce a rate-limit on how many posts you are able to view when not logged in. I can download that link without issue. Does that video load for you in your browser when not logged in? (use private/incognito tab)
Going to add here that using yt-dlp to try to download a reel has locked me out of my Facebook account. I am jumping through the hoops to get it back but be cautious as it’s very much not worth getting a few min video to jump through Facebook security hoops.
I also do not understand why it is doing this when I have 2fa enabled… smh
As mentioned in https://github.com/yt-dlp/yt-dlp/issues/4477#issuecomment-1200612649 (I’ve also tested using the same cookies.txt file), youtube-dl works, but not yt-dlp. So it’s caused by a regression, not Facebook “banning” you.