yfinance: Exception: yfinance failed to decrypt Yahoo data response
Looks like more encryption issues from yahoo.com
import yfinance as yf
ticker = 'PENN'
stock_info = yf.Ticker(ticker).balance_sheet
Exception: yfinance failed to decrypt Yahoo data response
[ Basically affects everything except price history @ValueRaider ]
Using Python version 3.11.0 yf version 0.2.9
@ValueRaider hijacking top post
[2023-06-23] Update! Latest release fixes financials tables (and removes decryption code).
What is happening? In December 2022 Yahoo began encrypting webpage data, maybe to block scraping. Now Yahoo is regularly changing their encryption key, we think every day (and maybe multiple times a day). Without an automated system to extract the key from their webpage (work in progress), fixing decryption requires a volunteer to manually extract the new key and provide it to developers to upload to yfinance.
~Help needed~
~Need a Javascript dev to write a script that extract AES decryption key from obfuscated JS that Yahoo uses to en/decrypt. The key is there plaintext, just need to automate extraction. The JS changes every day so limited scope to hardcode (use Git branch hotfix/decryption to print today’s JS url). Don’t worry about sandboxing etc, end users won’t execute this.~
~Script should be separate to yfinance codebase. I expect your only interaction with yfinance is testing the extracted key works by putting in yfinance/data.py~
~Useful comments:~
- ~“The decryption of the json string is done in this function call:”~
- ~“Need to execute the JS in e.g. js2py”~
- ~“Obfuscation done using the popular javascript-obfuscator tool, easily reversible”~
- ~“I’ve been told can extract key via breakpoint in Firefox”~
- ~“fwiw, youtube-dl faced similar issues regarding encryption. They ended up implementing their own (very limited) JS interpreter”~
Progress updates
2023-06-21
Update your yfinance! Latest release fixes financials tables and removes decryption code.
2023-06-04
Obvious that the decryption won’t be fixed. See last message for plan.
2023-03-25
Ticker.info fixed by fetching from API. Financials still broken.
2023-02-17
Yahoo finally started using a new encryption key not in yfinance backup list of keys, so decryption failing. Inevitable. Surprised it took 4 days.
2023-02-13
What is the “backup decryption method”? This is simply yfinance fetching decryption keys from this GitHub project website instead of extracting from Yahoo.com. Was broken in 0.2.9 but fixed in 0.2.10. Today worked for many thanks to a key uploaded yesterday. Discussion continues on a decent system for extracting & sharing decryption key.
workaround - yahooquery
Python module yahooquery is a functional alternative to yfinance. Instead of scraping webpages it accesses Yahoo’s undocumented API. Not encrypted and faster, but lacks earnings_dates. GitHub Documentation
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 284
- Comments: 96 (6 by maintainers)
If you came to report same issue, just upvote the top comment. Keep this thread clean and constructive.
@jasmohan-narula Because all they essentially say is “I have issue too”, contributing nothing. Thread quickly gets messy, some of us want to discuss problem.
Agreed. I noticed 12 hours ago Yahoo was more sensitive to spam, but only now a total block.
FYI I’ve just released 0.2.10 which fixes the backup decrypt methods but doesn’t help (I hoped it would), so don’t feel pressured to upgrade 0.2.9. Unless you want to debug and fix, then definitely upgrade.
@SymbReprUnlim This is not a platform for free speech. This is a platform for constructive collaboration, and that requires moderation. We don’t need dozens of “I also have this error” replies - imo this is software sabotage. The few that want to contribute shouldn’t have to sift through many useless comments.
I missed the part where you paid for yfinance and we are paid to fix this. Some people have already volunteered time and effort with very useful debugging, and now a solution appears visible.
@valankar Maybe I should have explained. The steampipe example is less useful and harder to install than Python yahooquery, already a great example of using the “hidden” API.
ValueRaider - Treating human beings like filtered out list elements by deleting those attempts at being helpful is not a good long-term policy, even if it helps better focus on some technical issue at hand. What’s needed is a way for you to add tags to certain posts that YOU consider most relevant, so that you and others can view the list of posts you consider most relevant to solving the technical issue(s) at hand. As right as you may be in deleting those posts, it is an infringement on free speech, human cognition and a total teamwork approach. GITHUB apparently needs a software modification to allow you the capability to tag and filter while still allowing people to contribute, without being deleted; except for rudeness, crassness, deliberate attempts at software sabotage, etc. deletions still being helpful or some sort of auto-rudeness filtering as allowable. 3-5 days now and still no clear-cut solution, maybe Yahoo is doing what it is doing purposefully, for a reason. There’s always the SEC and direct access to its database, XBRL’s, financial statements etc. Google Finance has closed its previously open doors to web scraping. There are other potential scraping alternatives, Zack’s, MarketWatch, ForExFactory… Feel free to delete this post after reading it and giving pause for thought. And I’m not suggesting giving up on a technical solution to the current apparent encryption inability to access data with YahooFinance.
The json loaded from root.App.main always comprises 10004 key/value pairs, but simply joining the last 4 values is no longer working.
The password needed to disentangle “stores” is generated by a javascript function supplied in “main.xxxxxxxxxxxxxxxxxx.modern.js”. The version of this file is indicated by the hash “xxxxxxxxxxxxxxxxxx”. The javascript code in this file changes with every version and seems to be heavily obfuscated. I got the same version of “main.xxxxxxxxxxxxxxxxxx.modern.js” for all pages I called on the same day, and another version on the next day. All pages delivered with a certain version of “main.xxxxxxxxxxxxxxxxxx.modern.js” are including the same 10004 key/value pairs in root.App.main, but the order of these 10004 key/values is changed with each page call.
I loaded a stock page in a web browser and then opened the inspection console (F12). After setting a breakpoint in “main.xxxxxxxxxxxxxxxxxx.modern.js” I could scrape the password from an internal variable. The password is still a concatenation of 4 of the values comprised in root.App.main and it is 128 bytes long. After manually copying the password into python code, I could read the “stores” dict.
The javascript code in “main.xxxxxxxxxxxxxxxxxx.modern.js” is obfuscated. Variable and function names seem to change in different versions. The decryption of the json string is done in this function call:
return s.context.dispatcher.stores=JSON.parse(function(e,t){return c().decrypt(e,t).toString(…
In this case, a variable named “e” holds the entangled content of “stores” and a variable named “t” holds the 128-byte password. This password can be used to decrypt the “stores” in all pages delivered with that particular version of “main.xxxxxxxxxxxxxxxxxx.modern.js”.
I have no idea how to automate the generation of the password with “main.xxxxxxxxxxxxxxxxxx.modern.js”. Maybe someone experienced in javascript will find a solution.
The way Yahoo is wrapping their data is by no means proper encryption. It is just a kind of obfuscation by misusing standard functions from cryptography.
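The file-version observation in the comment above can be sketched in Python. This is only an illustration: the URL prefix is the one quoted elsewhere in this thread, while the assumption that the hash is lowercase hex, the function name find_main_js_version and the sample HTML are mine, not part of yfinance.

```python
import re
from typing import Optional

# Assumed URL shape, based on the "main.<hash>.modern.js" naming described
# in this thread. Treating the hash as lowercase hex is an assumption.
MAIN_JS_PATTERN = re.compile(
    r"https://s\.yimg\.com/uc/finance/dd-site/js/main\.([0-9a-f]+)\.modern\.js"
)


def find_main_js_version(page_html: str) -> Optional[str]:
    """Return the hash part of the main.<hash>.modern.js URL, or None if absent."""
    match = MAIN_JS_PATTERN.search(page_html)
    return match.group(1) if match else None


# Hypothetical page fragment, for illustration only.
sample_html = (
    '<script src="https://s.yimg.com/uc/finance/dd-site/js/'
    'main.9c2e056368902a7b446e.modern.js"></script>'
)
print(find_main_js_version(sample_html))
```

Knowing the hash would let a key be associated with one specific JS version rather than with a calendar date.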
No version works anymore.
Nothing formal. My personal stance is to see yfinance fixed but I lack skills & time for this specific problem, so problem waiting for a volunteer to fix. I only mentioned yahooquery up top because some people can’t wait.
I think the decryption is worth pursuing because then have an alternative to API calls.
If someone wants to switch yfinance to API calls like yahooquery then go ahead. I suspect is a big job so partial conversions welcome, collaboration can complete.
Hello, I have investigated and I noticed different things during my testing using: python test_yfinance.py (script tested in a Python 3.11.2 docker)
_get_decryption_keys_from_yahoo_js(self, soup) always returns an empty array of keys for me and I get the error: WARNING: No decryption keys could be extracted from JS file. Falling back to backup decrypt methods.
For function _get_decryption_keys_from_yahoo_js in data.py, line 218:
if len(sub_keys) == key_count:
=> always evaluates to False for me because key_count == 4 and len(sub_keys) is always 10004, so the script never executes the code inside the if since the last Yahoo changes?
So I tried to make this if work and replaced the preceding instruction:
sub_keys = key_list[ind+1:]
with:
sub_keys = key_list[len(key_list)-4:]
=> to really take the last 4 keys, as explained in the comment of the first attempt.
The method now returns the concatenation of the last 4 keys:
I guess the code can now try to decrypt the store with the non-empty keys:
stores = decrypt_cryptojs_aes_stores(data, keys)
But I’m still getting the exception:
Exception: yfinance failed to decrypt Yahoo data response
when decrypt_cryptojs_aes_stores(data, keys) is called …
It seems that the keys contained in the plugin object don’t work anymore?
I hope it helped, I’ll try to go deeper in the code to see what makes the decryption fail.
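For readers following the slicing change described above, here is a minimal sketch of taking the last four entries of a list. The list contents and the helper name take_last_keys are hypothetical; the real key_list in yfinance/data.py is built from the page and is not reproduced here. As the comment itself reports, this change alone did not make decryption succeed.

```python
from typing import List


def take_last_keys(key_list: List[str], key_count: int = 4) -> List[str]:
    """Return the final key_count entries (empty list if key_count is 0)."""
    # key_list[-key_count:] is the idiomatic form of
    # key_list[len(key_list) - key_count:]; the guard avoids the
    # key_list[-0:] case, which would return the whole list.
    return key_list[-key_count:] if key_count > 0 else []


# Hypothetical stand-in for the 10004 entries reported in the thread.
key_list = [f"entry{i}" for i in range(10004)]

sub_keys = take_last_keys(key_list)
print(len(sub_keys) == 4)
print(sub_keys)
```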
To the next person who asks when will this be fixed?
I assume that this will never get fixed. It doesn’t seem to be the direction that the broader community wants to go, and it makes sense.
I finally had the time to go rewrite all of my stuff and took an opportunity to re-architect everything. yahooquery was trivially easy to use - I would say simpler than yfinance - and leaps and bounds faster for data that would have come from info in yfinance.
I had implemented my own caching layer on top of yfinance which I kept for yahooquery (this was mostly to minimize any unneeded requests to yahoo) and ported that over without issue. For anyone concerned about speed because lack of caching… assuming you know how to build a cache, this is only like 50 lines of code in python to create a basic symbol + data segment cache.
All in all, it took me on the order of 10 hours to reimplement the code that deals with sourcing data from yahoo. Thanks for all of the past work building and maintaining yfinance, @ValueRaider et al.
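The commenter's own cache is not shown in the thread. As a rough sketch of what a basic symbol + data segment cache could look like, assuming a time-to-live policy and a caller-supplied fetch function (the class name SegmentCache, the TTL default and demo_fetch are all hypothetical):

```python
import time
from typing import Any, Callable, Dict, Tuple


class SegmentCache:
    """In-memory cache keyed by (symbol, data segment) with a time-to-live."""

    def __init__(
        self,
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[float, Any]] = {}

    def get(self, symbol: str, segment: str, fetch: Callable[[str, str], Any]) -> Any:
        """Return a fresh cached value, or call fetch(symbol, segment) and store it."""
        key = (symbol.upper(), segment)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        value = fetch(symbol, segment)
        self._entries[key] = (now, value)
        return value


def demo_fetch(symbol: str, segment: str) -> dict:
    """Hypothetical stand-in for a real request to the data source."""
    return {"symbol": symbol, "segment": segment}


cache = SegmentCache(ttl_seconds=60)
first = cache.get("PENN", "balance_sheet", demo_fetch)
second = cache.get("PENN", "balance_sheet", demo_fetch)  # served from the cache
print(first is second)
```

The clock is injectable so expiry can be exercised without sleeping. A persistent cache (on disk, say) would need more than this.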
I don’t know much, only what others have done so send further questions to them.
I have been told you can set a breakpoint at specific point in the Javascript with Firefox, run specific JS function when it hits, and key is printed in console (128-character alphanumeric key).
JS file: https://s.yimg.com/uc/finance/dd-site/js/main.*.modern.js
Breakpoint on code resembling (remember obfuscation changes):
When breakpoint hits, run that code in console.
@ValueRaider It does scrape - it’s built around Selenium which is a scraper. And it’s created and maintained by someone who had to reverse engineer Yahoo’s internal APIs, not by Yahoo. When Yahoo does some internal reorganization to release a new version of their webpage and internal APIs to serve that webpage, YahooQuery is liable to break just as YFinance is.
When I say that the right thing to do is for Yahoo to provide an API, I mean an actual API where they publish and maintain an endpoint and/or language-specific libraries. Then consumers could rely on it and their servers would be putting their cycles towards serving up the financial data points as efficiently as possible, not marshalling them into JSONs or applying cryptography for the sole purpose of obfuscating that JSON. No web driver needed.
No, with one of the updates in the last couple of weeks a new way was introduced that he was referring to as backup decrypt. It is basically a textfile with keys. That file is already loaded through the regular yfinance code and can therefore be modified online without the need of updating yfinance.
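The "text file with keys" idea can be sketched as a loop that tries each candidate key until one works. This is not yfinance's actual code: the one-key-per-line file format, the function names and fake_decrypt (a stand-in for the real decryption routine) are assumptions for illustration.

```python
from typing import Callable, Iterable, List, Optional


def parse_key_lines(text: str) -> List[str]:
    """Split a plain-text key list into keys, one per non-empty line (assumed format)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def decrypt_with_fallback(
    data: str,
    keys: Iterable[str],
    decrypt: Callable[[str, str], dict],
) -> Optional[dict]:
    """Try each candidate key in order and return the first successful result."""
    for key in keys:
        try:
            return decrypt(data, key)
        except ValueError:
            continue  # wrong key, try the next one
    return None  # no candidate key worked


def fake_decrypt(data: str, key: str) -> dict:
    """Hypothetical stand-in for the real decryption routine."""
    if key != "key-from-yesterday":
        raise ValueError("wrong key")
    return {"stores": data}


key_file_text = "old-key\n\nkey-from-yesterday\n"
result = decrypt_with_fallback("payload", parse_key_lines(key_file_text), fake_decrypt)
print(result)
```

Because the key list is fetched at runtime, new keys can be published without a new release, which is the property the comment above describes.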
@Rogach Good points. Given key doesn’t change within a single day, maybe some volunteer can setup & run a separate service that regularly runs this JS to extract decryption key then post somewhere public e.g. a separate GitHub project? yfinance already capable of fetching keys from GitHub HTML (the “backup decrypt” method), can easily redirect.
@JECSand Are you sure that js2py is securely sandboxed (cannot access any sensitive functionality on the host system)? Because otherwise we will be basically deploying RCE vulnerability to all the users of the library, which is quite suboptimal.
And if it is securely sandboxed, then we have a whole new can of worms - such execution environment will be trivially detectable. The code then will be able to do various shenanigans - from semi-harmless endless loops to randomizing the data if interpreter is detected.
Unfortunately executing JS is not a final solution either 😦
fwiw, youtube-dl faced similar issues regarding encryption. They ended up implementing their own (very limited) JS interpreter.
Yahoo removed their API and created this problem for themselves, while also fragmenting programmatic users into the various scraping tools that now exist. The best way to avoid people scraping webpages is to provide an API to the underlying data, then they can throttle and control load to their heart’s content.
So I think it’s pretty fair to say that they are making a mistake in focusing their efforts on repeatedly obfuscating their client-side code in an attempt to mitigate a problem which they themselves created.
@domsde Correct. Currently yfinance can ping GitHub for new keys, but uploading new keys is manual process - not good when key changes daily. Just need one PIP update to change where yfinance pings.
@Meborl @ValueRaider I’ve come to the conclusion that the only way to do this in a worthwhile manner is by executing the JS code itself. I’m looking at js2py and PyMiniRacer.
Of the two, my preference would be to find a solution using js2py as in this guide: https://devpress.csdn.net/python/630502f87e6682346619d3dc.html
PyMiniRacer has a lot more overhead and doesn’t seem as stable.
There’s just no point spending hours rewriting their smoke and mirrors logic in Python, only for them to change a few mirrors around and break it.
Don’t see any obvious change to dict structure - still 10004 extra items just like before. Maybe they’ve upgraded their obfuscation from simply changing key to changing other encryption parameters.
This is the Javascript we think they use to encrypt: https://s.yimg.com/uc/finance/dd-site/js/main.e0c853d8cea2b75a5208.min.js Reading compressed Javascript not my expertise, maybe someone can extract the encryption parameters and cross-check against yfinance/data.py::decrypt_cryptojs_aes_stores()
I don’t see the decryption being fixed - too difficult, and probably Yahoo don’t want us to fix scraping.
So what next? @ranaroussi’s preference is replacing the scraping with API requests like yahooquery does #1420 - if anyone is interested in helping implement drop a message in #1546.
I have no problem with community wanting to go in a different direction, whether that’s moving yfinance to Yahoo’s API or simply jumping ship to yahooquery - this is your project not mine. My only enforcement here is keeping this specific Issue focused on the decryption - any significant API discussion should occur in a separate Issue / Discussion, otherwise this thread gets messy.
Great! Would you share it?
Here is an example (reference) of wrapping yfinance’s yearly_earnings element using yahooquery -> hence a temporary workaround: https://github.com/asafravid/sss/commit/b633d0999028b84cabac5bf4209c71e142a559ea
yahooquery uses the Yahoo API which gives it the advantage (vs yfinance) of receiving (correct) unencrypted data.
You decide. You’re basically asking “When will this be fixed?” - that’s impossible to answer with OSS reliant on volunteer effort.
@ValueRaider I suspect that what we are seeing is just some type of a rolling deployment (or green-black deployments, A/B tests, etc). Similar to how previous encryption changes didn’t affect everyone at once - some people started getting new versions many hours before, while everything worked fine for everybody else.
I looked at the obfuscation, it’s done using the popular javascript-obfuscator tool, easily reversible with some manual effort.
Right now there is not much code in the unobfuscated version - four array keys are basically hardcoded:
Each main.js version contains a hash in the filename (the format is “main.a0b1c2d3.modern.js” at the moment), so maybe it will make sense to make yahoo-keys.txt into a json dictionary keyed by filename.
And if filename is not found in the dictionary then yfinance can throw an error instructing the user to report a new filename.
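The proposal to key the stored decryption keys by main.js filename could look roughly like the sketch below. The JSON content, the exception class UnknownMainJsError and the function lookup_key are hypothetical; the filename is the example format quoted above and the key value is a placeholder, not a real key.

```python
import json


class UnknownMainJsError(LookupError):
    """Raised when no key is recorded for the given main.js filename."""


def lookup_key(main_js_filename: str, keys_json: str) -> str:
    """Return the key stored for main_js_filename, or raise with a report request."""
    keys = json.loads(keys_json)
    try:
        return keys[main_js_filename]
    except KeyError:
        raise UnknownMainJsError(
            f"No key recorded for {main_js_filename!r}; please report this filename."
        ) from None


# Hypothetical file content: filename -> key (placeholder value).
KEYS_JSON = '{"main.a0b1c2d3.modern.js": "placeholder-key"}'

print(lookup_key("main.a0b1c2d3.modern.js", KEYS_JSON))
```

An unknown filename raises immediately, so the user sees which file to report instead of a generic decryption failure.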
BTW, @khalidcruz, what’s the full name of the main.js file you are seeing? (you can search for it in the page source, or in devtools, either filter by “main.” in Network tab or look in the Sources tab in s.yimg.com/uc/finance/dd-site/js/ folder).
@ChristianKuehnel Thanks for info, I’ll try to speak with @ranaroussi.
Seems the backup decrypt is working today. Anyone disagree? Because I’m curious if Yahoo uses different key for different regions.
If you’re looking for a place to execute potentially unsafe code: GitHub actions is a nice place for this. You can also directly store the output of your pipeline in Github again…
I’ve created branch hotfix/decryption for people to collab on. You’ll still need to Pull Request but I’ll merge with minimal review - proper review can happen later. Just make sure your fork is on that branch not main.
@williamc1998 For a good programmer it’s a relatively easy job. It took me two evenings for a mid-size application. It’s more or less a one-to-one mapping. Thereafter it took me about a week to get it fail-safe. There are different error messages and detecting the errors is different. This part is not described in the manual pages. So it’s a trial and error process.
Not really, except to say n query parameter challenge that YT introduced, without which download speeds were throttled.
In both cases YT sent some variable meta-language instructions that had themselves to be interpreted by JS to perform the decoding. If there is an actual JS decryption routine in the linked gigantic .min.js, you may do better to reproduce it in Python. In any case the browser JS debugger is your friend.
I’ve created a separate discussion about moving yfinance to the internal API vs just recommending yahooquery: #1420.
Let’s first describe my program. It downloads stock data in 3 scans. In the first scan the data is downloaded stock by stock. In the second and third scans the missing or corrupt data are corrected. The program uses caching and rate-limiting. I am following 917 stocks.
Last night I ran the program from 1:00 am CET in the Netherlands with yfinance 0.2.10, starting with a clean cache. Only 3 stocks failed to download. I am getting a lot of warnings:
Observations:
- If info is missing, it’s very likely (ca. 66%) that quarterly_income_stmt, quarterly_balance_sheet and quarterly_cashflow are also missing
- earnings_dates with equal row indices; different runs are producing different results
Conclusions:
@nkt42 Warning isn’t an error. An error will kill code or return None.
@doobery47 We have bunch of workarounds, but they are unstable and so the work continues.
Hopefully Yahoo fired whichever product manager was spending developer time on this instead of actually improving the quality of their product, and then the keys were freed.
@pchedas Please no more confirmations from Europe.
Why are the comments being deleted?
Closing this as decryption will never be fixed.
Financials tables now fully ported to use API - was already 95% done just had to stop scraping the keys, 5 minute fix.
@ValueRaider What’s the latest status of this? I’m failing calling ticker.info with the latest version (as well as the main branch).
Has anyone succeeded in getting info using any version?
I found that https://query1.finance.yahoo.com/v7/finance/quote provides part of the original info information that current fast_info does not provide, and did some modification over current main branch to fit my needs in my fork. Happy to create a PR if that may help.
@asafravid Can you move your yahooquery work into a separate Issue/Discussion, keep this Issue focused on decryption.
@dwmanikandan No
@ClaudioValletta92 I’ve added a ‘Help needed’ section to top post, read that.
@ValueRaider have you decided to retire yfinance and recommend yahooquery? If so, it may be useful to formally announce that with your recommendation. There has been a lot of discussion in this and others, but not such a clear decision, and I get the sense from recent comments that some people are in a holding pattern hoping yfinance is updated to mitigate the encryption of the summary data store.
@williamc1998 All the information is in this thread.
@ValueRaider When I get home I can share more, however the approach I attempted was regexing out the decryption functions from the .min.js file by leveraging js2py.
https://www.geeksforgeeks.org/how-to-run-javascript-from-python/amp/
The biggest problem I faced was that the js functions that get the keys have layers of nested functions. These functions in turn have randomly changing single and double letter names (“i()”, “fu()”) and are in random places.
@JECSand I believe you started on a script to extract keys from JS, although may not have finished - can you share what you have? We can finish it.
~Anyone want to experiment with GitHub actions, to create a Python script that extract keys from Yahoo’s JS file and add to a file? Do it in a personal GitHub repo. If it works I’ll copy into this repo.~
Simpler request: Can someone create a script that automates the extraction of key from Yahoo’s JS file? Someone suggested using a simple JS interpreter. The JS url is dynamic, use branch hotfix/decryption to print it. Ignore security, this will run in a GitHub action not user systems.
@Rogach Those are good points. My initial thought was to regex out the relevant JS functions, transpile them to python using js2py, then execute. However due to the issues you mentioned along with the lack of reliability I just went ahead and ended up refactoring YahooFinancials to fully utilize the api. All the data is available via the api and then some.
@ValueRaider I suspect the time is quickly coming when the keys will be extremely dynamic. I’d at least consider leveraging the api as an in-code backup option for yfinance. The api data is in roughly the same format, it’s not too much work to refactor without breaking changes, and would save a lot of end users from having to change their code. If you want I can submit a PR with those updates sometime this weekend.
When I commented “doesn’t work” 12 hours ago, I meant “WARNING: No decryption keys could be extracted from JS file.”, not “Exception: yfinance failed to decrypt Yahoo data response”. Misreading that could cause trouble, so I’m replying just in case.
Looking through reports - mixed results. Worked for most, but few didn’t work even when similar timezone as others. Suggests Yahoo uses multiple different keys within a day (aka serving different versions of the obfuscated JS @Rogach).
No more reports.
@Yazzito yfinance doesn’t read your local key file, it fetches from GitHub where I can update instantly without PIP. When I designed that I didn’t expect key to change daily.
For me, I’m seeing the warning you mentioned followed by:
line 162, in decrypt_cryptojs_aes_stores
raise Exception("yfinance failed to decrypt Yahoo data response")
Exception: yfinance failed to decrypt Yahoo data response
It worked for me this morning with no code changes. (I also saw the warning this morning, but no errors at that time.) PS. I am not rate limited because I only tested 2 ticker runs (both failed, same msg)
@Rogach I looked at your instructions in the post above. My main.js naming is also: main.9c2e056368902a7b446e.modern (same as @khalidcruz). When using the command in devtools I get:
["87b62ee5fe65", "08a3ee23291a", "25d6a4526abc", "e50551b7d7ab"].reduce((a, b) => "" + a + App.main[b], "")
'3c895fb5ddcc37d20d3073ed74ee3efad59bcb147c8e80fd279f83701b74b092d503dcd399604c6d8be8f3013429d3c2c76ed5b31b80c9df92d5eab6d3339fce'
I added this key to the yfinance yahoo_keys.txt locally but I’m still seeing the same decrypt error above. I’m not sure where you got the 4 key numbers in the reduce command from? So, not sure if I’m using the correct keys.
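For anyone wanting to do the same concatenation outside the browser, a rough Python equivalent of the devtools reduce expression is sketched below. The four names are the ones quoted in the comment above; the app_main values are short placeholders (the real App.main has about 10004 entries and real values are not reproduced here), and join_key_parts is a hypothetical helper name.

```python
from typing import Iterable, Mapping


def join_key_parts(app_main: Mapping[str, str], names: Iterable[str]) -> str:
    """Concatenate app_main[name] for each name, in the given order."""
    # Unlike the JS reduce, a missing name raises KeyError here instead of
    # silently appending the string "undefined".
    return "".join(app_main[name] for name in names)


# Names quoted in the comment above; the values are placeholders.
names = ["87b62ee5fe65", "08a3ee23291a", "25d6a4526abc", "e50551b7d7ab"]
app_main = {
    "87b62ee5fe65": "aa",
    "08a3ee23291a": "bb",
    "25d6a4526abc": "cc",
    "e50551b7d7ab": "dd",
    "unrelated": "zz",
}

print(join_key_parts(app_main, names))
```

Which four names to use still has to come from the de-obfuscated main.js, which is the part the thread has not automated.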
Note that this warning message does not mean failure, just that it’s falling back to the backup decrypt method which seems to work for a significant number of people right now. Check to see if you do get a value back after receiving that message.
@khalidcruz I forgot I also need the contents of App.main (from the page source) to actually extract the key 😦
But here’s a piece of code that you can run in devtools to extract the key corresponding to your main.js version:
@Rogach it’s main.9c2e056368902a7b446e.modern
@cmjordan42 “Selenium is only utilized to login to Yahoo, to retrieve data only accessible to premium subscribers.” I’ve looked at how yahooquery works, it just sends GET requests to internal API. But I accept your broader point about officially supporting an API - maybe that’s what these Yahoo changes are working towards, by the new owner Apollo Global.
@giantroadracer Thanks for report. We think key changes daily, and your report suggests Yahoo uses your local date to decide - that key I added worked 13-Feb and you’re in 14-Feb. So ‘key derivation service’ needs to run in Far East or Australia.
Just tested this (below). Canadian stock in Canada works, which wasn’t working prior to the update. Now at 0.2.11.
r = yf.Ticker('DFN.TO')
print('r.fastinfo: ', r.fast_info)
#print('r.info:', r.info)
@cmjordan42 That’s unfair. This encryption only affects webpage scraping. Direct GET requests work fine, as yf.download and yahooquery do. I can understand why Yahoo wants to stop webpage scraping (expensive) in favour of GETs.
doesn’t work in Canada, (.info() that is)
Just a clarification: Even if we find such a volunteer, would it imply yfinance users will need to ‘pip update yfinance’ every day (/few days) in order to have the updated keys?