nba_api: HTTPSConnectionPool(host='stats.nba.com', port=443): Read timed out. (read timeout=30)
I noticed there are a few threads on this issue, yet the solutions provided haven’t worked for me. Maybe I’m doing something wrong? Thanks in advance for the help!
from nba_api.stats.static.teams import find_teams_by_full_name
from nba_api.stats.endpoints.teamplayerdashboard import TeamPlayerDashboard
mia_id = find_teams_by_full_name("Miami Heat")[0]['id']
mia = TeamPlayerDashboard(measure_type_detailed_defense="Base", per_mode_detailed="Totals", team_id=mia_id, season="2019-20").players_season_totals.get_data_frame()
ERROR: `--------------------------------------------------------------------------- timeout Traceback (most recent call last) /usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw) 383 # otherwise it looks like a programming error was the cause. --> 384 six.raise_from(e, None) 385 except (SocketTimeout, BaseSSLError, SocketError) as e:
24 frames
timeout: The read operation timed out
During handling of the above exception, another exception occurred:
ReadTimeoutError Traceback (most recent call last) ReadTimeoutError: HTTPSConnectionPool(host='stats.nba.com', port=443): Read timed out. (read timeout=30)
During handling of the above exception, another exception occurred:
ReadTimeout Traceback (most recent call last) /usr/local/lib/python3.6/dist-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies) 527 raise SSLError(e, request=request) 528 elif isinstance(e, ReadTimeoutError): --> 529 raise ReadTimeout(e, request=request) 530 else: 531 raise
ReadTimeout: HTTPSConnectionPool(host='stats.nba.com', port=443): Read timed out. (read timeout=30)`
About this issue
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 24
For those having issues with calls, there are a couple of known factors to consider.
The best option is always to try the call locally first. If it works locally but times out once deployed, your cloud provider’s IP addresses are likely being blocked.
While I have not tried it, there is also the option of using a proxy. Routing the deployed app’s requests through one would let you determine whether the deployment itself works and the timeouts are in fact caused by a block.
Hope that helps. This issue is raised often.
@leimao - The NBA does not make its firewall rules public. That being said, I have spent some time in the networking space. Here are the basics of what is likely happening. I’m going to assume no prior knowledge. I should extend this and put this out on Medium! 👍
THE CLOUD ARCHITECTURE
Cloud providers, like AWS, operate millions of physical servers. Those servers are virtualized, creating millions more virtual machines, which can in turn be split into millions of containers, for example running on top of Kubernetes (K8s). If that’s not your preferred route, you can simply run a serverless application using FaaS (Functions as a Service), such as Lambda.
THE PROBLEM
Any single cloud provider has enough computing power that one person could scale an application far enough to effectively take down almost any site in the world. This is referred to as DDOS (Distributed Denial of Service). It doesn’t even have to be intentional; someone could simply have written an infinite loop.
DEFENSE IN DEPTH
Security is managed via the concept of Defense in Depth. This means that security is provided in layers. Should any layer be compromised, there is yet another layer that must be breached. In the same way, protecting a highly available service like the NBA’s website, stats, and other services is done using this practice. Multiple tools can be used, and prices range from relatively inexpensive to very expensive. Fortinet has a good article on the topic titled Defense in Depth.
I will cover three primary DDOS defenses that can be put into place with relative ease, though the extent of implementation determines price.
IP ALLOW LISTS AND BLOCK LISTS
Probably the easiest implementation is to ask the question, who are my customers? I don’t think it will take you long to guess that you and I, running programs on the cloud that get statistical data from the NBA for free, are not their target audience. To the NBA, our programs are nothing more than bots. There is zero revenue to be gained. With that in mind, the NBA can ask themselves why they would allow any cloud provider to connect to their APIs, given the potential for a DDOS attack. There is no good reason. In short, they want human traffic with which they can build their brand, interact with fans, and generate revenue.
In this case, it is relatively easy for the NBA to block all IPs from cloud providers. The majority of cloud providers make their IP addresses publicly known. AWS, for example, makes its IP address ranges available, and companies can subscribe to them.
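To make the block-list idea concrete, here is a minimal sketch of how a published range file could be turned into a block check. It assumes data shaped like AWS's ip-ranges.json (a `prefixes` list with `ip_prefix` entries); the sample prefixes are reserved documentation ranges I made up for illustration, and nothing here reflects how the NBA actually does it.

```python
import ipaddress

# Illustrative sample shaped like AWS's published ip-ranges.json.
# These prefixes are reserved documentation ranges, NOT real AWS ranges.
SAMPLE_RANGES = {
    "prefixes": [
        {"ip_prefix": "192.0.2.0/24", "region": "us-east-1", "service": "EC2"},
        {"ip_prefix": "198.51.100.0/24", "region": "eu-west-1", "service": "EC2"},
    ]
}


def build_block_list(ranges):
    """Turn the 'prefixes' entries into ip_network objects."""
    return [ipaddress.ip_network(p["ip_prefix"]) for p in ranges["prefixes"]]


def is_blocked(client_ip, block_list):
    """Return True if client_ip falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in block_list)


block_list = build_block_list(SAMPLE_RANGES)
print(is_blocked("192.0.2.17", block_list))   # address inside a listed range
print(is_blocked("203.0.113.5", block_list))  # address outside every listed range
```

A request from a home connection would fall outside the listed ranges, while the same request from a cloud instance would match one and be dropped, which from the client side looks exactly like a read timeout.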
RATE LIMITING
Beyond the allow and block lists, the next defensive measure is to limit how many times an individual can make requests to a given service. This is called rate limiting. Cloudflare has a good article titled What is rate limiting? Rate limiting and bots. Rate limiting is also a form of DDOS protection. Through trial and error (reverse engineering), I determined I could request the NBA’s API once every 600 ms.
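On the client side, staying under a limit like that can be done with a small throttle. This is only a sketch: the `Throttle` class is a hypothetical helper of mine (not part of nba_api), and the 0.6 second default simply mirrors the 600 ms figure above, which is an observation rather than a documented limit.

```python
import time


class Throttle:
    """Enforce a minimum gap between successive calls (hypothetical helper)."""

    def __init__(self, min_interval=0.6, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._last = None        # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage sketch: call throttle.wait() right before every request.
# throttle = Throttle()
# for game_id in game_ids:
#     throttle.wait()
#     ...make one API request for game_id...
```

Note that this only helps with rate limits. If the very first call from a cloud host times out, an IP block is the more likely cause and no amount of sleeping will fix it.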
Like allow and block lists, rate limiting is typically implemented within a Firewall. When implementing a rate limit, the firewall rule can be set with filtering keys. A key can be as simple as an IP address and may contain other characteristics. A quick article on AWS WAF (Web Application Firewall) titled Rate-based rule statement will give you an idea.
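To illustrate what a keyed rule means, here is a toy sliding-window limiter that counts requests per key, with the client IP as the key. This is my own simplified sketch of the general technique, not how AWS WAF or the NBA's firewall is implemented, and the limit and window values are arbitrary.

```python
from collections import defaultdict, deque


class KeyedRateLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # key -> timestamps of allowed requests

    def allow(self, key, now):
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit for this key
        hits.append(now)
        return True


limiter = KeyedRateLimiter(limit=2, window=1.0)
print(limiter.allow("198.51.100.7", now=0.0))
print(limiter.allow("198.51.100.7", now=0.1))
print(limiter.allow("198.51.100.7", now=0.2))  # third hit inside one second
```

Because the counter is per key, one noisy client gets throttled without affecting anyone else. A real firewall could build the key from more than the IP address, as mentioned above.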
DETECTION AND MITIGATION
This is where DDOS protection can get pricey. Is it worth it? Yes. There is simply too much risk today not to have DDOS protection. On top of that, several companies and products are available.
One of the leading companies in this space is Radware. You can learn more on their DDoS Attack Prevention Services: Multi Layered DDoS Protection and Security Solutions page. Check out their Live Threat Map for some real cool data!
The idea here is that even if I have an IP allow list, an IP block list, and have rate limiting configured, that does not stop someone from making repeated calls over and over without end. This is where detection and mitigation come in. Should a firewall become so overwhelmed that it is no longer able to respond to legitimate traffic, a product such as Radware will step in the middle and begin absorbing, filtering, and redirecting that traffic. Note that while I say "step in", Radware is always there inspecting the traffic; it’s just quietly analyzing it.
IN SUMMARY
While I do not have any details regarding how the NBA has configured their networking infrastructure, there are some general design patterns that the industry uses that can be applied based on observation. Even if you were lucky enough to find someone who works for the NBA and specifically works on their network, they would not tell you either, simply because of the potential for a security breach. We all know how bad that can get.
I hope you enjoyed this, that it was filled with some things that perhaps you didn’t know, and that it didn’t bore you so much that you fell asleep reading it. 😂
I am using the teamgamelog, teamdashboardbyopponent and teamdashboardbygeneralsplits endpoints. All three are working correctly in the local environment. But as soon as I deployed the app, I started receiving the ReadTimeoutError.
I have since deployed the app using only the teamgamelog endpoint, but even that one endpoint is not working. I have also increased the timeout to 45 seconds, but that did not help either.
Any suggestions?
I am calling the API only once, and in the very first line at that. How will time.sleep() help?
Thank you very much! I will try this proxy workaround in order to deploy a Lambda function on AWS. This whole thread helped me a lot to understand the ReadTimeout error.
Similarly to how it is used here: https://github.com/swar/nba_api/blob/master/docs/nba_api/stats/examples.md#endpoint-usage-example
Looks similar to mine.
What I did was store mine as an environment variable (needed this to not hard code the proxy in prod) and send it into a forked API I made. It was just a string but the approach you’re going with above will probably work.
The proxy string I would send to the package looked like this:
http://<USER_NAME>:<ID>@us.smartproxy.com:<NUMBER>
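For anyone wanting to try the same thing, here is a rough sketch of the environment-variable approach. The variable name `NBA_API_PROXY` and both helper functions are made up for illustration. The `proxy` and `timeout` keyword arguments are the ones shown in the nba_api endpoint usage example linked above, but check them against the version you have installed.

```python
import os


def proxy_from_env(var_name="NBA_API_PROXY"):
    """Return the proxy URL stored in the environment, or None if unset.

    NBA_API_PROXY is a hypothetical variable name chosen for this sketch.
    """
    value = os.environ.get(var_name, "").strip()
    return value or None


def fetch_team_game_log(team_id, season, timeout=30):
    """Fetch a team's game log, routing through the proxy if one is set."""
    # Imported lazily so proxy_from_env() works without nba_api installed.
    from nba_api.stats.endpoints.teamgamelog import TeamGameLog

    log = TeamGameLog(
        team_id=team_id,
        season=season,
        proxy=proxy_from_env(),  # None means a direct connection
        timeout=timeout,
    )
    return log.get_data_frames()[0]
```

Locally you can leave the variable unset and connect directly. In the deployed environment, set it to a proxy string in the format shown above so the credentials never end up in the code.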
Having the same timeout problem. I’m using the BoxScoreAdvancedV2 endpoint, and it seems to be very inconsistent with respect to success. I wonder if there is a maximum number of times I can pull info from the API before it starts to time out. This is especially frustrating when iterating in a loop, as the BoxScoreAdvancedV2 endpoint only provides stats for one game per call. Currently using v1.1.8.
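If the failures in a loop are intermittent, one thing to try is retrying each call with a growing pause. The helper below is a generic sketch, not something nba_api provides; `call_with_retry` and its parameters are names I made up, and the delays are guesses rather than known limits.

```python
import time


def call_with_retry(fetch, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on a listed exception, wait and retry with a doubled delay."""
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


# Usage sketch with nba_api (assumes requests and nba_api are installed):
# import requests
# from nba_api.stats.endpoints.boxscoreadvancedv2 import BoxScoreAdvancedV2
# frames = call_with_retry(
#     lambda: BoxScoreAdvancedV2(game_id=game_id).get_data_frames(),
#     retry_on=(requests.exceptions.ReadTimeout,),
# )
```

This will not help if every call fails, which points to a block rather than a rate limit.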