requests: Consider making Timeout option required or have a default
I have a feeling I’m about to get a swift education on this topic, but I’ve been thinking about the pros/cons of changing requests so that somehow there is a timeout value configured for every request.
I think there are two ways to do this:
- Provide a default value. I know browsers have a default, so that may be a simple place to begin.
- Make every user configure this in every request – bigger API breakage. Probably not the way to go.
The reason I’m thinking about this is because I’ve used requests for a few years now and until now I didn’t realize the importance of providing a timeout. It took one of my programs hanging forever for me to realize that the default here isn’t really very good for my purposes. (I’m in the process of updating all my code…)
I also see that a lot of people want Session objects to have a timeout parameter, and this might be a way to do that as well.
If a large default were provided to all requests and all sessions, what negative impact would that have? The only thing I can think of is that some programs will get timeout exceptions where they previously hung, which seems like an improvement to me.
Caveat, added May 13, 2016:
Please don’t use this issue to discuss adding a timeout attribute to requests. There are a number of discussions about this elsewhere (search closed issues), and we don’t want that conversation to muddy this issue too. Thanks.
About this issue
- Original URL
- State: open
- Created 8 years ago
- Reactions: 77
- Comments: 50 (37 by maintainers)
Commits related to this issue
- Amplifies the timeout warning in the docs. Partially addresses #3070. — committed to mlissner/requests by mlissner 8 years ago
- requests: set a timeout By default, requests does not set a time. As a consequence of this, the lib will hang as soon as the communication has failed. This commit set the time to 30s. At this point,... — committed to redhat-cip/python-dciclient by goneri 7 years ago
- #3070 Move "allow_redirects" default to session-level, add session-level config for "timeout" (None) — committed to FilipMalczak/requests by FilipMalczak 5 years ago
- #3070 fix broken "or" for booleans — committed to FilipMalczak/requests by FilipMalczak 5 years ago
- adds a default timeout and Resolves #3070 — committed to grintor/requests by grintor 3 years ago
- sets a default timeout and resolves #3070 — committed to grintor/requests by grintor 3 years ago
- ⚡️ Helper: A more robust Implementation of Downloading Artifacts (Fixes #204) - Added a custom HTTP `TimeoutHTTPAdapter` Adapter with a default timeout for all HTTP calls based on [this Github commen... — committed to abhiTronix/vidgear by abhiTronix 3 years ago
- New Gear Enhancements, Bug-Fixes and CI Updates. [#203] - WebGear_RTC: Gear that utilizes WebRTC for Streaming Video. - A new API that is similar to WeGear API in all aspects but utilizes WebRTC... — committed to abhiTronix/vidgear by abhiTronix 3 years ago
- Simplify `certificate` property and switch to use `cryptography` A few things: 0. This is the first half of swapping out M2Crypto for `cryptography`. 1. This fixes a bug in this method because prev... — committed to mlissner/django-ses by mlissner 2 years ago
- swap m2crypto for cryptography (#239) * Simplify _get_bytes_to_sign() I simplified this a bunch: - Removes smart_str. If we need to use that and if it replaces `errors`, that will certainly ... — committed to django-ses/django-ses by mlissner 2 years ago
- define http-request-timeouts (+ refactor) As suggested by `requests`'s documentation [0], define time-outs for http-requests. This change was inspired/suggested by bandit [1][2]. This change addresse... — committed to gardener/cc-utils by ccwienk 6 months ago
- define http-request-timeouts (+ refactor) As suggested by `requests`'s documentation [0], define time-outs for http-requests. This change was inspired/suggested by bandit [1][2]. This change addresse... — committed to gardener/cc-utils by ccwienk 6 months ago
- define http-request-timeouts (+ refactor) As suggested by `requests`'s documentation [0], define time-outs for http-requests. This change was inspired/suggested by bandit [1][2]. This change addresse... — committed to gardener/cc-utils by ccwienk 6 months ago
Here are a few example idle timeout defaults from other libraries, in case that helps:
I doubt this will change anybody’s mind, but I vehemently disagree with closing this bug without a fix. We improved the documentation via the PR that I filed, but in practice I regularly run into programs that hang because they do not have a timeout. It’s gotten bad enough that it’s one of the first things I think of when a program hangs.
Requests is a wonderful library but you can’t document your way out of a problem like this. Look at the list of issues referencing this one. Documentation is just not enough, or else issues would stop referencing this one.
I respect the requests maintainers greatly, but I hope you’ll reconsider closing this. If it causes API changes, that’s what version bumps are for. But I’m not even sure API changes would be needed.
Please reconsider.
At the very least, how about we include the
timeoutarg in the first introductory examples in the documentation? It’s bad when intro examples are subtly wrong for the sake of brevity.Every operating system has a default TCP timeout. On windows it’s 240 seconds, the RFC for TCP recommends a 100 second minimum.
The requests library, without sane defaults, is fundamentally unsafe to use. Any nested dependent library a project relies on can - at any time and even after thorough testing - hang forever. This can be exploited, for example, to produce efficient denial of service attacks.
In a recent project I worked on, one requests call using a prepared send in a 3 level deep nested dependent library was missed. It took us months to figure out what was going on.
Sure, from now on I will patch the send function in every project I work on. But no amount of linting or code review would have caught this.
If requests isn’t going to add a default to earlier versions, at least library should spit out a warning to the console, and the docs should be updated.
https://github.com/psf/requests/pull/5434/files
Hooray for reopening this bug. I like the solution of providing a default with an env override. Seems simple enough to me.
I do disagree with this though. The thing I love about Requests (and Python in general) is that these kinds of details are usually taken care of for me. For me, I think that’s what so jarring about this issue. A super simple GET request has a gotcha that could bite you way later.
Anyway, thanks for reopening. I really appreciate it and I think the community will greatly benefit from a default timeout.
I’m definitely in favor of this in light of dealing with a package that doesn’t provide a way to set a timeout, but does accept a fully configured session. My current fix is a specialized
HTTPAdapterthat sets a default timeout if none is provided:I’d much prefer something like:
However, this current fix isn’t too cumbersome either.
As the person that filed this nearly five years ago, as a long-time user of requests that doesn’t have any other issues with it, as a reviewer of pull requests that constantly begs developers to remember the timeout parameter, I can’t say strongly enough how important — and safe! — I think this feature is. I think the risks to backwards incompatibility are almost negligible, and that this should be done as soon as possible.
The worst case of a default timeout is that a connection that badly hung in the past will now fail — that’s good! I can’t think of a time that’d be bad.
Is there anything preventing us from finally fixing this wart in an otherwise nearly perfect library?
What if there was a way to set a default using env variables. Then people could opt in to the default, and it would be easier to also fix programs in production without having to dig through all code looking for libraries that might be using requests without the default. I was thinking something similar to how the http proxy has env variables.
This one got quite some negative thumbs, but I think it’s a good idea. The requests library already relies on the environment, i.e. it honors a .netrc and it honors proxy settings - and it has a
trust_envparameter that can be set to False to disregard environment settings.Such a solution may empower quite some end-users. Some of the problem with this issue is that there is no strong consensus on what the default timeout should be - but this problem persists even if it’s pushed to the users of the library. Like, I have my caldav library, and I could consider setting some default timeout there - but then I have the same issue, what would the sane timeout be? This library is again used by other libraries and utilities. Sometimes the end user is the one to learn if the timeout setting is too low or too high. The timeout may be passed through the full chain and be specified by a configuration file for each tool out there - but it adds a lot of complexity. Putting an environment variable there is a simple solution that works for (nearly) all projects.
A default timeout given through environment variables may solve quite some problems. It is not a breaking change, users relying on the default timeout to be infinite won’t be affected - hence it can be rolled out whenever and without DeprecationWarnings. However, it does not solve this issue. I strongly believe that the default should be set to something else than “infinite” in the long-term future. I’ve used the requests library a lot, and I always thought that such a high-level library would come with some kind of default timeout … first time I encountered this issue my script had been hanging for days. I think even a very high default timeout, like 24h, is better than infinite.
I recently found that there is a semgrep rule to check that a timeout is set for these calls. If you run this in CI you at least don’t need to remind anyone anymore:
This is just a workaround though, I absolutely agree that default timeouts are essential and this should be fixed in the library here.
Making the timeout option have a default value, and setting that value to a very high number would not cause any breaking changes. I feel like any hesitation here about setting a default on a non-major release version is for fear of breaking changes. I propose a default value of 1000 seconds for the next minor release and lowering that default to 300 seconds for the next major release.
I added a lint for my code for this but would much prefer there is a default timeout in the library
That’s fair enough, @Lukasa. What about making the Timeout section more explicit then? Right now it says:
And then goes on with a long, somewhat complicated warning. Could the first part say:
Something like that?
I think that we’ve addressed this through documentation. I don’t think any of the core team is going to change the API to require a timeout and I think we’ve done our diligence here.
It’s extremely unclear to me what “wrong” means here. They aren’t wrong: they work, as designed. The introductory examples are all clearly typed into the Python interactive interpreter, so they have a timeout: the user sitting there can hit ^C anytime they like.
I’d welcome enhancements to the “timeouts” section of the documentation to be more emphatic, that certainly does seem reasonable. But I don’t think we need to go through and add the
timeoutarg.That sounds great, and like it might break the dam for some features like this one that need more consensus to move forward. It’ll let us make a decision and then warn people of coming things before those things come.
I guess a simple warning could say:
Does that look about right?
I pushed for documentation updates too and I think the docs are pretty good now, but I think the overall goodness of requests creates a blind spot for developers. It’s very easy to get started with requests, get things working, assume the defaults are good, and then never read more of the documentation.
This issue is now four years old. I’d very much support adding a warning to current versions of request, since requests 2.0 (and the opportunity to change defaults) may never come.
That’s not how Requests timeouts work. They apply per socket I/O syscall, not over the total execution time. That is, they govern how much each call to
read()blocks. Large file downloads shouldn’t be an issue.And as a user of software, I’d like to not have yet more mysterious inputs to programs, for what that’s worth. Nobody’s going to bother documenting that their daemon uses
requestsand therefore has behavior that depends on this environment variable.👍 for
session = Session(timeout=...). Would you merge a patch?Has there been any progress on this?
@mlissner To add to your statement, I think that the current situation kind-of contradicts the stated project goal “HTTP Requests for Humans”.
Humans need sensible defaults. Not just in graphical user interfaces, but also in command-line user interfaces and software libraries. And “no timeout” is almost never a sensible default. I was very surprised to read about that trap in the manual. This note should be more prominent in the manual, perhaps being the first note in the Quickstart (rather than the last note), because this is surprising and not what you expect from a library that performs network calls.
(Even better than a more prominent place in the manual, of course, would be to fix this once and for all.)
@sigmavirus24 @chris-martin I fully agree, and I don’t need timeout environment variables, either. All I need is either a sane default timeout, or making the timeout argument mandatory. No more, no less.
Environment overrides are already causing us to pull our hair out. I’d like to not add anymore, for what that’s worth.
We could consider making a default in 3.0, but I don’t know what it would be… 120s?
Idk, I like our current design. You really shouldn’t be hitting the internet without proper timeouts in place ever in production. Any sane engineer knows that — it’s not Requests’ job to do your job for you.
Thanks @sigmavirus24, that’s helpful. 3.0 looks like it may never come though — it’s already been in the works for at least five years and the milestone has a lot of open issues that look hairy. (Sorry I didn’t notice this before when laying out options above.)
Is there any way we can re-think this as a bug fix instead of a new feature? There are no API changes here, and I think for the vast majority of programs using requests, a default timeout will fix issues instead of creating new ones. If it’s in a minor release, folks can very easily upgrade their code to add really long timeouts if that’s really what they need (I think this would be very rare).
Am I wrong that we still haven’t heard an argument about how a default timeout would cause actual real-world problems? I think the closest we come is:
Yes, this could impact people’s stuff, but I think in most cases this would change a silently hung program to one that crashes, which is something that’s really good. A bug fix.
This could be framed in release notes as:
I’m really sorry to be pushing on this so much. It’s not my usual thing. I just don’t want to wait (literally?) forever for this fix to land, and I think it’s so safe we should find a way to do it. Sorry, this feels like the kind of thing I see in release notes all the time.
Would a 3.0 release with a default timeout change, and then changing what’s currently designated for 3.x to be 4.x instead be a path forward?
I’ve seen this issue crop up over the years, with various code bases. It would be very nice to have a default timeout to prevent programs that wait forever due to a missing timeout.
Requests has no logging of its own any longer. We can definitely start logging things though (like this for example) and add a deprecation warning which might help raise hackles more in preparation for 3.0 (where people might see the warning in their CI environments)
Ok, well I created a pull request for the proposed change, and it was approved by the CI. It’s all in @nateprewitt’s hands now to merge it and release it. Fingers Crossed
@vog there hasn’t been any progress on this. We’re planning it for 3.0.
Those seem reasonable to me. I just just think this should be a connect-only timeout, not a download timeout.
I think the point that’s generally acknowledged by this bug is that the design was wrong. There’s a general consensus here that adding a default timeout or requiring it as an argument is a good idea. The docs could address that, and that seems like a simple step to take until this issue is resolved in the next major version.
What happens in practice is that people grab the examples from the docs, don’t read the timeout section carefully (or don’t understand its implications – I didn’t for years until I got bit), and then wind up with programs that can hang forever. I completely agree with @chris-martin that until this issue is fixed, all examples in the docs should provide the timeout argument. Otherwise, we’re providing examples that can (and probably will) break your programs.
I’m (as a user) slightly against this proposal. There are following caveats:
What I’ve missed?
This is an entirely reasonable suggestion.
Honestly, I’m not averse to doing it: there are definitely worse things to do than this. I’d be open to providing a default timeout. However, I don’t think we can do it until 3.0.0.
Of course @kennethreitz as keeper of the spirit of requests has got the final say on this, and should definitely weigh in.