requests: get request hangs "in readinto return self._sock.recv_into(b)" without raising exception
Hi,
I’m pretty desperately trying to nail down why a URL unshortener script freezes from time to time.
I’m using requests with several HEAD and GET requests, such as
    with requests.Session() as s:
        r = s.get(url, allow_redirects=True, timeout=self.timeout, headers=self.headers, verify=False)
where self.timeout is 10 seconds and the headers just include a user-agent.
The request is made by a worker in a multiprocessing Pool with 10 workers that is constantly fed a list of URLs via Pool.imap().
On a keyboard interrupt, some of the workers produce the following traceback:
Traceback (most recent call last):
  File "---/dataprocessor.py", line 98, in try_unshorten_expanded_urls
    result = unshorten_expanded_urls(tweet_dataframe_tuple)
  File "---/dataprocessor.py", line 84, in unshorten_expanded_urls
    'unshortened_url': u.unshorten(expanded_urls[i]),
  File "---/unshorten.py", line 162, in unshorten
    verify=False)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/sessions.py", line 480, in get
    return self.request('GET', url, **kwargs)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/sessions.py", line 608, in send
    r.content
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/models.py", line 737, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/models.py", line 660, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 344, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 301, in read
    data = self._fp.read(amt)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/http/client.py", line 433, in read
    n = self.readinto(b)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/http/client.py", line 473, in readinto
    n = self.fp.readinto(b)
  File "/home/ubuntu/miniconda3/envs/TQ/lib/python3.5/socket.py", line 575, in readinto
    return self._sock.recv_into(b)
KeyboardInterrupt
If the function raised an exception, it would be caught inside the function, so it would not block the pool. But it seems that readinto is simply hanging here (for hours). I don’t know whether this is related to #3066, as I am not familiar enough with the protocols or with requests itself, so please excuse me if this is a duplicate or if the error is mine.
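One way to make sure a stuck socket read cannot block a worker for hours is a wall-clock limit enforced with a signal, independent of the timeout passed to requests. The following is a minimal sketch, assuming a Unix system (such as the Ubuntu setup described here) and that the call runs in the main thread of the worker process. The names hard_time_limit, HardTimeout and unshorten_with_limit are hypothetical and not part of requests or of the original script.

```python
import signal
import time
from contextlib import contextmanager


class HardTimeout(Exception):
    """Raised when the wall-clock limit for one call is exceeded."""


@contextmanager
def hard_time_limit(seconds):
    # Unix only; signal handlers can only be installed from the main thread.
    def _on_alarm(signum, frame):
        raise HardTimeout("call exceeded %.1f s" % seconds)

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Always cancel the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


def unshorten_with_limit(unshorten, url, seconds=30):
    # `unshorten` stands in for whatever callable performs the HEAD/GET requests.
    try:
        with hard_time_limit(seconds):
            return unshorten(url)
    except HardTimeout:
        return None  # treat a stuck URL like any other failed URL
```

Because the handler raises an exception, a blocking system call that the signal interrupts is not retried, so the worker gets control back and can move on to the next URL.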
The freeze has now happened several times with the same batch of URLs. I tried to find the exact URL causing it but did not succeed, as the batch contains several thousand. Before each GET, I already use a HEAD request to exclude everything that exceeds a certain content length or does not have a text content type.
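To narrow down which URL in a batch of several thousand triggers the freeze, each worker could log one line before and one line after every call. A start line with no matching end line for the same process ID then points to the stuck URL. This is only a sketch: traced_call and the logger name are hypothetical helpers, and func stands in for the actual unshorten function.

```python
import logging
import os
import time

log = logging.getLogger("unshorten.trace")


def traced_call(func, url):
    # Log before the call so a hang still leaves a record of the URL.
    pid = os.getpid()
    log.info("start pid=%d url=%s", pid, url)
    started = time.monotonic()
    try:
        return func(url)
    finally:
        # Reached on success and on exceptions, but not while the call hangs.
        log.info("end pid=%d url=%s elapsed=%.2fs",
                 pid, url, time.monotonic() - started)
```

For the lines to show up, logging has to be configured in each worker, for example with logging.basicConfig(level=logging.INFO) in a Pool initializer.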
I am running requests 2.10.0 with Python 3.5.1 in a miniconda environment installed via pip on Ubuntu 14.04.
However, great library, hope this helps to make it better.
Cheers!
UPDATE:
Here is the relevant code for the traceback (line 162 in unshorten.py):
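def check_size_and_type(request_object, max_content_length):
    # A missing or malformed Content-Length is treated as unknown.
    try:
        length = int(request_object.headers['content-length'])
    except (KeyError, ValueError):
        length = None
    # A missing Content-Type becomes an empty string and is rejected below.
    try:
        content_type = str(request_object.headers['content-type'])
    except KeyError:
        content_type = ''
    if ((length is None or length < max_content_length) and
            (content_type.startswith('text/html') or
             content_type.startswith('application/xhtml') or
             content_type.startswith('text/xml'))):
        return True
    else:
        return False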
with requests.Session() as s:
    r = s.head(url, allow_redirects=True,
               timeout=self.timeout,
               headers=self.headers,
               verify=False)
    if check_size_and_type(r, self.max_content_length):
        r = s.get(url, allow_redirects=True,
                  timeout=self.timeout,
                  headers=self.headers,
                  verify=False)
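Two details of this code may be relevant. First, check_size_and_type accepts responses without a Content-Length header, so the size of the body fetched by the GET is not actually bounded. Second, the requests documentation states that timeout is not a limit on the entire download; it only applies to connecting and to the wait between bytes received on the socket. Whether either point explains the freeze is not established here. The sketch below shows one way to bound both size and total time by streaming the body; read_capped, fetch_body, BodyTooLarge and DeadlineExceeded are hypothetical names, and session is assumed to be a requests.Session.

```python
import time


class BodyTooLarge(Exception):
    pass


class DeadlineExceeded(Exception):
    pass


def read_capped(chunks, max_bytes, max_seconds):
    # Accumulate chunks, giving up once the size or time budget is used up.
    deadline = time.monotonic() + max_seconds
    body = bytearray()
    for chunk in chunks:
        body.extend(chunk)
        if len(body) > max_bytes:
            raise BodyTooLarge("more than %d bytes" % max_bytes)
        if time.monotonic() > deadline:
            raise DeadlineExceeded("more than %.1f s" % max_seconds)
    return bytes(body)


def fetch_body(session, url, timeout, max_bytes, max_seconds, **kwargs):
    # stream=True defers the body download so it can be read chunk by chunk.
    r = session.get(url, stream=True, timeout=timeout, **kwargs)
    try:
        return read_capped(r.iter_content(8192), max_bytes, max_seconds)
    finally:
        r.close()  # release the connection even if the read was abandoned
```

The deadline is only checked between chunks, so this bounds large or slowly trickling responses but cannot interrupt a single read that never returns.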
About this issue
- State: closed
- Created 8 years ago
- Comments: 20 (6 by maintainers)
I am also facing a similar issue, using Python 3.7.
Code:
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Hello World 1")
        audio = r.listen(source)
        voice_data = r.recognize_google(audio)
        print(voice_data)
Error:
    voice_data = r.recognize_google(audio)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\speech_recognition\__init__.py", line 840, in recognize_google
    response = urlopen(request, timeout=self.operation_timeout)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 543, in _open
    '_open', req)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 1345, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\urllib\request.py", line 1320, in do_open
    r = h.getresponse()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\http\client.py", line 1336, in getresponse
    response.begin()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\http\client.py", line 306, in begin
    version, status, reason = self._read_status()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\http\client.py", line 267, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\socket.py", line 589, in readinto
    return self._sock.recv_into(b)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host