twython: ChunkedEncodingError

Just spotted the following in the logs for a pair of my streamers:

Traceback (most recent call last):
  File "/home/keyz/tweets/tweetstream.py", line 20, in <module>
    stream.statuses.filter(locations=location)
  File "/usr/local/lib/python2.7/dist-packages/twython/streaming/types.py", line 65, in filter
    self.streamer._request(url, 'POST', params=params)
  File "/usr/local/lib/python2.7/dist-packages/twython/streaming/api.py", line 134, in _request
    for line in response.iter_lines(self.chunk_size):
  File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 602, in iter_lines
    decode_unicode=decode_unicode):
  File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 575, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: IncompleteRead(0 bytes read)

The line in question is

for line in response.iter_lines(self.chunk_size)

in https://github.com/ryanmcgrath/twython/blob/master/twython/streaming/api.py

Should there be some catch for this to pass it to on_error, rather than throwing an uncaught exception?

Looks like the exception is a fairly new one, see https://github.com/kennethreitz/requests/pull/1498.

I’d submit a patch, but I’m not sure of the best way to catch this. Putting a try block around the whole loop seems messy.
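One possible shape for such a catch, sketched outside Twython: wrap the line iterator in a small generator that reports the error to a callback and then stops, so the loop body itself stays unchanged. Everything here is an assumption for illustration: `iter_lines_safely` and `flaky_lines` are hypothetical names, the exception class is a local stand-in for `requests.exceptions.ChunkedEncodingError` so the sketch runs offline, and the callback is assumed to take `(status_code, data)` like the streamer's `on_error`.

```python
class ChunkedEncodingError(Exception):
    """Stand-in for requests.exceptions.ChunkedEncodingError (runs offline)."""


def iter_lines_safely(lines, on_error, catch=(ChunkedEncodingError,)):
    """Yield items from `lines`; on a caught error, report it and stop cleanly."""
    iterator = iter(lines)
    while True:
        try:
            line = next(iterator)
        except StopIteration:
            return  # Source finished normally
        except catch as exc:
            on_error(None, str(exc))  # No HTTP status is available at this point
            return
        yield line


def flaky_lines():
    """Hypothetical stream that delivers two lines, then drops the connection."""
    yield b'{"text": "first"}'
    yield b'{"text": "second"}'
    raise ChunkedEncodingError('IncompleteRead(0 bytes read)')


errors = []
received = list(iter_lines_safely(flaky_lines(),
                                  lambda status_code, data: errors.append(data)))
```

With this approach the try block surrounds only the call that fetches the next line, not the whole loop body, which may address the "messy" concern.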

About this issue

  • State: closed
  • Created 11 years ago
  • Comments: 54 (10 by maintainers)

Most upvoted comments

I’ve been able to process slow-flowing streams (i.e., searching for rare keywords) for weeks at a time, collecting tens of thousands of tweets with no problems. I first encountered this particular error when I was doing a test run with very broad/popular search terms, streaming a lot of tweets very fast. I’ve been able to replicate the error pretty consistently by increasing the rate of data streaming (especially on computers with slower processors). From what I can tell, this error is directly related to the Twitter API disconnecting due to queue overload:

A client reads data too slowly. Every streaming connection is backed by a queue of messages to be sent to the client. If this queue grows too large over time, the connection will be closed.

Here’s an example where I recorded stream latency for a particularly fast stream on a particularly slow computer. You’ll notice that stream latency grows to a peak (red points = data collected) and then drops off, resulting in many seconds of lost data:

[Figure: 2014-12-12_latency (stream latency plotted against data collection time)]

Note:

  • x-axis: Data Collection Time, parsed from the tweet’s timestamp
  • y-axis: Latency = actual clock time - tweet’s timestamp

Each of those peaks directly corresponds with a Chunked Encoding Error. When this happens, Twitter’s streaming queue dumps and you start over in real time (if you immediately restart the streamer)… but you lose as many seconds of data as you had latency.
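The latency measure described in the note above can be computed roughly as follows. This is a minimal sketch, assuming the tweet's `created_at` field uses the usual UTC format (for example `Fri Dec 12 10:00:00 +0000 2014`) and that the process runs under an English/C locale so the day and month names parse; `stream_latency` is a hypothetical helper name.

```python
import calendar
import time


def stream_latency(created_at, now=None):
    """Return seconds between a tweet's timestamp and the current clock time."""
    # Assumed format: 'Fri Dec 12 10:00:00 +0000 2014' (always UTC)
    parsed = time.strptime(created_at, '%a %b %d %H:%M:%S +0000 %Y')
    tweet_epoch = calendar.timegm(parsed)  # Interpret the parsed time as UTC
    if now is None:
        now = time.time()
    return now - tweet_epoch


# Fixed "clock" 42 seconds after the tweet, so the example is reproducible
clock = calendar.timegm((2014, 12, 12, 10, 0, 42, 0, 0, 0))
latency = stream_latency('Fri Dec 12 10:00:00 +0000 2014', now=clock)
```

Logging this value for each tweet as it arrives gives an early warning: a steadily climbing latency suggests the client is falling behind.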

If you want to avoid this issue, your best bet is to eliminate extra processes that slow down your ability to retrieve streaming data. Stream the JSON data directly to storage, then use a secondary process to parse it as needed. If you can narrow the filter terms, that would also help to slow the stream of data. Alternately, you could get a dedicated server with more processing power.
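The "store first, parse later" idea might look like the sketch below. It assumes each tweet reaches the collector as a dictionary and is written as one JSON object per line; the function names and the `tweets.jsonl` file name are hypothetical, and the demo writes to a temporary directory.

```python
import json
import os
import tempfile


def store_tweet(data, out_file):
    """Collector side: write one JSON object per line and do nothing else."""
    out_file.write(json.dumps(data) + '\n')


def read_stored_tweets(path):
    """Parser side: run later, or in a separate process, at its own pace."""
    with open(path) as in_file:
        for line in in_file:
            if line.strip():
                yield json.loads(line)


path = os.path.join(tempfile.mkdtemp(), 'tweets.jsonl')
with open(path, 'a') as out_file:
    for tweet in ({'id': 1, 'text': 'foo bar'}, {'id': 2, 'text': 'foobar'}):
        store_tweet(tweet, out_file)

texts = [tweet['text'] for tweet in read_stored_tweets(path)]
```

Keeping the collector this small means the only work done per tweet while the connection is open is a single file write.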

If you’re not too worried about data loss, this is a crude solution that got me back up and running quickly. I altered my own code instead of updating the underlying Twython code. A similar solution was mentioned early on in the thread, but I didn’t see any example code for it. Enclose the stream.statuses.filter() call in a while loop with an exception handler, like this:

(Example works in Python version 2.7)

while True:  # Endless loop: personalize to suit your own purposes
    try:
        stream.statuses.filter(track='foo bar,foobar,more search strings here')
    except Exception as e:  # Unlike a bare except, this lets Ctrl-C stop the loop
        #print 'ERROR:', repr(e)  # Print exception info (optional)
        continue

Note: This affects the handling of other types of exceptions as well, so use with care.
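One possible refinement is to restart only on the error in question and let everything else propagate. The sketch below is an assumption-laden illustration, not Twython code: `run_with_restarts` and `fake_stream` are hypothetical names, and the exception class is a local stand-in for `requests.exceptions.ChunkedEncodingError` so the example runs without a network connection.

```python
class ChunkedEncodingError(Exception):
    """Stand-in for requests.exceptions.ChunkedEncodingError (runs offline)."""


def run_with_restarts(start_stream, retry_on=(ChunkedEncodingError,),
                      max_restarts=None):
    """Call start_stream() until it returns; restart only on `retry_on` errors."""
    restarts = 0
    while True:
        try:
            start_stream()
            return restarts  # Stream ended without an error
        except retry_on:
            restarts += 1
            if max_restarts is not None and restarts > max_restarts:
                raise  # Give up after too many consecutive failures


attempts = []


def fake_stream():
    """Hypothetical stream that drops twice, then finishes normally."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ChunkedEncodingError('IncompleteRead(0 bytes read)')


restarts = run_with_restarts(fake_stream)
```

With the real libraries, the stand-in class would be replaced by the import from `requests.exceptions`, and `start_stream` would be a small function that calls `stream.statuses.filter(...)`.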

Recommendations and refinements welcome. Happy streaming!

I have the same error. Do you have a solution, please?