requests: Possible Memory Leak

I have a very simple program that periodically retrieves an image from an IP camera. I’ve noticed that the working set of this program grows monotonically. I’ve written a small program that reproduces the issue.

import requests
from memory_profiler import profile


@profile
def lol():
    print "sending request"
    r = requests.get('http://cachefly.cachefly.net/10mb.test')
    print "reading.."
    with open("test.dat", "wb") as f:
        f.write(r.content)
    print "Finished..."

if __name__ == "__main__":
    for i in xrange(100):
        print "Iteration", i
        lol()

The memory usage is printed at the end of each iteration. Here is the sample output.

** Iteration 0 **

Iteration 0
sending request
reading..
Finished...
Filename: test.py

Line #    Mem usage    Increment   Line Contents
================================================
     5     12.5 MiB      0.0 MiB   @profile
     6                             def lol():
     7     12.5 MiB      0.0 MiB       print "sending request"
     8     35.6 MiB     23.1 MiB       r = requests.get('http://cachefly.cachefly.net/10mb.test')
     9     35.6 MiB      0.0 MiB       print "reading.."
    10     35.6 MiB      0.0 MiB       with open("test.dat", "wb") as f:
    11     35.6 MiB      0.0 MiB           f.write(r.content)
    12     35.6 MiB      0.0 MiB       print "Finished..."

** Iteration 1 **

Iteration 1
sending request
reading..
Finished...
Filename: test.py

Line #    Mem usage    Increment   Line Contents
================================================
     5     35.6 MiB      0.0 MiB   @profile
     6                             def lol():
     7     35.6 MiB      0.0 MiB       print "sending request"
     8     36.3 MiB      0.7 MiB       r = requests.get('http://cachefly.cachefly.net/10mb.test')
     9     36.3 MiB      0.0 MiB       print "reading.."
    10     36.3 MiB      0.0 MiB       with open("test.dat", "wb") as f:
    11     36.3 MiB      0.0 MiB           f.write(r.content)
    12     36.3 MiB      0.0 MiB       print "Finished..."

The memory usage does not grow with every iteration, but it continues to creep up, and requests.get is the line responsible for the increase.
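
As an aside, one variant worth testing is streaming the body to disk in chunks instead of buffering it all through r.content. A minimal sketch, assuming a requests version that supports stream=True (the 8 KiB chunk size is an arbitrary choice):

import requests

def fetch(url, path):
    # stream=True defers reading the body; iter_content then pulls it in
    # fixed-size chunks so the full 10 MB never sits in memory at once
    r = requests.get(url, stream=True)
    with open(path, "wb") as f:
        for chunk in r.iter_content(8192):  # 8 KiB chunks, arbitrary size
            f.write(chunk)
    r.close()  # hand the connection back to the pool

fetch('http://cachefly.cachefly.net/10mb.test', 'test.dat')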

By ** Iteration 99 **, this is what the memory profile looks like.

Iteration 99
sending request
reading..
Finished...
Filename: test.py

Line #    Mem usage    Increment   Line Contents
================================================
     5     40.7 MiB      0.0 MiB   @profile
     6                             def lol():
     7     40.7 MiB      0.0 MiB       print "sending request"
     8     40.7 MiB      0.0 MiB       r = requests.get('http://cachefly.cachefly.net/10mb.test')
     9     40.7 MiB      0.0 MiB       print "reading.."
    10     40.7 MiB      0.0 MiB       with open("test.dat", "wb") as f:
    11     40.7 MiB      0.0 MiB           f.write(r.content)
    12     40.7 MiB      0.0 MiB       print "Finished..."

Memory usage doesn’t drop unless the program is terminated.

Is there a bug or is it user error?

About this issue

  • State: closed
  • Created 11 years ago
  • Comments: 54 (36 by maintainers)

Most upvoted comments

So, what is the solution to this issue?

There have been no further complaints about this, and I think we’ve done our best here. I’m happy to reopen it and reinvestigate if necessary.

I’m sitting in an airport right now and I’ll be on a plane for several hours soon, so I probably won’t be able to get to this tonight, or possibly not until later this week (if not next weekend/week). So far, though, I tried using release_conn on the HTTPResponse we receive back. I checked with gc.get_referents what the Response object holds that may be failing to be GC’d. It has the original httplib HTTPResponse (stored as _original_response), and that (from what get_referents reported) only has an email Message (for the headers); everything else is a string, a dictionary, or a list. If the leak is sockets, I don’t see where they would escape garbage collection.
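
Roughly the kind of inspection I mean (a sketch against the repro script’s URL, not the exact code I ran):

import gc
import requests

r = requests.get('http://cachefly.cachefly.net/10mb.test')

# Everything the Response object holds a direct reference to
for obj in gc.get_referents(r):
    print type(obj), repr(obj)[:80]

# The underlying httplib response stored as _original_response
for obj in gc.get_referents(r.raw._original_response):
    print type(obj), repr(obj)[:80]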

Also, using Session#close doesn’t help (I made the code use sessions instead of the functional API first), even though that should clear the PoolManagers, which in turn clear the connection pools. The other interesting thing was that PoolManager#connection_from_url would add ~0.8 MB (give or take 0.1 MB) the first few times it was called. That accounts for ~3 MB, but the rest comes from conn.urlopen in HTTPAdapter#send. The bizarre thing is that gc.garbage contains some odd elements if you use gc.set_debug(gc.DEBUG_LEAK): something like [[[...], [...], [...], None], [[...], [...], [...], None], [[...], [...], [...], None], [[...], [...], [...], None]], and, as you’d expect, gc.garbage[0] is gc.garbage[0][0], so that information is absolutely useless. I’ll have to experiment with objgraph when I get the chance.
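
For reference, the session variant with the GC debugging enabled looks roughly like this (a sketch, not the exact script I ran; the iteration count is arbitrary):

import gc
import requests

# DEBUG_LEAK makes the collector keep everything it finds unreachable
# in gc.garbage instead of freeing it
gc.set_debug(gc.DEBUG_LEAK)

s = requests.Session()
for i in xrange(10):
    r = s.get('http://cachefly.cachefly.net/10mb.test')
    data = r.content
s.close()  # should tear down the PoolManager and its connection pools

gc.collect()
print len(gc.garbage), "objects in gc.garbage"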

So I took @mhjohnson’s previous suggestion and used objgraph to figure out where the other reference was, but objgraph can’t seem to find it. I added:

objgraph.show_backrefs([r.raw._original_response], filename='requests.png')

to the script two comments above and got the resulting graph (requests.png), which only shows two references to the object. I wonder if there’s something up with how sys.getrefcount works that makes it unreliable here.
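
For anyone who wants to reproduce the cross-check between objgraph and the refcount, something along these lines should do it (the output filename is arbitrary):

import sys
import objgraph
import requests

r = requests.get('http://cachefly.cachefly.net/10mb.test')
orig = r.raw._original_response

# getrefcount counts the temporary reference made by its own argument,
# so subtract one for the "real" count
print sys.getrefcount(orig) - 1, "references to _original_response"

# Draw everything that refers back to the object
objgraph.show_backrefs([orig], filename='requests.png')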