pywt: Memory leak

It looks like pywt has a memory leak. When I process more than 200K small images (<10 KB each), the process consumes all 16 GB of my memory and then stops (or slows down).

  • Code

whash() function from imagehash lib: https://github.com/JohannesBuchner/imagehash/pull/23/commits/da9386da76d3d32f6ba41160c7ef284102df36fc

Reproduction code snippet:

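import os

import imagehash
import PIL.Image

# filedir and onlyfiles (the image file names in filedir) are defined elsewhere.
res = {}
for filename in onlyfiles:
    fname = os.path.join(filedir, filename)
    try:
        image = PIL.Image.open(fname)
        whash = imagehash.whash(image)
        value = whash.hash.flatten()
        # Pack the hash bits into one integer, keyed by the numeric file name.
        res[int(filename[:-4])] = sum(1 << i for i, b in enumerate(value) if b)
    except Exception as e:
        print("Error. File " + fname + ": ", e)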

Btw, the imagehash.phash() function doesn’t use pywt and doesn’t have this issue.
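One way to narrow down where the memory goes is to measure each function in isolation, calling it repeatedly and checking how much memory is still held afterwards. The following is only a minimal sketch using the standard-library tracemalloc module. The helper traced_growth and the stand-in functions leaky and clean are hypothetical names made up for illustration; they are not part of pywt or imagehash.

```python
import gc
import tracemalloc


def traced_growth(func, iterations=1000, warmup=10):
    """Return the net number of bytes still allocated after calling func() repeatedly."""
    # Warm-up calls let one-time caches and lazy imports settle first.
    for _ in range(warmup):
        func()
    gc.collect()
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        func()
    gc.collect()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


# Hypothetical stand-ins for the functions under test.
_leaked = []


def leaky():
    # Keeps a reference to every buffer, so memory grows with each call.
    _leaked.append(bytearray(1024))


def clean():
    # Allocates a buffer that is released as soon as the call returns.
    bytearray(1024)


if __name__ == "__main__":
    print("leaky:", traced_growth(leaky, iterations=200), "bytes")
    print("clean:", traced_growth(clean, iterations=200), "bytes")
```

For the case above, func could be something like lambda: imagehash.whash(image) compared against lambda: imagehash.phash(image) on the same image. Note that tracemalloc only sees allocations made through Python's memory allocators; a leak in C extension code that calls malloc directly would not show up here, so watching the process's resident memory is a useful cross-check.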

  • Data

Image subset from Avito competition: https://www.kaggle.com/c/avito-duplicate-ads-detection

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 19 (7 by maintainers)

Most upvoted comments

That looks like the right place (if you are using pillow instead of the old PIL). I haven’t used it much to be honest. Closing this now, thanks for reporting and following up 😃