python-hyperscan: Memory leak in Database object when compiling, dumping and loading.

Thanks for the great library. We discovered a memory leak when a Database object is compiled, dumped, and loaded multiple times. The leak gets worse as the database grows larger. Even creating a new Database object on every iteration still leaks.

Compiling one pattern in the same Database object:

import hyperscan as hs

db = hs.Database(mode=hs.HS_MODE_BLOCK)
for i in range(100000):
    db.compile(expressions=[b'test'], ids=[1], flags=[hs.HS_FLAG_ALLOWEMPTY])

[Image: case1]

Creating a new Database object for every compile slows the leak down:

import hyperscan as hs

for i in range(100000):
    db = hs.Database(mode=hs.HS_MODE_BLOCK)
    db.compile(expressions=[b'test'], ids=[1], flags=[hs.HS_FLAG_ALLOWEMPTY])

[Image: case2]

But dumping the Database object (and loading it as well) makes memory usage grow faster:

import hyperscan as hs

db = hs.Database(mode=hs.HS_MODE_BLOCK)
for i in range(100000):
    db.compile(expressions=[b'test'], ids=[1], flags=[hs.HS_FLAG_ALLOWEMPTY])
    b = hs.dumpb(db)

[Image: case3]
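To put numbers on the three cases above, a small harness along these lines could be used. It is only a sketch: `peak_rss` and `rss_growth` are hypothetical helper names, not part of hyperscan, and it relies on the Unix-only `resource` module from the standard library. Peak RSS is used instead of `tracemalloc` because a leak inside a C extension would not necessarily show up in Python-level allocation tracking.

```python
import resource


def peak_rss():
    """Return this process's peak resident set size.

    The unit is platform dependent: kilobytes on Linux, bytes on macOS.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth(step, iterations=1000):
    """Call step() `iterations` times and return how much peak RSS grew.

    Peak RSS never decreases, so the result is always >= 0. A value that
    keeps climbing as `iterations` increases suggests memory is not being
    released between calls.
    """
    before = peak_rss()
    for _ in range(iterations):
        step()
    return peak_rss() - before
```

For the first case, `step` could be something like `lambda: db.compile(expressions=[b'test'], ids=[1], flags=[hs.HS_FLAG_ALLOWEMPTY])`; running `rss_growth` with a few different iteration counts gives a rough comparison between the cases.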

I tried to dig into the C code to find the problem, but I do not have enough expertise to solve the issue. Can someone help or point me in the right direction? Thanks!

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 15 (7 by maintainers)

Most upvoted comments

Ahhhh, looks like the leak is now with scratch not being freed. Here’s a quick workaround while I push and release a fix:

import hyperscan as hs

db = hs.Database(mode=hs.HS_MODE_BLOCK)

db.compile(expressions=[b'test'], ids=[1], flags=[hs.HS_FLAG_ALLOWEMPTY])
b = hs.dumpb(db)

for i in range(100000):
    del db.scratch  # dealloc scratch anytime you decref db
    db = hs.loadb(b)
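If the workaround is needed in several places, it could be wrapped in a small helper. This is a sketch only: `reload_database` and its `loader` parameter are hypothetical names introduced here, and it assumes that deleting `scratch` before dropping the old object is all that is required, as the comment above suggests. The `loader` argument defaults to `hyperscan.loadb` and exists so the helper can be exercised without the library installed.

```python
def reload_database(old_db, blob, loader=None):
    """Release old_db's scratch space, then return a database loaded from blob.

    old_db may be None (nothing to release). loader defaults to
    hyperscan.loadb and is injectable for testing.
    """
    if loader is None:
        import hyperscan  # imported lazily; only needed for the default loader
        loader = hyperscan.loadb
    if old_db is not None:
        try:
            del old_db.scratch  # the workaround: free scratch before dropping old_db
        except AttributeError:
            pass  # no scratch attribute present; nothing to release
    return loader(blob)
```

The loop from the workaround would then read `db = reload_database(db, b)`. On releases that include the fix, the explicit `del` should no longer be necessary.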

This solves the issues, thanks a lot!

You are a wizard @darvid 😄

Hi @darvid, I was just testing this with the latest version available from PyPI, which seems to be v0.3.2. So I wanted to let you know that v0.3.3 is not published on PyPI yet 😃