python-hyperscan: Memory leak in Database object when compiling, dumping and loading.
Thanks for the great library. We discovered a memory leak when a Database object is compiled, dumped, and loaded multiple times. The leak gets worse as the database grows larger. Even creating a new Database object on every iteration still leaks.
Compiling one pattern in the same Database object:
import hyperscan as hs

db = hs.Database(mode=hs.HS_MODE_BLOCK)
# Recompile the same single pattern into one Database object, over and over.
for i in range(100000):
    db.compile(expressions=[b'test'], ids=[1], flags=[hs.HS_FLAG_ALLOWEMPTY])
Creating a new Database object for every compile slows the leak down:
import hyperscan as hs

# Use a fresh Database object for each compile.
for i in range(100000):
    db = hs.Database(mode=hs.HS_MODE_BLOCK)
    db.compile(expressions=[b'test'], ids=[1], flags=[hs.HS_FLAG_ALLOWEMPTY])
But dumping the Database object (and loading it again) makes memory usage grow faster:
import hyperscan as hs

db = hs.Database(mode=hs.HS_MODE_BLOCK)
# Compile, then serialize the database to bytes on every iteration.
for i in range(100000):
    db.compile(expressions=[b'test'], ids=[1], flags=[hs.HS_FLAG_ALLOWEMPTY])
    b = hs.dumpb(db)
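To put numbers on the three variants above, a small harness along these lines might help. This is only a sketch and not part of python-hyperscan: `peak_rss_kib` and `rss_growth_kib` are hypothetical helper names, and it assumes a Unix system where the standard `resource` module is available. It watches the process's peak RSS instead of using `tracemalloc`, because `tracemalloc` only sees allocations made through Python's own allocators and would likely miss memory held by the C library.

```python
import gc
import resource
import sys


def peak_rss_kib():
    """Return this process's peak resident set size in KiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    return peak // 1024 if sys.platform == "darwin" else peak


def rss_growth_kib(step, iterations=1000):
    """Call step() `iterations` times and return the growth in peak RSS (KiB).

    `step` is any zero-argument callable, for example a function that
    wraps one compile (and optionally one dump) of a Database.
    """
    gc.collect()  # drop collectable Python garbage before measuring
    before = peak_rss_kib()
    for _ in range(iterations):
        step()
    gc.collect()
    # Peak RSS never decreases, so the result is always >= 0.
    return peak_rss_kib() - before
```

To use it, wrap the body of one of the loops above in a function (for instance one that calls `db.compile(...)` and then `hs.dumpb(db)`) and pass that function as `step`. Growth that keeps scaling with `iterations` points at a leak, while a number that levels off suggests ordinary allocator warm-up.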
I tried to dig into the C code to find the problem, but I do not have enough expertise to solve the issue. Can someone help or point me in the right direction? Thanks!
About this issue
- State: closed
- Created 2 years ago
- Reactions: 3
- Comments: 15 (7 by maintainers)
Commits related to this issue
- :bug: fix memory leak (#46), improve error handling (#41), drop support for Py3.6 (committed to darvid/python-hyperscan by darvid 2 years ago)
- fix: fix memory leak in loadb (#46) and minor doc tweaks (committed to darvid/python-hyperscan by darvid a year ago)
ahhhh looks like the leak is with scratch not being freed now. here's a quick workaround while I push and release a fix:
This solves the issues, thanks a lot!
You are a wizard @darvid
Hi @darvid, I was just testing this with the latest version available from PyPI, which seems to be v0.3.2. So I wanted to let you know v0.3.3 is not published on PyPI yet.