density: Problems with reusing a density_context multiple times

I ran a profiling test which showed that calloc was the top CPU consumer. Looking at density_allocate_context, I found this line:

memset(context->dictionary, 0, context->dictionary_size);

(It turned out that gcc had optimized the combination of x = malloc(y) followed by memset(x, 0, y) into a single calloc call.)

So the problem is that every time I call the compress or decompress routine, density must clear (1 << 16) * (2 + 1) * 4 = 786432 bytes. That is quite a burden on any machine.
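For illustration, here is a minimal sketch of that allocation pattern (the function name is hypothetical; the real density_allocate_context does more than this):

#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for density_allocate_context's dictionary setup.
 * With optimization enabled, gcc fuses malloc + full-size memset(0)
 * into a single calloc call, which is why calloc tops the profile. */
#define DICTIONARY_SIZE ((size_t)(1 << 16) * (2 + 1) * 4)  /* 786432 bytes */

void *allocate_dictionary(void) {
    void *dictionary = malloc(DICTIONARY_SIZE);
    if (dictionary == NULL)
        return NULL;
    memset(dictionary, 0, DICTIONARY_SIZE);  /* fused into calloc by gcc */
    return dictionary;
}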

I see there is a density_compress_with_context function which seems to allow a pre-allocated context to be reused multiple times, but since the dictionary inside the context is modified by each use, the resulting compressed buffer can no longer be decompressed correctly.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 23 (14 by maintainers)

Most upvoted comments

Hi Guillaume, I think I have another solution 😃 Instead of shrinking the dictionary size, you can use flags to denote which dictionary entries are valid and which are possibly dirty. A bit flag consumes 32x less space than a 32-bit value, so you would only need to clear 24 KiB of flags at the beginning. The drawback is, of course, the overhead of flag management. But there's a solution for that too: make the dirty (i.e. flag-based) mode adaptive, and after processing e.g. 20 KB of data, zero all unused dictionary entries and switch to the original mode that doesn't use flags.
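A minimal sketch of what that flag-based dictionary could look like (all names here are hypothetical, and only a single 64 K-entry table of 32-bit values is shown; covering the full 786432-byte dictionary would need the 24 KiB of flags mentioned above):

#include <stdint.h>
#include <string.h>

#define ENTRY_COUNT (1u << 16)

/* One validity bit per 32-bit entry: resetting the context only clears
 * the 8 KiB bitmap of this table instead of its 256 KiB of entries. */
typedef struct {
    uint32_t entries[ENTRY_COUNT];
    uint64_t valid[ENTRY_COUNT / 64];  /* 65536 bits = 8 KiB */
} flagged_dictionary;

static void dictionary_reset(flagged_dictionary *d) {
    memset(d->valid, 0, sizeof(d->valid));  /* cheap compared to the entries */
}

static uint32_t dictionary_get(const flagged_dictionary *d, uint16_t hash) {
    /* An unflagged entry is treated as empty even if it holds stale data. */
    if (d->valid[hash >> 6] & (1ull << (hash & 63)))
        return d->entries[hash];
    return 0;
}

static void dictionary_set(flagged_dictionary *d, uint16_t hash, uint32_t value) {
    d->entries[hash] = value;
    d->valid[hash >> 6] |= 1ull << (hash & 63);
}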

I tried the approach of allocating a huge chunk of memory, using it as an array of 768 KB blocks, managing it with a bitmap, and clearing the dirty blocks on another thread.

In the case of InnoDB page compression on a heavily loaded server, I had to allocate over 4 GB of memory to keep up with the pace of memory demand, with the zeroing thread eating up 30% of a core.
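For reference, a rough sketch of the pool I described (hypothetical names, a fixed 64-block pool for brevity where the real deployment used thousands of blocks, and a spinning worker where real code would block on a condition variable):

#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE ((size_t)768 * 1024)
#define BLOCK_COUNT 64  /* the real pool was far larger */

static unsigned char *pool;                 /* BLOCK_COUNT * BLOCK_SIZE bytes */
static atomic_uint_least64_t dirty_bitmap;  /* bit i set => block i needs zeroing */

/* Background worker: re-zeroes dirty blocks so callers can always grab a
 * clean 768 KB dictionary block without paying for the memset themselves. */
static void *zeroing_thread(void *arg) {
    (void)arg;
    for (;;) {
        uint64_t dirty = atomic_load(&dirty_bitmap);
        for (int i = 0; i < BLOCK_COUNT; i++) {
            if (dirty & (1ull << i)) {
                memset(pool + (size_t)i * BLOCK_SIZE, 0, BLOCK_SIZE);
                atomic_fetch_and(&dirty_bitmap, ~(1ull << i));
            }
        }
        sched_yield();  /* real code would block on a condition variable */
    }
    return NULL;
}

/* Caller marks a block dirty after using it as a compression dictionary. */
static void release_block(int i) {
    atomic_fetch_or(&dirty_bitmap, 1ull << i);
}

static int pool_init(void) {
    pool = calloc(BLOCK_COUNT, BLOCK_SIZE);  /* initially all clean */
    if (pool == NULL)
        return -1;
    pthread_t worker;
    return pthread_create(&worker, NULL, zeroing_thread, NULL);
}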

Perhaps reducing the dictionary size is the final solution.