libelektra: multiresolver: crash when used with cache

src/plugins/resolver/resolver.c:1175 seems to cause a crash in some situations.

I tried to reproduce the crash with the following script, but it does not trigger it:

# create a new folder to not mess up with existing data
mkdir x
cd x

# create two mountpoints
kdb mount `pwd`/csv system/tests/csv/lists/cur csvstorage header=colname,columns/index=student/id
kdb mount -R multifile -c storage="ini",pattern="*/*",resolver="resolver" `pwd`/multi system/tests/multi

# create a csv file
echo "student/id,ue/5/kreuzerl" >> csv
echo "01234567,X" >> csv

# create a multiresolver directory
mkdir -p multi/pool
cd multi/pool
echo "[]" >> 01234567 >> 01234568 >> 01234569
echo "[student]" >> 01234567 >> 01234568 >> 01234569
echo "id = 01234567" >> 01234567
echo "[ue/5]" >> 01234567

# create caches
kdb ls system/tests > /dev/null
kdb ls system/tests/multi/pool > /dev/null
kdb ls system/tests/csv/lists/cur > /dev/null

# now do something directly on the files
rm 01234569
touch 01234566
echo "kreuzerl = O" >> 01234567
echo "[something]" >> 01234567 >> 01234568
echo ""  >> 01234567 >> 01234568
echo ""  >> 01234568

# trigger
kdb cp -rf system/tests/csv/lists/cur system/tests/multi/pool

# debug
kdb export system/tests mini
tail *

kdb umount system/tests/csv/lists/cur
kdb umount system/tests/multi
cd ../../..
rm -r x
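
A side note on the chained `>>` redirections used in the script above: in a POSIX shell such as bash or dash, all listed files are opened (and created if missing), but only the last one receives the output. zsh with its MULTIOS option writes to every file instead. Which shell was used for the report is not stated, so the following is only a minimal sketch of the difference, with `tee -a` as one portable way to append to all files. The file names `f1`..`f3` and `g1`..`g3` are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare chained ">>" redirections with "tee -a" in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Chained redirections: a POSIX shell opens all three files,
# but only the last one (f3) receives the output.
echo "[student]" >> f1 >> f2 >> f3
if [ -s f1 ]; then chained_first=nonempty; else chained_first=empty; fi
chained_last=$(cat f3)

# tee -a appends the same line to every listed file.
echo "[student]" | tee -a g1 g2 g3 > /dev/null
tee_first=$(cat g1)
tee_last=$(cat g3)

echo "chained: f1 is $chained_first, f3 contains: $chained_last"
echo "tee -a:  g1 contains: $tee_first, g3 contains: $tee_last"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

If the reproduction is run under bash, the first files in each chain therefore stay empty, which may or may not matter for triggering the crash.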

This second attempt did not reproduce the crash either:

# create a new folder to not mess up with existing data
mkdir x
cd x

# create two mountpoints
kdb mount `pwd`/csv system/tests/csv csvstorage header=colname,columns/index=sec/somekey 
kdb mount -R multifile -c storage="ini",pattern="*",resolver="resolver" `pwd`/multi system/tests/multi

# create a csv file
echo "sec/somekey,othersec/deep/otherkey" >> csv
echo "a,data2a" >> csv
# echo "b,data2b" >> csv # but do not write in the other

# create a multiresolver directory
mkdir multi
cd multi
echo "[sec]" >> a >> b
echo "somekey = a" >> a
echo "somekey = b" >> b
echo "" >> a >> b
echo "" >> a >> b

kdb cp -rf system/tests/csv system/tests/multi

kdb umount system/tests/csv
kdb umount system/tests/multi

The problem seems to be that elektraUnlinkFile receives an invalid pointer, filename=0x2 <error: Cannot access memory at address 0x2>, which is most likely set wrongly by the multiresolver:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f902d39542a in __GI_abort () at abort.c:89
#2  0x0000564e84a716fc in catchSignal (signum=<optimized out>) at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/tools/kdb/main.cpp:110
#3  <signal handler called>
#4  strlen () at ../sysdeps/x86_64/strlen.S:106
#5  0x00007f902d3a9da8 in _IO_vfprintf_internal (s=s@entry=0x7fffc1371d80, format=<optimized out>, format@entry=0x7f902c448ddd "the file \"%s\" because of \"%s\"", ap=ap@entry=0x7fffc1371f48) at vfprintf.c:1637
#6  0x00007f902d457cf6 in ___vsnprintf_chk (s=0x564e85b9e3b0 "the file \"o-\220\177", maxlen=<optimized out>, maxlen@entry=512, flags=flags@entry=1, slen=slen@entry=18446744073709551615, 
    format=format@entry=0x7f902c448ddd "the file \"%s\" because of \"%s\"", args=args@entry=0x7fffc1371f48) at vsnprintf_chk.c:63
#7  0x00007f902dc9cd34 in vsnprintf (__ap=0x7fffc1371f48, __fmt=0x7f902c448ddd "the file \"%s\" because of \"%s\"", __n=512, __s=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/stdio2.h:77
#8  elektraVFormat (format=format@entry=0x7f902c448ddd "the file \"%s\" because of \"%s\"", arg_list=arg_list@entry=0x7fffc1371f48)
    at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/libs/elektra/internal.c:430
#9  0x00007f902c43f3bb in elektraAddWarningf36 (warningKey=warningKey@entry=0x564e85b09330, reason=0x7f902c448ddd "the file \"%s\" because of \"%s\"", 
    file=0x7f902c448578 "/home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/plugins/resolver/resolver.c", line=0x7f902c448dd8 "1175", line=0x7f902c448dd8 "1175", 
    file=0x7f902c448578 "/home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/plugins/resolver/resolver.c", reason=0x7f902c448ddd "the file \"%s\" because of \"%s\"")
    at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/obj-x86_64-linux-gnu/src/include/kdberrors.h:3458
#10 0x00007f902c43f4a4 in elektraUnlinkFile (filename=0x2 <error: Cannot access memory at address 0x2>, parentKey=parentKey@entry=0x564e85b09330)
    at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/plugins/resolver/resolver.c:1175
#11 0x00007f902c440cc5 in libelektra_resolver_fm_hpu_b_fm_hpu_b_LTX_elektraPluginerror (handle=<optimized out>, r=<optimized out>, parentKey=0x564e85b09330)
    at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/plugins/resolver/resolver.c:1191
#12 0x00007f9029b7cb2d in elektraMultifileError (handle=0x564e85a68a10, returned=0x564e85b80490, parentKey=0x564e85b09330)
    at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/plugins/multifile/multifile.c:900
#13 0x00007f902deb2bf6 in elektraSetRollback (parentKey=0x564e85b09330, split=0x564e85b7b520) at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/libs/elektra/kdb.c:1331
#14 kdbSet (handle=0x564e85a3dce0, ks=0x564e85b0c0c0, parentKey=0x564e85b09330) at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/libs/elektra/kdb.c:1564
#15 0x0000564e84a3fc59 in kdb::KDB::set (parentKey=..., returned=..., this=0x564e85a3dc78)
    at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/bindings/cpp/include/kdb.hpp:229
#16 CpCommand::execute (this=0x564e85a3dc70, cl=...) at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/tools/kdb/cp.cpp:111
#17 0x0000564e84a23659 in main (argc=<optimized out>, argv=0x7fffc1372bd8) at /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA@2/libelektra/src/tools/kdb/main.cpp:198

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 24 (24 by maintainers)

Most upvoted comments

No. To be honest, I didn’t think about such problems at all when I started working on the cache. In general, it sounds like you’re asking about the halting problem, but one can always “do some testing” and see what happens.

If the problems are about internal state, then there is not much you can do. With @vLesk's changes we will hopefully get all plugins, including resolvers, stateless.

I know that the size is larger. The cache is the same as the in-memory data structure. Unused parts of the keyset array etc. are written to disk as-is.

I think the size is quite good. Obviously there are many KeySets with only a few elements (as INI adds metadata to every key).

It would still be nice to know why/where it crashes.

I also tested with multiresolver+ni, the crashes also appear then. So the bug seems to be in multiresolver. I get output like:

 Sorry, 98 warnings were issued ;(
        Sorry, module resolver issued the warning 36:
        could not unlink file: the file "L�����H�����H�=�1" because of "No such file or directory"

This looks like memory corruption. Running everything with valgrind, I get the following output:

==10949== Memcheck, a memory error detector
==10949== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==10949== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==10949== Command: kdb cp -rf system/lehre/ep2/lists/adhoc/di11a system/lehre/ep2/students/pool
==10949== 
==10949== Warning: invalid file descriptor 1031 in syscall open()
==10949== Invalid read of size 4
==10949==    at 0x6F31CB1: libelektra_resolver_fm_hpu_b_fm_hpu_b_LTX_elektraPluginerror (resolver.c:1184)
==10949==    by 0x97F8B2C: elektraMultifileError (multifile.c:906)
==10949==    by 0x50C1BF5: elektraSetRollback (kdb.c:1331)
==10949==    by 0x50C1BF5: kdbSet (kdb.c:1564)
==10949==    by 0x16A838: set (kdb.hpp:229)
==10949==    by 0x16A838: CpCommand::execute(Cmdline const&) (cp.cpp:111)
==10949==    by 0x14DD63: main (main.cpp:198)
==10949==  Address 0x1140ade0 is 32 bytes inside a block of size 82 free'd
==10949==    at 0x4C2CDDB: free (vg_replace_malloc.c:530)
==10949==    by 0x54D7634: keyClear (key.c:515)
==10949==    by 0x54D77E0: keyDel (key.c:463)
==10949==    by 0x97F8D7D: flagUpdateBackends (multifile.c:842)
==10949==    by 0x97F8D7D: elektraMultifileSet (multifile.c:854)
==10949==    by 0x50C0E20: elektraSetPrepare (kdb.c:1210)
==10949==    by 0x50C0E20: kdbSet (kdb.c:1513)
==10949==    by 0x16A838: set (kdb.hpp:229)
==10949==    by 0x16A838: CpCommand::execute(Cmdline const&) (cp.cpp:111)
==10949==    by 0x14DD63: main (main.cpp:198)
==10949==  Block was alloc'd at
==10949==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==10949==    by 0x54D6C13: elektraRealloc (internal.c:238)
==10949==    by 0x54D8F80: keyAddName (keyname.c:987)
==10949==    by 0x54D9190: elektraKeySetName (keyname.c:572)
==10949==    by 0x54D7DC3: keyVInit (keyhelpers.c:344)
==10949==    by 0x54D74E4: keyVNew (key.c:215)
==10949==    by 0x54D758D: keyNew (key.c:197)
==10949==    by 0x97F8DAF: flagUpdateBackends (multifile.c:824)
==10949==    by 0x97F8DAF: elektraMultifileSet (multifile.c:854)
==10949==    by 0x50C0E20: elektraSetPrepare (kdb.c:1210)
==10949==    by 0x50C0E20: kdbSet (kdb.c:1513)
==10949==    by 0x16A838: set (kdb.hpp:229)
==10949==    by 0x16A838: CpCommand::execute(Cmdline const&) (cp.cpp:111)
==10949==    by 0x14DD63: main (main.cpp:198)
... and much much more (1.8MB)

I tried to increase the limit on the number of open file descriptors, but this does not seem to have any influence. So maybe the multiresolver remembered (via the cache) some file descriptor wrongly?
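
For reference, a minimal sketch of how the descriptor limit can be inspected and raised for a single command. It assumes a Linux system with `/proc` mounted. The value 4096 and the commented-out `kdb cp` invocation are only illustrative, and raising the soft limit fails if the hard limit is lower.

```shell
#!/bin/sh
# Sketch: show the soft limit on open file descriptors for this shell.
soft_limit=$(ulimit -n)
echo "soft limit on open files: $soft_limit"

# Count the descriptors currently open in this shell (Linux-only, uses /proc).
if [ -d "/proc/$$/fd" ]; then
    open_fds=$(ls "/proc/$$/fd" | wc -l)
    echo "descriptors open in this shell: $open_fds"
fi

# Raising the limit inside a subshell affects only the command run there,
# for example (illustrative value, not taken from the report):
#   ( ulimit -n 4096 && kdb cp -rf system/tests/csv/lists/cur system/tests/multi/pool )
```

If the crash persists with a higher limit, as reported here, descriptor exhaustion is less likely to be the cause than a stale or corrupted descriptor value.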

So the deactivation of multiresolver+ini should fix the problem.

It’s done in #2750, just waiting for the builds to succeed. It would still be nice to know why/where it crashes.

Ok. I won’t have an excessive amount of time to waste on that (right now, maybe later). Let’s talk / take a look at it at our next meeting.

EDIT: I’ll “blacklist” ini inside multifile today. I had another fix waiting too.

Looks good, it does not crash. But it also does not speed-up. Maybe it is disabled because of INI?

But I think we’ll leave it for now. Some things have a really amazing speedup, e.g. parsing a huge JSON file is reduced from 2.5 s to 0.17 s. What a pity that multifile and INI caused so much trouble.

Unfortunately, I have no idea what it could be. The example you gave is quite elaborate but it works fine in our debian stretch docker image.

We already have a few separate issues here:

  1. The trace of the segfault is easy to reproduce by kdb cp-ying some key from another backend into the multifile backend. (which was your first example).

  2. Copying stuff between two files inside one multifile backend causes a corrupt cache. (#2702)

  3. The complex problem that we can’t easily reproduce.

I can only suggest that we start with the first two which are easy to reproduce, make regression tests, fix and then move on to the third problem.

Thank you for the details!

I just see that the steps above also crash without the cache plugin.

That is what I thought too, but I was not sure. I’ll look into it nevertheless. As I mentioned, there is at least one other critical bug with the cache.

I just see that the steps above also crash without the cache plugin. The original script, however, only crashed with the cache plugin enabled… So maybe it is not the same bug.

Thank you for reporting! I’ll prioritize this and hopefully fix it today.