cayley: bolt: Performance degradation while loading Freebase dump
Description
I’m trying to load the full Freebase dump into Cayley, but load performance keeps degrading, and it looks like the load process will never finish.
First I ran a series of experiments to determine the fastest way to load the data. Here are the results (cells contain the minutes needed to load the number of quads in the column header). As we can see, Bolt + pq + nosync + 2.5M had the best performance (though I’m not sure nosync contributed at all).
| Configuration | 5M | 10M | 15M | 20M | 25M |
| --- | --- | --- | --- | --- | --- |
| Bolt + nq + 10k | 2 | 7 | 16 | 20 | 29 |
| Bolt + nq + 50k | 2 | 5 | 13 | 15 | 23 |
| Bolt + nq + 100k | 2 | 5 | 12 | 14 | 20 |
| Bolt + nq + 200k | 2 | 5 | 10 | 12 | 17 |
| Bolt + nq + 500k | 2 | 5 | 8 | 10 | 13 |
| Bolt + nq + 500k | 2 | 5 | 8 | 9 | 12 |
| Bolt + nq + 1.25M | 2 | 5 | 7 | 9 | 12 |
| Bolt + nq + 2.5M | 2 | 5 | 7 | 9 | 12 |
| Bolt + nq + 5M | 3 | 5 | 8 | 9 | 12 |
| Bolt + nq + 1M + nosync | 2 | 5 | 7 | 9 | 12 |
| Bolt + nq + 2.5M + nosync | 2 | 5 | 7 | 9 | 11 |
| Bolt + pq.gz + 1.25M | 2 | 4 | 7 | 8 | 10 |
| Bolt + pq + nosync + 1.25M | 2 | 4 | 6 | 7 | 10 |
| Bolt + pq + nosync + 2.5M | 2 | 4 | 6 | 7 | 9 |
| Leveldb + nq + buffer 20 + 10k | 4 | 12 | 26 | | |
| Leveldb + pq.gz + buffer 20M + 1.25M | 1 | 8 | 16 | 18 | 27 |
| Leveldb + nq + buffer 20M + 5M | 2 | 18 | | | |
| Leveldb + pq.gz + buffer 200M + 1.25M | 1 | 8 | 16 | 18 | 27 |
| Leveldb + pq.gz + buffer 1G + 1.25M | 1 | 8 | 16 | 18 | 27 |
| Leveldb + pq.gz + buffer 1G + 500k | 1 | 5 | 11 | 13 | 19 |
| Leveldb + pq.gz + buffer 4G + 500k | 1 | 5 | 11 | 13 | 19 |
| Leveldb + pq.gz + buffer 4G + 1.25M | 1 | 8 | 16 | 18 | 27 |
| Leveldb + pq.gz + buffer 4G + cache 200M + 1.25M | 1 | 8 | 16 | 18 | 28 |
Steps to reproduce the issue:
cayley load -c bolt.yml --verbose=3 -i freebase.pq
- bolt.yml:

      store:
        backend: bolt
        address: bolt
      load:
        batch: 2500000
Received results:
At first everything was OK, but then the load started to slow down. The graph demonstrates it better:
Then I decided to use smaller batches and add nosync:
cayley load -c bolt.yml --verbose=3 -i freebase.pq
- bolt.yml:

      store:
        backend: bolt
        address: bolt
        options:
          nosync: true
      load:
        batch: 500000
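For context, here is a minimal sketch of what the nosync option roughly maps to at the storage layer, assuming Cayley's Bolt backend is built on go.etcd.io/bbolt and its `DB.NoSync` flag (the exact wiring inside Cayley is an assumption here; the file path is a placeholder):

```go
package main

import (
	"log"

	bolt "go.etcd.io/bbolt"
)

func main() {
	// Open a Bolt database file; the path is just a placeholder.
	db, err := bolt.Open("/tmp/cayley-test.bolt", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// NoSync skips the fsync after each commit: writes get faster,
	// but a crash can lose the most recently committed transactions.
	db.NoSync = true

	// ... batched writes would happen here ...
}
```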
Things became a bit better, but not for long:
htop says the process consumes more and more memory (58 GB after 2 days).
Expected results: Freebase loaded in less than infinity
Output of `cayley version` or commit hash:
Cayley version: 0.7.5
Git commit hash: cf576babb7db
Environment details:
- CPU: 8 x Intel® Xeon® CPU @ 2.30GHz
- Memory: 29 GB
- OS: Ubuntu 16.04.6 LTS
- Disk: SSD
Backend database: Bolt (not sure about the version; Cayley handled that for me)
So the question is: am I doing something wrong, or is there a bug?
About this issue
- Original URL
- State: open
- Created 5 years ago
- Reactions: 2
- Comments: 19 (10 by maintainers)
Just a follow-up: does importing Freebase into Cayley work now?
@hubyhuby Note that Bolt pre-allocates some space, so not all of it is actually used by the database.
@manishrjain Thanks for the suggestion, we will definitely consider that API 😃
The problem is not with Badger specifically, as I mentioned. It’s a problem with Cayley’s legacy write API, which tries to be suitable both for batch uploads and for regular transactions. The new importer can take advantage of a WriteBatch in Badger directly, and it works really well so far!
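For reference, a minimal sketch of the Badger WriteBatch pattern mentioned above, assuming github.com/dgraph-io/badger/v2; the keys and values are placeholders, not Cayley’s actual key layout:

```go
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	db, err := badger.Open(badger.DefaultOptions("/tmp/badger-import"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// WriteBatch buffers writes and commits them in appropriately sized
	// transactions internally, so the caller never builds one huge txn.
	wb := db.NewWriteBatch()
	defer wb.Cancel()

	for i := 0; i < 1000000; i++ {
		key := []byte(fmt.Sprintf("quad/%d", i)) // placeholder key
		val := []byte("...")                     // placeholder value
		if err := wb.Set(key, val); err != nil {
			log.Fatal(err)
		}
	}
	if err := wb.Flush(); err != nil {
		log.Fatal(err)
	}
}
```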
Thanks for taking the time to test the new version!
I think we may give up on Badger for now. I’ll need to change the way we import data into it to avoid those large transactions.
We are getting somewhere. But it seems like the SP index may also build up over time. I wonder what in the Freebase schema may cause it? I guess we will find out after a successful import 😃
In any case, I will get back to you after making a few more changes.
First, I want to add instrumentation, so we can get more insight from the import process. I will add a “time per batch” metric, so we can get the same graph as the one you’ve built directly from Prometheus/Grafana. I’m also particularly interested in seeing the sizes of the entries on each index, and things like lookup durations may be useful for speeding up the import further.
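As an illustration of the kind of instrumentation described above (not the actual code that landed in Cayley; the metric name and buckets are assumptions), a batch-duration metric could be exposed with the Prometheus Go client roughly like this:

```go
package main

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// batchDuration records how long each write batch takes — the
// "time per batch" graph discussed above, scraped by Prometheus.
var batchDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name:    "import_batch_duration_seconds", // hypothetical metric name
	Help:    "Time spent writing one batch of quads.",
	Buckets: prometheus.ExponentialBuckets(0.1, 2, 12), // ~0.1s .. ~200s
})

func init() {
	prometheus.MustRegister(batchDuration)
}

// writeBatch is a stand-in for the importer's per-batch write path.
func writeBatch(quads []string) error {
	start := time.Now()
	defer func() { batchDuration.Observe(time.Since(start).Seconds()) }()
	// ... actual write to the KV store would go here ...
	return nil
}

func main() {
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":2112", nil)
}
```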
And second, I will add an option to set custom indexes for KV, so we can try an “SPO” index instead of “SP”. Since the import was able to proceed further than before, I think the “O” index was indeed the cause of the previous OOM.
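To make the SP vs. SPO distinction concrete, here is a simplified sketch of how the index keys might differ; this illustrates the idea only and is not Cayley’s actual key encoding. An SP index groups every quad sharing a subject and predicate under one key, so a few very common Freebase predicates can accumulate huge entries, while an SPO key is unique per quad:

```go
package main

import "fmt"

// Quad is a simplified quad with pre-resolved value IDs.
type Quad struct {
	S, P, O uint64
}

// spKey groups all objects for a (subject, predicate) pair under one key,
// so the value list for popular predicates keeps growing during import.
func spKey(q Quad) string {
	return fmt.Sprintf("sp/%d/%d", q.S, q.P)
}

// spoKey is unique per quad: entries stay small, at the cost of more keys
// (a larger index on disk).
func spoKey(q Quad) string {
	return fmt.Sprintf("spo/%d/%d/%d", q.S, q.P, q.O)
}

func main() {
	q := Quad{S: 1, P: 2, O: 3}
	fmt.Println(spKey(q), spoKey(q))
}
```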
@eawer I made a few changes to indexing in https://github.com/cayleygraph/cayley/pull/816. It should help with the OOM issue, but at the same time it will increase the size of the index on disk, so write performance may suffer. But it’s hard to make any statements because the new indexing strategy should also improve quad lookup performance, which may remove the need for multiple reads during the import process.
I will continue looking into it and will run a few tests locally as well. But help with testing is highly appreciated.
Thanks a lot for testing it @eawer. This definitely looks like a memory issue in the new code path. I will investigate it.