oxigraph: Bulk loading error on M1 Mac

The following error occurs reproducibly when loading many millions of triples:

❯ /Users/jpslewis/.cargo/bin/oxigraph_server --location oxidata_4 load --file ~/git/wikidata_test/lei_wikidata_full.nt
1643388 triples loaded in 7s (234769 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
4930374 triples loaded in 12s (410864 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
9860954 triples loaded in 20s (493047 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
14791594 triples loaded in 28s (528271 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
19722310 triples loaded in 36s (547841 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
24652992 triples loaded in 44s (560295 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
27939991 triples loaded in 49s (570203 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
32870680 triples loaded in 58s (566735 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
37801386 triples loaded in 66s (572748 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
42732036 triples loaded in 74s (577459 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
47661723 triples loaded in 82s (581240 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
50948668 triples loaded in 87s (585616 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
55879234 triples loaded in 96s (582075 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
60809825 triples loaded in 104s (584709 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
65740506 triples loaded in 112s (586968 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
70671206 triples loaded in 121s (584059 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
73958333 triples loaded in 126s (586970 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
78888935 triples loaded in 135s (584362 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
83819560 triples loaded in 144s (582080 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
88750277 triples loaded in 153s (580067 t/s) from /Users/jpslewis/git/wikidata_test/lei_wikidata_full.nt
Error: Custom { kind: Other, error: ErrorStatus { code: 5, subcode: 0, severity: 3, message: "IO error: While open a file for random read: oxidata_4/001261.sst: Too many open files" } }

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 20 (13 by maintainers)

Most upvoted comments

@jeremiahpslewis Thank you! I’m going to investigate and will post my progress here.

It seems to me that a lot of databases just state in their manual “if you encounter this error, raise the limit using ulimit”. I might just do that if I don’t find a solution quickly; there are still so many other things to fix/improve in Oxigraph.
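For anyone hitting this in the meantime, the workaround would look something like the following, run in the same shell before the load (the limit value is illustrative; pick one your system allows):

❯ ulimit -n 10240
❯ /Users/jpslewis/.cargo/bin/oxigraph_server --location oxidata_4 load --file ~/git/wikidata_test/lei_wikidata_full.nt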

@jeremiahpslewis Thank you! I just pushed v0.3.1 with this change.

I have found the root cause of the error: RocksDB happily keeps every file open, and the bulk loader tends to generate a lot of files. I have changed the RocksDB configuration to not open more files than allowed: 42f316f7db23c777ed15349356ce14e61f986b6c (keeping some file descriptors free for other parts of the process, like the bulk loader)
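A minimal sketch of that idea, assuming the rust-rocksdb crate’s API (Oxigraph ships its own RocksDB bindings, so this is not the code from that commit; the helper name and the budget value are illustrative):

use rocksdb::{Options, DB};

// Illustrative helper: open a RocksDB database with a capped table-file
// cache instead of the default of -1 (unlimited open files).
fn open_with_fd_cap(path: &str, fd_budget: i32) -> Result<DB, rocksdb::Error> {
    let mut opts = Options::default();
    opts.create_if_missing(true);
    // Leave descriptors free for other parts of the process (e.g. the
    // bulk loader) by not letting RocksDB keep every .sst file open.
    opts.set_max_open_files(fd_budget);
    DB::open(&opts, path)
}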

Hi! It’s because the bulk loader parallelises too much and so opens too many files. I should figure out how to dynamically get the maximum number of open files and limit the parallelism based on that.
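A rough sketch of what that could look like, assuming the libc crate; the reserve and per-thread constants and both helper names are hypothetical, not Oxigraph code:

use std::cmp;

// Read this process's soft limit on open file descriptors (RLIMIT_NOFILE).
fn open_file_limit() -> Option<u64> {
    let mut lim = libc::rlimit { rlim_cur: 0, rlim_max: 0 };
    // SAFETY: getrlimit only writes into the struct we pass it.
    if unsafe { libc::getrlimit(libc::RLIMIT_NOFILE, &mut lim) } == 0 {
        Some(lim.rlim_cur as u64)
    } else {
        None
    }
}

// Derive a parallelism cap from the descriptor budget, reserving some
// descriptors for RocksDB itself and assuming (illustratively) that each
// loader thread needs a handful of files.
fn max_loader_threads() -> usize {
    const RESERVED_FDS: u64 = 512;
    const FDS_PER_THREAD: u64 = 16;
    let budget = open_file_limit().unwrap_or(1024).saturating_sub(RESERVED_FDS);
    cmp::max(1, (budget / FDS_PER_THREAD) as usize)
}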

PS: your Mac beats by a large margin the best loading speed with Oxigraph I have ever seen, congrats!