scryer-prolog: Much slower than SWI checking [wn_valid] from wordnet-prolog
I have forked ekaf/wordnet-prolog and made a few changes needed to get a validity check to run in Scryer.
Whereas swipl does [wn_valid]. in about 13 seconds, I find that scryer-prolog (“v0.9.1-51-gfd191285”) takes 1 hour. All but 24 seconds of this is spent in goal check_keys/0:
check_keys:-
writeln('Searching for ambiguous sense keys'),
findall(X, multikey(X), L),
list_to_set(L,S),
member(K,S),
format("Ambiguous key: ~w\n",[K]),
findall(I, sk(I,_,K), IL),
list_to_set(IL,IS),
member(I,IS),
g(I,G),
format(" ->~w (~w)\n",[I,G]),
false.
check_keys:-
ok,
nl.
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 2
- Comments: 20 (6 by maintainers)
With the above application of
term_expansion/2, I can close this issue.Cowabunga:
Result:
… or just compare
?- multikey(X), false.May I suggest to use the GNU Prolog repository to file and discuss GNU Prolog-specific issues:
https://github.com/didoudiaz/gprolog ?
Would you mind reducing the size of your finding? For example, you could introduce
falseafter some goal and compare that failure slice with both systems. It could be the firstfindall/3alone that makes problems. Also, are you sure that you used the sameapply/2definition? If you can, directly rewrite any goal forapply/2with a fixed list to the correspondingcall/N. Theseapply/2goals are really so 1970sI have added a support for
term_expansion/2in gprolog which works with the above proposed workaround. However, here is an alternative solution based onfindall/setofandkeysort(which are generally efficiently implemented). It avoids to duplication of the largesk/3database (207272 facts) as with theterm_expansion/2solution.To try it: simply replace in
wn_valid.plthe definitons ofmultikey/1andcheckeys/0by the following code:@dcnorris I had to do several changes in the original code from Eric Kafe (too much SWI-oriented). I also implemented the changes for the `term_expansion/2’ solution and for the above solution. I can share them with you if you want.
From the GNU Prolog documentation:
And building on that, you can extend
sk/3to dynamically switch between the different variants, for example (in the case above):sk(X, Y, Z) :- ( nonvar(X) -> sk_1(X, Y, Z) ; nonvar(Z) -> sk_3(Z, X, Y) ; sk_1(X, Y, Z) ).In this way, all your code can use
sk/3, and when Scryer Prolog’s built-in indexing improves, you simply remove these manual workarounds.@dcnorris: Improving ISO compliance for better portability would definitely help already, if only to make such comparisons between different systems easier.
Regarding the auxiliary facts specifically useful for Scryer Prolog and other systems with limited built-in indexing, you may be able to use Scryer Prolog’s
term_expansion/2mechanism to automatically generate, for each defined fact, also its corresponding alternative definition with a different argument order.For example, try:
:- discontiguous(sk_1/3). :- discontiguous(sk_3/3). sk(X, Y, Z) :- sk_1(X, Y, Z). term_expansion(sk(X,Y,Z), [sk_1(X,Y,Z), sk_3(Z,X,Y)]). sk(100001740,1,'entity%1:03:00::'). sk(100001930,1,'physical_entity%1:03:00::'). sk(100002137,1,'abstraction%1:03:00::'). sk(100002137,2,'abstract_entity%1:03:00::'). sk(100002452,1,'thing%1:03:00::'). sk(100002684,1,'object%1:03:00::'). sk(100002684,2,'physical_object%1:03:00::'). ...In this way, you automatically get, for each
sk/3fact that occurs in the file, the correspondingsk_1/3andsk_3/3facts.This seems like an issue with indexing. Specifically, we have:
where
sk/3consists of hundreds of thousands of plain facts.In the highlighted goal above, i.e.,
sk(J, _, K),Jis a variable andKis instantiated. So, a Prolog system with good indexing can useKto quickly detect which facts apply in this case.Scryer Prolog currently only indexes on the first instantiated argument that occurs in any clause of a predicate, so in this case, only the first argument of
sk/3is used for indexing. This leads to quadratic overheadmultikey/1, because for each fact, all facts have to be checked.Such an issue can be worked around “manually”, by adding (or generating) auxiliary facts, say
sk_3/3, where the arguments are reordered to benefit from the engine’s argument indexing if they are instantiated at the time of the call.Ideally of course, the system would automatically generate suitable indices on demand, depending on which arguments are instantiated at the time of the call, so that the fitting clauses can later be found much more quickly.
Related to #192.
It’s all in
multikey/1, then:SWI by comparison:
Here is
multikey/1for quick reference:I am sold on failure slicing!