orientdb: MATCH bug in location existence check

OrientDB Version, operating system, or hardware.

  • v2.0 SNAPSHOT[ ] - .18[ ] .17[ ] .16[ ] .15[ ] .14[ ] .13[ ] .12[ ] .11[ ] .10[ ] .9[ ] .8[ ] .7[ ] .6[ ] .5[ ] .4[ ] .3[ ] .2[ ] .1[ ] .0[ ]
  • v2.1 SNAPSHOT[ ] - .16[ ] .15[ ] .14[ ] .13[ ] .12[ ] .11[ ] .10[ ] .9[ ] .8[ ] .7[ ] .6[ ] .5[ ] .4[ ] .3[ ] .2[ ] .1[ ] .0[ ]
  • v2.2 SNAPSHOT[ ] - .13[x] .rc1[ ] .beta2[ ] .beta1[ ]

Operating System

  • Linux
  • MacOSX
  • Windows
  • Other Unix
  • Other, name?

Expected behavior and actual behavior

Expected: If a traversal location named baz exists in the output, then checking $matched.baz IS null should be false in the MATCH query.

Actual: See below for an instance where data is returned for a location named baz, and yet the query evaluates $matched.baz IS null as true.

Steps to reproduce the problem

Here’s what the formed structure looks like:

Foo ==> Bar ==> Baz
  \
   \=> Far

Setup:

CREATE CLASS Foo EXTENDS V
CREATE CLASS Bar EXTENDS V
CREATE CLASS Baz EXTENDS V
CREATE CLASS Far EXTENDS V
CREATE CLASS Foo_Bar EXTENDS E
CREATE CLASS Bar_Baz EXTENDS E
CREATE CLASS Foo_Far EXTENDS E

CREATE VERTEX Foo SET name = 'foo'
CREATE VERTEX Bar SET name = 'bar'
CREATE VERTEX Baz SET name = 'baz'
CREATE VERTEX Far SET name = 'far'

CREATE EDGE Foo_Bar FROM (SELECT FROM Foo) TO (SELECT FROM Bar)
CREATE EDGE Bar_Baz FROM (SELECT FROM Bar) TO (SELECT FROM Baz)
CREATE EDGE Foo_Far FROM (SELECT FROM Foo) TO (SELECT FROM Far)

Problematic queries:

  • baz must always exist because it’s not optional
MATCH {
    class: Foo,
    as: foo
}.out('Foo_Bar') {
    as: bar
}, {
    class: Bar,
    as: bar
}.out('Bar_Baz') {
    as: baz
}, {
    class: Foo,
    as: foo
}.out('Foo_Far') {
    where: ($matched.baz IS null),
    as: far
}
RETURN $matches

// returns:
// +----+-----+-----+-----+-----+
// |#   |foo  |bar  |far  |baz  |
// +----+-----+-----+-----+-----+
// |0   |#11:0|#12:0|#14:0|#13:0|
// +----+-----+-----+-----+-----+
// 
// Note the existence of a result for "baz".
  • baz might not exist because now it’s optional
MATCH {
    class: Foo,
    as: foo
}.out('Foo_Bar') {
    as: bar
}, {
    class: Bar,
    as: bar
}.out('Bar_Baz') {
    optional: true,
    as: baz
}, {
    class: Foo,
    as: foo
}.out('Foo_Far') {
    where: ($matched.baz IS null),
    as: far
}
RETURN $matches

// returns:
// +----+-----+-----+-----+-----+
// |#   |foo  |bar  |far  |baz  |
// +----+-----+-----+-----+-----+
// |0   |#11:0|#12:0|#14:0|#13:0|
// +----+-----+-----+-----+-----+
// 
// Note the existence of a result for "baz".

This is extremely frustrating, I was hoping that your tests would have caught and prevented such a simple bug from making it into a release. This entire feature is broken at the moment.

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Comments: 28 (27 by maintainers)

Commits related to this issue

Most upvoted comments

If you are not doing late evaluation of $matched until 3.0, I think it’s a better idea to implement the heuristics on top of the topological sort. It’s a much cleaner solution because it’s provably correct and doesn’t need to retry with multiple entry points like you say. Also, those retries can be really expensive – getting a good order on the first try is much more likely to be faster overall.

In 3.0, you’ll be able to update the topological sort with “explicitly ignored dependencies” where $matches conditions evaluate to true early and then are re-evaluated later. Topological sort doesn’t prevent that optimization from being implemented, but it will prevent many issues until 3.0 comes out.

Since late evaluation of $matched isn’t coming until 3.0, please reconsider topological sort + heuristics. Tracking down and fixing correctness errors wastes time for us both, and money for both our companies.

Ok, first fix applied.