orientdb: MATCH bug in location existence check
OrientDB Version, operating system, or hardware.
- v2.0 SNAPSHOT[ ] - .18[ ] .17[ ] .16[ ] .15[ ] .14[ ] .13[ ] .12[ ] .11[ ] .10[ ] .9[ ] .8[ ] .7[ ] .6[ ] .5[ ] .4[ ] .3[ ] .2[ ] .1[ ] .0[ ]
- v2.1 SNAPSHOT[ ] - .16[ ] .15[ ] .14[ ] .13[ ] .12[ ] .11[ ] .10[ ] .9[ ] .8[ ] .7[ ] .6[ ] .5[ ] .4[ ] .3[ ] .2[ ] .1[ ] .0[ ]
- v2.2 SNAPSHOT[ ] - .13[x] .rc1[ ] .beta2[ ] .beta1[ ]
Operating System
- Linux
- MacOSX
- Windows
- Other Unix
- Other, name?
Expected behavior and actual behavior
Expected: If a traversal location named baz exists in the output, then checking $matched.baz IS null should be false in the MATCH query.
Actual: See below for an instance where data is returned for a location named baz, and yet the query evaluates $matched.baz IS null as true.
Steps to reproduce the problem
Here’s what the formed structure looks like:
Foo ==> Bar ==> Baz
\
\=> Far
Setup:
CREATE CLASS Foo EXTENDS V
CREATE CLASS Bar EXTENDS V
CREATE CLASS Baz EXTENDS V
CREATE CLASS Far EXTENDS V
CREATE CLASS Foo_Bar EXTENDS E
CREATE CLASS Bar_Baz EXTENDS E
CREATE CLASS Foo_Far EXTENDS E
CREATE VERTEX Foo SET name = 'foo'
CREATE VERTEX Bar SET name = 'bar'
CREATE VERTEX Baz SET name = 'baz'
CREATE VERTEX Far SET name = 'far'
CREATE EDGE Foo_Bar FROM (SELECT FROM Foo) TO (SELECT FROM Bar)
CREATE EDGE Bar_Baz FROM (SELECT FROM Bar) TO (SELECT FROM Baz)
CREATE EDGE Foo_Far FROM (SELECT FROM Foo) TO (SELECT FROM Far)
Problematic queries:
bazmust always exist because it’s not optional
MATCH {
class: Foo,
as: foo
}.out('Foo_Bar') {
as: bar
}, {
class: Bar,
as: bar
}.out('Bar_Baz') {
as: baz
}, {
class: Foo,
as: foo
}.out('Foo_Far') {
where: ($matched.baz IS null),
as: far
}
RETURN $matches
// returns:
// +----+-----+-----+-----+-----+
// |# |foo |bar |far |baz |
// +----+-----+-----+-----+-----+
// |0 |#11:0|#12:0|#14:0|#13:0|
// +----+-----+-----+-----+-----+
//
// Note the existence of a result for "baz".
bazmight not exist because now it’s optional
MATCH {
class: Foo,
as: foo
}.out('Foo_Bar') {
as: bar
}, {
class: Bar,
as: bar
}.out('Bar_Baz') {
optional: true,
as: baz
}, {
class: Foo,
as: foo
}.out('Foo_Far') {
where: ($matched.baz IS null),
as: far
}
RETURN $matches
// returns:
// +----+-----+-----+-----+-----+
// |# |foo |bar |far |baz |
// +----+-----+-----+-----+-----+
// |0 |#11:0|#12:0|#14:0|#13:0|
// +----+-----+-----+-----+-----+
//
// Note the existence of a result for "baz".
This is extremely frustrating, I was hoping that your tests would have caught and prevented such a simple bug from making it into a release. This entire feature is broken at the moment.
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Comments: 28 (27 by maintainers)
Commits related to this issue
- Fix MATCH statement with conditions on $matched Resolves: #6931 — committed to orientechnologies/orientdb by luigidellaquila 8 years ago
- Add test case for pattern matching with $matched conditions Issue #6931 — committed to orientechnologies/orientdb by luigidellaquila 8 years ago
- Fix MATCH statement with conditions on $matched Resolves: #6931 — committed to orientechnologies/orientdb by luigidellaquila 8 years ago
- Add test case for pattern matching with $matched conditions Issue #6931 — committed to orientechnologies/orientdb by luigidellaquila 8 years ago
- Fix check of MATCH WHERE expressions on pattern Related to: #6931 — committed to orientechnologies/orientdb by luigidellaquila 8 years ago
- Fix check of MATCH WHERE expressions on pattern Related to: #6931 — committed to orientechnologies/orientdb by luigidellaquila 8 years ago
- Check circular dependencies on $matched conditions Related to: #6931 — committed to orientechnologies/orientdb by luigidellaquila 8 years ago
- Check circular dependencies on $matched conditions Related to: #6931 — committed to orientechnologies/orientdb by luigidellaquila 8 years ago
- Add test cases for #6931 — committed to orientechnologies/orientdb by luigidellaquila 8 years ago
If you are not doing late evaluation of $matched until 3.0, I think it’s a better idea to implement the heuristics on top of the topological sort. It’s a much cleaner solution because it’s provably correct and doesn’t need to retry with multiple entry points like you say. Also, those retries can be really expensive – getting a good order on the first try is much more likely to be faster overall.
In 3.0, you’ll be able to update the topological sort with “explicitly ignored dependencies” where $matches conditions evaluate to
trueearly and then are re-evaluated later. Topological sort doesn’t prevent that optimization from being implemented, but it will prevent many issues until 3.0 comes out.Since late evaluation of $matched isn’t coming until 3.0, please reconsider topological sort + heuristics. Tracking down and fixing correctness errors wastes time for us both, and money for both our companies.
Ok, first fix applied.