age: MERGE does not set properties involving variables

Note: This task is ONLY assigned to people in the ‘refactor create\merge’ project.

Describe the bug If properties in the MERGE clause is set from a previous variable (i.e. .. MERGE ( {key: some_variable} ) ..), that particular key-value pair is not persisted.

How are you accessing AGE (Command line, driver, etc.)? Command line

What data setup do we need to do?

...
SELECT * FROM cypher('xyz',
$$
CREATE (x:Label1{arr:[1,2,3,4]})
RETURN x 
$$) as (a agtype);
...

What is the necessary configuration info needed? N/A

What is the command that caused the error?

SELECT * FROM cypher('xyz', 
$$
MATCH (x:Label1{arr:[1,2,3,4]})
MERGE (y:Label2{key1:2, key2:x.arr, key3:3})
RETURN y
$$) as (a agtype);

Behavior The key2: x.arr pair is not persisted in the output.

Output:

                                             a                                             
-------------------------------------------------------------------------------------------
 {"id": 1688849860263937, "label": "Label2", "properties": {"key1": 2, "key3": 3}}::vertex
(1 row)

Expected behavior Expected output:

                                             a                                             
-------------------------------------------------------------------------------------------
 {"id": 1688849860263937, "label": "Label2", "properties": {"key1": 2, "key2": [1, 2, 3, 4], "key3": 3}}::vertex
(1 row)

Environment (please complete the following information): PG15.4 Master branch Ubuntu 22.04

Additional context N/A

About this issue

  • Original URL
  • State: closed
  • Created 10 months ago
  • Comments: 17 (13 by maintainers)

Most upvoted comments

It appears that the merge code (including the expression evaluation) is working correctly with what it is given. It is just that sometimes it gets a NULL value instead of the variable x in specific cases. So, I don’t think MERGE is the issue here. Something prior to MERGE is messing up x.

It also appears that the transform logic is working correctly, at least up to the RETURN transform for -

MATCH (x:Label1{arr:[1,2,3,4]}) MERGE (y:Label2{key2: id(x)}) RETURN y

It might be possible the planner is removing the variable x.

I compiled and tested PG11 1.1.0 and this issue exists there as well.

The query also succeeds in these cases -

psql-15.4-5432-pgsql=# SELECT * FROM cypher('xyz', $$ MATCH (x:Label1{arr:[1,2,3,4]})
MERGE (y:Label2{key1:2, key2:x.arr, key3:3}) RETURN x $$) as (a agtype);
                                            a
-----------------------------------------------------------------------------------------
 {"id": 844424930131971, "label": "Label1", "properties": {"arr": [1, 2, 3, 4]}}::vertex
(1 row)

psql-15.4-5432-pgsql=# SELECT * FROM cypher('xyz', $$ MATCH (u) return u$$) as (a agtype);
                                                        a
-----------------------------------------------------------------------------------------------------------------
 {"id": 844424930131971, "label": "Label1", "properties": {"arr": [1, 2, 3, 4]}}::vertex
 {"id": 1125899906842636, "label": "Label2", "properties": {"key1": 2, "key2": [1, 2, 3, 4], "key3": 3}}::vertex
(2 rows)

psql-15.4-5432-pgsql=#
psql-15.4-5432-pgsql=# SELECT * FROM cypher('xyz', $$ MATCH (x:Label1{arr:[1,2,3,4]})
MERGE (y:Label2{key1:2, key2:x.arr, key3:3}) $$) as (a agtype);
 a
---
(0 rows)

psql-15.4-5432-pgsql=# SELECT * FROM cypher('xyz', $$ MATCH (u) return u$$) as (a agtype);
                                                        a
-----------------------------------------------------------------------------------------------------------------
 {"id": 844424930131972, "label": "Label1", "properties": {"arr": [1, 2, 3, 4]}}::vertex
 {"id": 1125899906842637, "label": "Label2", "properties": {"key1": 2, "key2": [1, 2, 3, 4], "key3": 3}}::vertex
(2 rows)

psql-15.4-5432-pgsql=#

@jrgemignani Thanks for the PR. The issue is resolved.

@rafsun42 PR #1441 was created to address this issue.

I traced it through the code and it appears that the variable from the MATCH is not getting to the MERGE when the variable isn’t present in the RETURN list. Unless, there isn’t a RETURN, then it works.

@ksheroz @CapnSpek

Query plans are usually useful. However, for this issue I would prefer using a debugger.

I think this file-src/backend/executor/cypher_merge.c may be worthwhile to step through with a debugger.

Some observations:

  • The same query with CREATE instead of MERGE works just fine
SELECT * FROM cypher('xyz', 
$$
MATCH (x:Label1{arr:[1,2,3,4]})
CREATE (y:Label2{key1:2, key2:x.arr, key3:3})
RETURN y
$$) as (a agtype);
  • The same query with MERGE twice also works just fine
SELECT * FROM cypher('xyz', 
$$
MATCH (x:Label1{arr:[1,2,3,4]})
MERGE (y:Label2{key1:2, key2:x.arr, key3:3})
MERGE (z:Label2{key1:2, key2:x.arr, key3:3})
RETURN y
$$) as (a agtype);

Next, let us analyze the Query Plan Trees for MATCH-MERGE, MATCH-CREATE, and MATCH-MERGE-MERGE queries.

  • Query plan for MATCH-MERGE:
 Custom Scan (Cypher Merge)  (cost=0.00..0.00 rows=0 width=32)
   ->  Subquery Scan on _age_default_alias_previous_cypher_clause  (cost=0.00..86.02 rows=1 width=32)
         ->  Nested Loop Left Join  (cost=0.00..86.00 rows=1 width=96)
               Join Filter: (z.properties @> agtype_build_map('key1'::text, '2'::agtype, 'key2'::text, agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(x.id, _label_name('18201'::oid, x.id), x.
properties), '"arr"'::agtype]), 'key3'::text, '3'::agtype))
               ->  Seq Scan on "Label1" x  (cost=0.00..31.00 rows=1 width=40)
                     Filter: (properties @> agtype_build_map('arr'::text, agtype_build_list('1'::agtype, '2'::agtype, '3'::agtype, '4'::agtype)))
               ->  Seq Scan on "Label2" z  (cost=0.00..28.00 rows=1200 width=64)
(7 rows)
  • Query plan for MATCH-CREATE:
 Custom Scan (Cypher Create)  (cost=0.00..0.00 rows=0 width=32)
   ->  Subquery Scan on _age_default_alias_previous_cypher_clause  (cost=0.00..31.03 rows=1 width=32)
         ->  Seq Scan on "Label1" x  (cost=0.00..31.02 rows=1 width=160)
               Filter: (properties @> agtype_build_map('arr'::text, agtype_build_list('1'::agtype, '2'::agtype, '3'::agtype, '4'::agtype)))
(4 rows)
  • Query plan for MATCH-MERGE-MERGE:
 Custom Scan (Cypher Merge)  (cost=0.00..0.00 rows=0 width=32)
   ->  Subquery Scan on _age_default_alias_previous_cypher_clause  (cost=0.00..49.02 rows=1 width=32)
         ->  Nested Loop Left Join  (cost=0.00..49.01 rows=1 width=160)
               Join Filter: (z.properties @> agtype_build_map('key1'::text, '2'::agtype, 'key2'::text, agtype_access_operator(VARIADIC ARRAY[_age_default_alias_previous_cypher_clause_1.x, '"arr"'::agtype
]), 'key3'::text, '3'::agtype))
               ->  Custom Scan (Cypher Merge)  (cost=0.00..0.00 rows=0 width=64)
                     ->  Subquery Scan on _age_default_alias_previous_cypher_clause_1  (cost=0.00..86.02 rows=1 width=64)
                           ->  Nested Loop Left Join  (cost=0.00..86.01 rows=1 width=96)
                                 Join Filter: (y.properties @> agtype_build_map('key1'::text, '2'::agtype, 'key2'::text, agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(x.id, _label_name('1820
1'::oid, x.id), x.properties), '"arr"'::agtype]), 'key3'::text, '3'::agtype))
                                 ->  Seq Scan on "Label1" x  (cost=0.00..31.00 rows=1 width=40)
                                       Filter: (properties @> agtype_build_map('arr'::text, agtype_build_list('1'::agtype, '2'::agtype, '3'::agtype, '4'::agtype)))
                                 ->  Seq Scan on "Label2" y  (cost=0.00..28.00 rows=1200 width=64)
               ->  Seq Scan on "Label2" z  (cost=0.00..28.00 rows=1200 width=64)
(12 rows)

Cannot draw a conclusion

@ksheroz What do you think?

@rafsun42 kindly assign the task to me

Kindly assign the project to me. Thanks.

@rafsun42 Assign it to me please