mage: [BUG] Memory limit exceeded when adding distances from `distance_calculator.multiple()` as relationship properties

Describe the bug: Running distance_calculator.single() is significantly slower than running distance_calculator.multiple(). Why is that, and what is the best approach in such a situation? Also, when I try to use those distances and set them as properties of an edge, I run into:

Query failed: Memory limit exceeded! Atempting to allocate a chunk of 3.00GiB which would put the current use to 12.35GiB, while the maximum allowed size for allocation is set to 9.72GiB.

The query fails even on a 32 GB instance.

To Reproduce Steps to reproduce the behavior:

  1. Import the data: save the attached file (export.txt) as JSON, copy it to Memgraph, and run CALL import_util.json("/usr/lib/memgraph/query_modules/export.json");

  2. Run

MATCH (o) - [r:SHIPS_TO] -> (d)
WITH collect(o) as os, collect(d) as ds
CALL distance_calculator.multiple(os, ds, 'km') 
YIELD distances
RETURN distances;

  3. Run

MATCH (o) - [r:SHIPS_TO] -> (d)
CALL distance_calculator.single(o, d, 'km') 
YIELD distance
RETURN distance;
**Expected behavior**

Compare the execution times of the two queries above. Why is distance_calculator.single() significantly slower? (Attached screenshot: Screenshot 2022-12-22 at 16 25 46)

Another issue appears when I run distance_calculator.multiple() and I want to add the distance to the relationship between the nodes:

MATCH (o:Origin) --> (d:Destination)
WITH collect(o) as os, collect(d) as ds
CALL distance_calculator.multiple(os, ds, 'km')
YIELD distances
WITH os, ds, distances, range(0, size(distances) - 1, 1) AS iterator
UNWIND iterator AS idx 
WITH os[idx] as o, ds[idx] as d, distances[idx] as distance
MERGE (o)-[:SHIPS_TO {distance: distance}]->(d)
RETURN o, d, distance;

I get: Query failed: Memory limit exceeded! Atempting to allocate a chunk of 3.00GiB which would put the current use to 12.35GiB, while the maximum allowed size for allocation is set to 9.72GiB.

Is there any better way to write such a Cypher query?

Additional context This question was raised on Memgraph’s Discord server.

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 17 (2 by maintainers)

Most upvoted comments

I see, thanks @gitbuda! Maybe you could add some info about controlling memory usage to the Query failed: Could not allocate memory! error message.

Actually, there is a work-in-progress pull request on MAGE, so feel free to check it out. It probably won’t work yet, though, because of a bug in getting properties from a node, which we are fixing on the Memgraph side.

Q: It is significantly slower when you run distance_calculator.single() instead of distance_calculator.multiple(). Why is that, and what is the best approach in such a situation?

A: The difference comes from the number of times the Python query module is invoked to calculate distances. distance_calculator.single() computes the distance between only two points, so the query module is called once for every row produced by MATCH (o) - [r:SHIPS_TO] -> (d), which is very time-consuming. It may be more readable, but it costs performance. In contrast, when the origin and destination nodes are collected into two lists of the same size, distance_calculator.multiple() is called only once and everything is computed in one go.
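
The per-row versus batched call pattern described in this answer can be illustrated with a small standalone Python sketch. It is not the MAGE implementation: the function names single and multiple only mirror the procedure names, the haversine formula and the Earth radius are assumptions (the thread does not say which formula the module uses), and the coordinates are made-up sample values. The sketch only counts how many times each entry point is invoked for the same set of pairs.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, for illustration only

# Counts how often each (hypothetical) entry point is invoked.
call_count = {"single": 0, "multiple": 0}


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two (lat, lng) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def single(start, end):
    """Stand-in for a per-row call: one invocation handles one pair."""
    call_count["single"] += 1
    return haversine_km(*start, *end)


def multiple(starts, ends):
    """Stand-in for a batched call: one invocation handles all pairs."""
    if len(starts) != len(ends):
        raise ValueError("starts and ends must have the same length")
    call_count["multiple"] += 1
    return [haversine_km(*s, *e) for s, e in zip(starts, ends)]


# Made-up sample coordinates as (latitude, longitude) pairs.
origins = [(45.81, 15.98), (48.21, 16.37), (52.52, 13.40)]
destinations = [(48.86, 2.35), (41.90, 12.50), (51.51, -0.13)]

# Per-row pattern: the entry point is invoked once per matched pair.
per_row = [single(o, d) for o, d in zip(origins, destinations)]

# Batched pattern: the entry point is invoked once for the whole list.
batched = multiple(origins, destinations)

print(call_count)
```

Both patterns produce the same distances; only the number of invocations differs. In Memgraph, each invocation of a Python procedure carries a fixed overhead on top of the arithmetic, which is why the per-row pattern scales poorly with the number of matched rows.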