sorry-cypress: When viewing recent builds error is logged: exceeded memory limit for $group, but didn't allow external sort

Environment information

- sorry-cypress version: `2.5.1`
- platform: `docker`
- service: `api`

Describe the bug

We have 297 builds, each with over 700 tests and up to 2 repeats per test (3 runs in total). It seems mongo cannot cope with the group query behind the “Recent Builds” section, which needs more than 100MB of RAM. I have in fact never been able to view that section, as it was added after we had already collected a lot of builds.

The UI shows: Error loading data

The api log shows:

2023-02-25T10:27:43.884+0000 I COMMAND [conn16] command sorry-cypress.runs command: aggregate { aggregate: "runs", pipeline: [ { $match: {} }, { $sort: { _id: -1 } }, { $limit: 1000 }, { $group: { _id: "$meta.ciBuildId", runs: { $push: "$$ROOT" }, runId: { $min: "$_id" } } }, { $sort: { runId: -1 } }, { $limit: 20 }, { $addFields: { ciBuildId: "$_id", createdAt: { $arrayElemAt: [ "$runs.createdAt", -1 ] }, updatedAt: { $max: "$runs.progress.updatedAt" } } } ], cursor: {}, lsid: { id: UUID("12345678-1234-1234-1234-123456781234") }, $db: "sorry-cypress" } planSummary: IXSCAN { _id: 1 } numYields:17 queryHash:96B27E33 planCacheKey:96B27E33 ok:0 errMsg:"Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in." errName:Location16945 errCode:16945 reslen:188 locks:{ ReplicationStateTransition: { acquireCount: { w: 39 } }, Global: { acquireCount: { r: 39 } }, Database: { acquireCount: { r: 39 } }, Collection: { acquireCount: { r: 39 } }, Mutex: { acquireCount: { r: 22 } } } storage:{} protocol:op_msg 301ms

The error seems self-explanatory: adding `allowDiskUse` to the aggregation query behind Recent Builds should fix it.
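For illustration, here is a minimal sketch of what passing that option could look like. The pipeline is copied from the api log above; the `AggregateCapable` interface and the `loadRecentBuilds` function are hypothetical names used only for this example, and the place where sorry-cypress actually builds this query is not shown in this report.

```typescript
// Hypothetical structural type standing in for a MongoDB collection handle.
// It mirrors the shape of the Node.js driver's aggregate(pipeline, options) call.
interface AggregateCapable {
  aggregate(
    pipeline: object[],
    options?: { allowDiskUse?: boolean },
  ): { toArray(): Promise<unknown[]> };
}

// Pipeline as it appears in the api log.
const recentBuildsPipeline: object[] = [
  { $match: {} },
  { $sort: { _id: -1 } },
  { $limit: 1000 },
  {
    $group: {
      _id: "$meta.ciBuildId",
      runs: { $push: "$$ROOT" },
      runId: { $min: "$_id" },
    },
  },
  { $sort: { runId: -1 } },
  { $limit: 20 },
  {
    $addFields: {
      ciBuildId: "$_id",
      createdAt: { $arrayElemAt: ["$runs.createdAt", -1] },
      updatedAt: { $max: "$runs.progress.updatedAt" },
    },
  },
];

// The option named in the error message: lets $group spill to disk
// instead of failing once it exceeds the in-memory limit.
const recentBuildsOptions = { allowDiskUse: true };

// Hypothetical helper: runs the pipeline with allowDiskUse enabled.
function loadRecentBuilds(runs: AggregateCapable): Promise<unknown[]> {
  return runs.aggregate(recentBuildsPipeline, recentBuildsOptions).toArray();
}
```

With the real driver, the equivalent call would be `collection.aggregate(pipeline, { allowDiskUse: true })`. This only lifts the memory limit; the `$push: "$$ROOT"` stage would still collect full run documents for up to 1000 runs.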

Expected behavior

Recent Builds shown

Reproduction steps

Collect a lot of test runs and open Recent Builds.

Full log and debug output

2023-02-25T10:27:33.211+0000 I  COMMAND  [conn19] command sorry-cypress.runs command: aggregate { aggregate: "runs", pipeline: [ { $match: {} }, { $sort: { _id: -1 } }, { $limit: 1000 }, { $group: { _id: "$meta.ciBuildId", runs: { $push: "$$ROOT" }, runId: { $min: "$_id" } } }, { $sort: { runId: -1 } }, { $limit: 20 }, { $addFields: { ciBuildId: "$_id", createdAt: { $arrayElemAt: [ "$runs.createdAt", -1 ] }, updatedAt: { $max: "$runs.progress.updatedAt" } } } ], cursor: {}, lsid: { id: UUID("12345678-1234-1234-1234-123456781234") }, $db: "sorry-cypress" } planSummary: IXSCAN { _id: 1 } numYields:16 queryHash:96B27E33 planCacheKey:96B27E33 ok:0 errMsg:"Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in." errName:Location16945 errCode:16945 reslen:188 locks:{ ReplicationStateTransition: { acquireCount: { w: 38 } }, Global: { acquireCount: { r: 38 } }, Database: { acquireCount: { r: 38 } }, Collection: { acquireCount: { r: 38 } }, Mutex: { acquireCount: { r: 22 } } } storage:{} protocol:op_msg 315ms
2023-02-25T10:27:38.545+0000 I  COMMAND  [conn3] command sorry-cypress.runs command: aggregate { aggregate: "runs", pipeline: [ { $match: {} }, { $sort: { _id: -1 } }, { $limit: 1000 }, { $group: { _id: "$meta.ciBuildId", runs: { $push: "$$ROOT" }, runId: { $min: "$_id" } } }, { $sort: { runId: -1 } }, { $limit: 20 }, { $addFields: { ciBuildId: "$_id", createdAt: { $arrayElemAt: [ "$runs.createdAt", -1 ] }, updatedAt: { $max: "$runs.progress.updatedAt" } } } ], cursor: {}, lsid: { id: UUID("12345678-1234-1234-1234-123456781234") }, $db: "sorry-cypress" } planSummary: IXSCAN { _id: 1 } numYields:16 queryHash:96B27E33 planCacheKey:96B27E33 ok:0 errMsg:"Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in." errName:Location16945 errCode:16945 reslen:188 locks:{ ReplicationStateTransition: { acquireCount: { w: 38 } }, Global: { acquireCount: { r: 38 } }, Database: { acquireCount: { r: 38 } }, Collection: { acquireCount: { r: 38 } }, Mutex: { acquireCount: { r: 22 } } } storage:{} protocol:op_msg 302ms
2023-02-25T10:27:43.884+0000 I  COMMAND  [conn16] command sorry-cypress.runs command: aggregate { aggregate: "runs", pipeline: [ { $match: {} }, { $sort: { _id: -1 } }, { $limit: 1000 }, { $group: { _id: "$meta.ciBuildId", runs: { $push: "$$ROOT" }, runId: { $min: "$_id" } } }, { $sort: { runId: -1 } }, { $limit: 20 }, { $addFields: { ciBuildId: "$_id", createdAt: { $arrayElemAt: [ "$runs.createdAt", -1 ] }, updatedAt: { $max: "$runs.progress.updatedAt" } } } ], cursor: {}, lsid: { id: UUID("12345678-1234-1234-1234-123456781234") }, $db: "sorry-cypress" } planSummary: IXSCAN { _id: 1 } numYields:17 queryHash:96B27E33 planCacheKey:96B27E33 ok:0 errMsg:"Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in." errName:Location16945 errCode:16945 reslen:188 locks:{ ReplicationStateTransition: { acquireCount: { w: 39 } }, Global: { acquireCount: { r: 39 } }, Database: { acquireCount: { r: 39 } }, Collection: { acquireCount: { r: 39 } }, Mutex: { acquireCount: { r: 22 } } } storage:{} protocol:op_msg 301ms

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 2
  • Comments: 16 (7 by maintainers)

Most upvoted comments

Just an update - v2.5.8 still has the problem:

2023-08-07T06:09:09.726+0000 I COMMAND [conn16] command sorry-cypress.runs command: aggregate { aggregate: "runs", pipeline: [ { $match: {} }, { $sort: { _id: -1 } }, { $limit: 1000 }, { $group: { _id: "$meta.ciBuildId", runs: { $push: "$$ROOT" }, runId: { $min: "$_id" } } }, { $sort: { runId: -1 } }, { $limit: 20 }, { $addFields: { ciBuildId: "$_id", createdAt: { $arrayElemAt: [ "$runs.createdAt", -1 ] }, updatedAt: { $max: "$runs.progress.updatedAt" } } } ], cursor: {}, lsid: { id: UUID("59a1d319-089f-4744-8dd6-6841a595f688") }, $db: "sorry-cypress" } planSummary: IXSCAN { _id: 1 } numYields:15 queryHash:96B27E33 planCacheKey:96B27E33 ok:0 errMsg:"Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in." errName:Location16945 errCode:16945 reslen:188 locks:{ ReplicationStateTransition: { acquireCount: { w: 36 } }, Global: { acquireCount: { r: 36 } }, Database: { acquireCount: { r: 36 } }, Collection: { acquireCount: { r: 36 } }, Mutex: { acquireCount: { r: 21 } } } storage:{} protocol:op_msg 267ms

This happens with or without the following settings for sorry-cypress-api:

# CI_BUILD_BATCH_SIZE: 5000
# PAGE_ITEMS_LIMIT: 100

That would be great, I will test the docker image as soon as it is released.

We have only 1 project, with currently over 300 runs, each with almost 800 specs with up to 3 attempts.