dagster: `dagster-webserver` memory leak
Dagster version
1.5.13
What's the issue?
`dagster-webserver` 1.5.13 seems to have some kind of memory leak. Since we updated to that version, we have observed a steady increase in memory usage over the last couple of weeks.
- The increase in memory usage correlates with the version change; no other change was introduced.
- We observe the same behaviour on 2 different GKE clusters.
- Reverting to `1.5.12` resolves the issue.
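For anyone who wants to put a number on the "steady increase" before and after a version change, here is a minimal sketch that fits a straight line through memory samples and reports the growth in MiB per day. The function name `growth_per_day` and the sample format (Unix seconds, RSS bytes) are assumptions made for illustration; they are not part of Dagster or of any monitoring tool mentioned in this thread.

```python
# Hypothetical helper: estimate memory growth from (timestamp, rss_bytes) samples.
# The sample format is an assumption; adapt it to whatever your metrics export gives you.

SECONDS_PER_DAY = 86400.0
BYTES_PER_MIB = 1024 * 1024


def growth_per_day(samples):
    """Return the least-squares slope of memory (MiB) against time (days)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    xs = [t / SECONDS_PER_DAY for t, _ in samples]
    ys = [b / BYTES_PER_MIB for _, b in samples]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all samples share the same timestamp")

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Example: three daily samples growing by 10 MiB each day.
daily = [
    (0 * 86400, 100 * BYTES_PER_MIB),
    (1 * 86400, 110 * BYTES_PER_MIB),
    (2 * 86400, 120 * BYTES_PER_MIB),
]
print(f"{growth_per_day(daily):.1f} MiB/day")
```

A slope near zero after a rollback, compared with a clearly positive slope before it, is a more convincing signal than eyeballing a dashboard over a few days.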
What did you expect to happen?
No response
How to reproduce?
No response
Deployment type
Dagster Helm chart
Deployment details
No response
Additional information
No response
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.
About this issue
- State: open
- Created 6 months ago
- Reactions: 8
- Comments: 22 (8 by maintainers)
How did that impact your memory usage? Technically you'll still retain ticks for up to 365 days, thus you should not see a change in behavior in just a few days. Or did I miss something?
I've applied a similar setting on my deployment as well (way stricter than yours, for testing) and my memory is still going up, same as before.
We are also having the same issue on 1.6.0, also ECS/Fargate
@jvyoralek: No, we found out that it's not working for us either. The initial indication that it was working was probably just a fluke.
Has anyone had success with the solution recommended by @stasharrofi ?
We have made changes, but it appears that the memory usage is still increasing.
I see `anyio` 4.3 in the log.
EDIT: We found out that the following is actually not working. The initial indication might have just been a fluke.
~~We were having this issue and I believe that we have found the root cause to be a bug in `anyio` which leaked processes. The bug was introduced in `4.1.0` and fixed in `4.3.0` (last week): https://github.com/agronholm/anyio/issues/669~~
~~Dagster has a dependency on `anyio` through the following chain: `dagit --> dagster-webserver --> starlette --> anyio` and I believe that this issue started to appear for people whenever they rebuilt their Dagster image during the time that bug was present because a newer but buggy version of `anyio` would have been included in their docker image.~~
~~So, the solution could be to either explicitly require `anyio >= 4.3.0` or to wait until people rebuild their docker images and automatically get the bug-fixed version.~~
I was on version 1.5.10
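To rule the `anyio` theory in or out on a given image, one option is to check which `anyio` version is actually installed and whether it falls in the range quoted above (introduced in `4.1.0`, fixed in `4.3.0`). The sketch below uses only the standard library; the helper names are hypothetical, and the version bounds are taken from the comment above, not independently verified. Note that commenters in this thread report memory still growing with `anyio` 4.3, so a "not in range" result does not mean the leak is gone.

```python
import re
from importlib import metadata

# Bounds as quoted in this thread (assumption: not verified against anyio's changelog).
AFFECTED_FROM = (4, 1, 0)
FIXED_IN = (4, 3, 0)


def parse_version(text):
    """Turn '4.2.0' into (4, 2, 0); any suffix after the third number is ignored."""
    nums = [int(p) for p in re.findall(r"\d+", text)[:3]]
    while len(nums) < 3:
        nums.append(0)
    return tuple(nums)


def in_reported_range(text):
    """True if the version is >= 4.1.0 and < 4.3.0."""
    return AFFECTED_FROM <= parse_version(text) < FIXED_IN


def installed_anyio_version():
    """Return the installed anyio version string, or None if anyio is absent."""
    try:
        return metadata.version("anyio")
    except metadata.PackageNotFoundError:
        return None


version = installed_anyio_version()
if version is None:
    print("anyio is not installed in this environment")
else:
    print(f"anyio {version}: in reported range = {in_reported_range(version)}")
```

Running this inside the webserver container (rather than on a laptop) matters, because the version that counts is whichever one was resolved when the image was built.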
@aaaaahaaaaa did you find any reason why memory started growing? We have a similar issue and switching between versions didn't help yet - tried from 1.5.14 to 1.5.12.
The memory increase is quite noticeable, showing up even in daily granularity.
This issue seems to be isolated to the webserver component. Both the daemon and code servers are exhibiting stable memory usage. We are operating these as three separate containers within AWS ECS.
We have only one scheduled job active, and no sensors or auto-materialization so far. Assets are loaded from dbt.