ray: [Serve] Memory leak after upgrading from 2.8.1 to 2.9.0

What happened + What you expected to happen

After upgrading Ray from 2.8.1 to 2.9.0, we noticed a memory leak (`ray_node_mem_used`) on the head node (screenshot attached).

Versions / Dependencies

2.9.0

Reproduction script

Load is 200 QPS

import logging

from fastapi import FastAPI, Request
from fastapi.encoders import jsonable_encoder
from ray import serve
from starlette.responses import JSONResponse

logger = logging.getLogger("ray.serve")

app = FastAPI()


@serve.deployment
@serve.ingress(app)
class ModelServer:
    def __init__(self):
        logger.info("Initialized")

    @app.post("/inference")
    def inference(self, request: Request) -> JSONResponse:
        response = {"result": "OK"}
        return JSONResponse(content=jsonable_encoder(response))


esp_model_app = ModelServer.bind()
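The issue states the leak appears under a load of 200 QPS. As a hedged sketch of how that load might be generated against the deployment above (the endpoint URL, target QPS, and the `send_load` helper are assumptions, not from the issue thread):

```python
import time
import urllib.request


def send_load(url: str, qps: int = 200, duration_s: int = 60) -> int:
    """Fire POST requests at roughly a fixed rate; returns the number attempted.

    Hypothetical load generator for the repro above; the real load source
    used by the reporter is not described in the issue.
    """
    sent = 0
    interval = 1.0 / qps
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        req = urllib.request.Request(url, data=b"{}", method="POST")
        try:
            urllib.request.urlopen(req, timeout=5).read()
        except OSError:
            pass  # count the attempt even if the server drops a request
        sent += 1
        time.sleep(interval)
    return sent
```

With the app deployed (e.g. via `serve run module:esp_model_app`), this would be pointed at the Serve HTTP proxy, e.g. `send_load("http://127.0.0.1:8000/inference")`.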

Issue Severity

High: It blocks me from completing my task.

About this issue

  • Original URL
  • State: closed
  • Created 6 months ago
  • Comments: 17 (13 by maintainers)

Most upvoted comments

I retried the repro with the fix. There’s no longer a memory leak:

(screenshot attached, 2024-01-09)

I’ve applied the suggested workaround and can confirm things are looking better with it (screenshot attached).
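One way to sanity-check a memory trend like this locally, independent of the dashboard metric, is to sample the process's peak RSS over time; a steadily growing series suggests a leak. This is a hedged sketch using only the standard library (the `sample_rss` helper is hypothetical, not from the thread):

```python
import resource
import time


def sample_rss(duration_s: float = 60.0, interval_s: float = 1.0) -> list[int]:
    """Sample this process's peak RSS at a fixed interval.

    ``ru_maxrss`` is reported in KiB on Linux and bytes on macOS; only the
    trend across samples matters here, not the absolute unit.
    """
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        samples.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
        time.sleep(interval_s)
    return samples
```

A flat series under sustained load is consistent with the "memory looks stable" observation; a monotonically increasing one points back at a leak.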

Looking forward to the final fix. Thanks @shrekris-anyscale and @rickyyx for the extra effort on the investigation.

The memory looks stable after removing it. I’ll sync with @rickyyx to see how to fix it.

This logic doesn’t exist in 2.8.1, so it’s highly likely the cause.

I’ll take a look.