hyper: Memory leak on high number of concurrent connections

Hyper version: 0.12.25

Context

Server

  • Run any hyper or hyper-derived (warp, tide, etc.) server that just serves a simple hello text response.
use log::info;
use warp::Filter;

fn main() -> Result<(), exitfailure::ExitFailure> {
    let hello = warp::any().map(|| "Hello there!");
    info!("starting");
    let addr: std::net::SocketAddr = "127.0.0.1:9000".parse()?;
    warp::serve(hello).run(addr);
    Ok(())
}

(For brevity, I’ve used the warp app, but a basic hyper sample behaves the same. Here are simple hello-world servers built on several hyper-based stacks (tide, warp, raw hyper, etc.) that I used to test: https://github.com/quixon-labs/playground-rs)

Boom! (Stress tests)

  • Start bombarding the server with multiple concurrent connections, increasing them gradually.
for concurrency in 100 500 1000 2000 5000 10000 15000 20000 30000; do 
   h2load -D10 -c "$concurrency" http://localhost:9000/hello
   sleep 5s
done

Memory observation

  • Observe the memory rise steadily but in sporadic intervals, hitting anywhere between 130MB and 300MB. (In my tests, release builds had significantly higher usage, since they handle a higher load and fewer connections end up being dropped as a result.)
  • Usage keeps increasing definitively as the concurrency increases, or sporadically otherwise. The only deterministic behaviour I’m able to observe is that it remains consistent when concurrency is not increased.
  • But memory never comes back down once it goes up, as one might expect even when there is no load and the system has had enough time to play catch-up: closing all TCP connections, settling IO, kernel swapping, and so on.
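To make the observation reproducible between load rounds, RSS can be sampled directly from procfs rather than eyeballed in a monitor. A minimal sketch, Linux-specific since it parses /proc/<pid>/status, and the helper names here are mine, not from the original repro:

```rust
use std::fs;

// Parse the "VmRSS:  12345 kB" line out of a /proc/<pid>/status dump.
fn parse_vm_rss(status: &str) -> Option<u64> {
    status
        .lines()
        .find(|l| l.starts_with("VmRSS:"))?
        .split_whitespace()
        .nth(1)?
        .parse()
        .ok()
}

// Read the resident set size (in kB) of an arbitrary pid (Linux only).
fn rss_kb(pid: u32) -> Option<u64> {
    let status = fs::read_to_string(format!("/proc/{}/status", pid)).ok()?;
    parse_vm_rss(&status)
}

fn main() {
    // Sample our own RSS as a demonstration; point this at the server's pid
    // between h2load rounds to check whether memory ever comes back down.
    if let Some(kb) = rss_kb(std::process::id()) {
        println!("VmRSS: {} kB", kb);
    }
}
```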

Screenshot from 2019-04-06 01-36-39

The process doesn’t go beyond 5 threads. Note that when tokio blocking functions are used, the thread count can rise all the way to 100 by default (those threads don’t go back down yet either), causing a significant memory bump. So I used a simpler example to rule that out as the cause.

Heap profiles

  • Running this with valgrind (massif):

Here’s the relevant bit:

(I’m not entirely sure what exactly is causing this. My first guess was some thread-local cache per connection, but I couldn’t find any evidence to correlate that yet.) I’m also not sure whether tokio or hyper is the cause, but it is consistently noticeable in all hyper-based servers.

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 62 83,195,458,515      213,409,040      212,742,984       666,056            0
99.69% (212,742,984B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->41.87% (89,363,456B) 0x958C5A: alloc::alloc::alloc (alloc.rs:75)
| ->41.76% (89,112,576B) 0x958A40: <alloc::alloc::Global as core::alloc::Alloc>::alloc (alloc.rs:151)
| | ->41.76% (89,112,576B) 0x95A490: alloc::raw_vec::RawVec<T,A>::allocate_in (raw_vec.rs:95)
| |   ->41.76% (89,112,576B) 0x95A284: alloc::raw_vec::RawVec<T>::with_capacity (raw_vec.rs:139)
| |     ->41.76% (89,112,576B) 0x957002: alloc::vec::Vec<T>::with_capacity (vec.rs:355)
| |       ->41.76% (89,112,576B) 0x955CF4: bytes::bytes::Inner::reserve (bytes.rs:2261)
| |       | ->41.76% (89,112,576B) 0x9542E6: bytes::bytes::BytesMut::reserve (bytes.rs:1444)
| |       |   ->41.76% (89,112,576B) 0x291A81: hyper::proto::h1::io::Buffered<T,B>::read_from_io (io.rs:175)
| |       |     ->41.76% (89,112,576B) 0x293D6C: hyper::proto::h1::io::Buffered<T,B>::parse (io.rs:160)
| |       |       ->41.76% (89,112,576B) 0x29B18F: hyper::proto::h1::conn::Conn<I,B,T>::read_head (conn.rs:137)
| |       |         ->41.76% (89,112,576B) 0x217ADA: hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_read_head (dispatch.rs:200)
| |       |           ->41.76% (89,112,576B) 0x218D96: hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_read (dispatch.rs:136)
| |       |             ->41.76% (89,112,576B) 0x214EEF: hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_inner (dispatch.rs:100)
| |       |               ->41.76% (89,112,576B) 0x214A67: hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_catch (dispatch.rs:86)
| |       |                 ->41.76% (89,112,576B) 0x21471E: <hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T> as futures::future::Future>::poll (dispatch.rs:348)
| |       |                   ->41.76% (89,112,576B) 0x249788: <futures::future::either::Either<A,B> as futures::future::Future>::poll (either.rs:35)
| |       |                     ->41.76% (89,112,576B) 0x28AE00: futures::future::option::<impl futures::future::Future for core::option::Option<F>>::poll (option.rs:12)
| |       |                       ->41.76% (89,112,576B) 0x24E556: <hyper::server::conn::upgrades::UpgradeableConnection<I,S,E> as futures::future::Future>::poll (conn.rs:934)
| |       |                         ->41.76% (89,112,576B) 0x241D49: <hyper::server::conn::spawn_all::NewSvcTask<I,N,S,E,W> as futures::future::Future>::poll (conn.rs:874)
| |       |                           ->41.76% (89,112,576B) 0x91EBE3: <alloc::boxed::Box<F> as futures::future::Future>::poll (mod.rs:113)
| |       |                             ->41.76% (89,112,576B) 0x7025E2: futures::task_impl::Spawn<T>::poll_future_notify::{{closure}} (mod.rs:326)
| |       |                               ->41.76% (89,112,576B) 0x70285F: futures::task_impl::Spawn<T>::enter::{{closure}} (mod.rs:396)
| |       |                                 ->41.76% (89,112,576B) 0x72E3C5: futures::task_impl::std::set (mod.rs:78)
| |       |                                   ->41.76% (89,112,576B) 0x7027FD: futures::task_impl::Spawn<T>::enter (mod.rs:396)
| |       |                                     ->41.76% (89,112,576B) 0x702396: futures::task_impl::Spawn<T>::poll_fn_notify (mod.rs:288)
| |       |                                       ->41.76% (89,112,576B) 0x702586: futures::task_impl::Spawn<T>::poll_future_notify (mod.rs:326)
| |       |                                         ->41.76% (89,112,576B) 0x70C200: tokio_threadpool::task::Task::run::{{closure}} (mod.rs:145)
| |       |                                           ->41.76% (89,112,576B) 0x715FE7: core::ops::function::FnOnce::call_once (function.rs:231)
| |       |                                             ->41.76% (89,112,576B) 0x704497: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (panic.rs:309)
| |       |                                               ->41.76% (89,112,576B) 0x7020D1: std::panicking::try::do_call (panicking.rs:293)
| |       |                                                 
| |       ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |       
| ->00.12% (250,880B) in 1+ places, all below ms_print's threshold (01.00%)
| 
->41.76% (89,112,618B) 0x828E4A: alloc::alloc::alloc (alloc.rs:75)
| ->41.76% (89,112,618B) 0x828C30: <alloc::alloc::Global as core::alloc::Alloc>::alloc (alloc.rs:151)
|   ->41.76% (89,112,618B) 0x827A90: alloc::raw_vec::RawVec<T,A>::allocate_in (raw_vec.rs:95)
|     ->41.76% (89,112,618B) 0x8278D4: alloc::raw_vec::RawVec<T>::with_capacity (raw_vec.rs:139)
|       ->41.76% (89,112,618B) 0x828852: alloc::vec::Vec<T>::with_capacity (vec.rs:355)
|         ->41.76% (89,112,576B) 0x290E55: hyper::proto::h1::io::WriteBuf<B>::new (io.rs:413)
|         | ->41.76% (89,112,576B) 0x292974: hyper::proto::h1::io::Buffered<T,B>::new (io.rs:65)
|         |   ->41.76% (89,112,576B) 0x29927E: hyper::proto::h1::conn::Conn<I,B,T>::new (conn.rs:39)
|         |     ->41.76% (89,112,576B) 0x1DD2A7: hyper::server::conn::Http<E>::serve_connection (conn.rs:374)
|         |       ->41.76% (89,112,576B) 0x1DFB9A: <hyper::server::conn::Connecting<I,F,E> as futures::future::Future>::poll (conn.rs:732)
|         |         ->41.76% (89,112,576B) 0x241ADF: <hyper::server::conn::spawn_all::NewSvcTask<I,N,S,E,W> as futures::future::Future>::poll (conn.rs:864)
|         |           ->41.76% (89,112,576B) 0x91EBE3: <alloc::boxed::Box<F> as futures::future::Future>::poll (mod.rs:113)
|         |             ->41.76% (89,112,576B) 0x7025E2: futures::task_impl::Spawn<T>::poll_future_notify::{{closure}} (mod.rs:326)
|         |               ->41.76% (89,112,576B) 0x70285F: futures::task_impl::Spawn<T>::enter::{{closure}} (mod.rs:396)
|         |                 ->41.76% (89,112,576B) 0x72E3C5: futures::task_impl::std::set (mod.rs:78)
|         |                   ->41.76% (89,112,576B) 0x7027FD: futures::task_impl::Spawn<T>::enter (mod.rs:396)
|         |                     ->41.76% (89,112,576B) 0x702396: futures::task_impl::Spawn<T>::poll_fn_notify (mod.rs:288)
|         |                       ->41.76% (89,112,576B) 0x702586: futures::task_impl::Spawn<T>::poll_future_notify (mod.rs:326)
|         |                         ->41.76% (89,112,576B) 0x70C200: tokio_threadpool::task::Task::run::{{closure}} (mod.rs:145)
|         |                           ->41.76% (89,112,576B) 0x715FE7: core::ops::function::FnOnce::call_once (function.rs:231)
|         |                             ->41.76% (89,112,576B) 0x704497: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (panic.rs:309)
|         |                               ->41.76% (89,112,576B) 0x7020D1: std::panicking::try::do_call (panicking.rs:293)
|         |                                 ->41.76% (89,112,576B) 0x979428: __rust_maybe_catch_panic (lib.rs:87)
|         |                                   ->41.76% (89,112,576B) 0x701E0E: std::panicking::try (panicking.rs:272)
|         |                                     ->41.76% (89,112,576B) 0x70457F: std::panic::catch_unwind (panic.rs:388)
|         |                                       ->41.76% (89,112,576B) 0x70BC84: tokio_threadpool::task::Task::run (mod.rs:130)
|         |                                         ->41.76% (89,112,576B) 0x722C0D: tokio_threadpool::worker::Worker::run_task2 (mod.rs:567)
|         |                                           ->41.76% (89,112,576B) 0x7224D2: tokio_threadpool::worker::Worker::run_task (mod.rs:459)
|         |                                             ->41.44% (88,432,640B) 0x721B51: tokio_threadpool::worker::Worker::try_run_owned_task (mod.rs:390)
|         |                                             | ->41.44% (88,432,640B) 0x72158B: tokio_threadpool::worker::Worker::try_run_task (mod.rs:297)
|         |                                             |   
|         |                                             ->00.32% (679,936B) in 1+ places, all below ms_print's threshold (01.00%)
|         |                                             
|         ->00.00% (42B) in 1+ places, all below ms_print's threshold (01.00%)

Here’s the full massif profile: massif.out.zip

Note: The above massif profile is from a cargo build without release optimisations, which keeps the memory levels much lower. Running this on release builds at least doubles the usage compared to the debug builds.
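Both dominant stacks in the profile bottom out in per-connection buffers (Buffered::new pre-allocating a write buffer, BytesMut::reserve growing the read buffer). A pure-std sketch of that pattern, not hyper’s actual code, shows why such buffers hold memory for the lifetime of a connection even after the data is consumed:

```rust
// Illustration only: mimics the per-connection buffer pattern the profile
// points at, using Vec<u8> instead of BytesMut.
struct ConnBuf {
    read_buf: Vec<u8>,
}

impl ConnBuf {
    fn new() -> Self {
        // A connection's buffers are allocated up front.
        ConnBuf { read_buf: Vec::with_capacity(8 * 1024) }
    }

    fn read_from_io(&mut self, incoming: &[u8]) {
        // Like BytesMut::reserve, this may grow the allocation...
        self.read_buf.reserve(incoming.len());
        self.read_buf.extend_from_slice(incoming);
    }

    fn consume(&mut self) {
        // ...and clearing the buffer does NOT return the memory: the
        // capacity is retained for the next request on this connection.
        self.read_buf.clear();
    }
}

fn main() {
    let mut conn = ConnBuf::new();
    conn.read_from_io(&vec![0u8; 64 * 1024]);
    conn.consume();
    // Still >= 64 KiB of capacity, though the buffer is empty. Only dropping
    // `conn` (connection close) frees it back to the allocator, and the
    // allocator may in turn keep the pages rather than return them to the OS.
    assert!(conn.read_buf.capacity() >= 64 * 1024);
    assert!(conn.read_buf.is_empty());
    println!("retained capacity: {}", conn.read_buf.capacity());
}
```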


Edit:

I’m attaching another massif profile here (release build) that’s probably more interesting:

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 79  7,621,402,146      408,363,480      407,058,828     1,304,652            0
99.68% (407,058,828B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->41.75% (170,500,096B) 0x184EC8: hyper::proto::h1::io::Buffered<T,B>::new (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
| ->41.75% (170,500,096B) 0x186BAC: <hyper::server::conn::spawn_all::NewSvcTask<I,N,S,E,W> as futures::future::Future>::poll (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|   ->41.75% (170,500,096B) 0x2D3CB0: futures::task_impl::std::set (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|     ->41.75% (170,500,096B) 0x2D48A4: futures::task_impl::Spawn<T>::poll_future_notify (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|       ->41.75% (170,500,096B) 0x2D78DF: _ZN3std9panicking3try7do_call17hf61819960f116aefE.llvm.11685027487046046852 (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|         ->41.75% (170,500,096B) 0x347D38: __rust_maybe_catch_panic (lib.rs:87)
|           ->41.75% (170,500,096B) 0x2D7E7E: tokio_threadpool::task::Task::run (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|             ->41.75% (170,500,096B) 0x2D0463: tokio_threadpool::worker::Worker::run_task (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               ->39.95% (163,135,488B) 0x2CF95C: tokio_threadpool::worker::Worker::run (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               | ->39.95% (163,135,488B) 0x2B3705: tokio_trace_core::dispatcher::with_default (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |   ->39.95% (163,135,488B) 0x2B48F4: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |     ->39.95% (163,135,488B) 0x2B4ABC: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |       ->39.95% (163,135,488B) 0x2B4C42: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |         ->39.95% (163,135,488B) 0x2B8015: tokio::runtime::threadpool::builder::Builder::build::{{closure}} (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |           ->39.95% (163,135,488B) 0x2D8921: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |             ->39.95% (163,135,488B) 0x2D86FC: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |               ->39.95% (163,135,488B) 0x2DA56F: std::sys_common::backtrace::__rust_begin_short_backtrace (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |                 ->39.95% (163,135,488B) 0x2D789A: _ZN3std9panicking3try7do_call17h20721f9dd88a1855E.llvm.11685027487046046852 (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |                   ->39.95% (163,135,488B) 0x347D38: __rust_maybe_catch_panic (lib.rs:87)
|               |                     ->39.95% (163,135,488B) 0x2D259E: <F as alloc::boxed::FnBox<A>>::call_box (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               |                       ->39.95% (163,135,488B) 0x34715C: std::sys::unix::thread::Thread::new::thread_start (boxed.rs:759)
|               |                         ->39.95% (163,135,488B) 0x487D162: start_thread (pthread_create.c:486)
|               |                           ->39.95% (163,135,488B) 0x49CADED: clone (clone.S:95)
|               |                             
|               ->01.80% (7,364,608B) 0x2CFB67: tokio_threadpool::worker::Worker::run (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                 ->01.80% (7,364,608B) 0x2B3705: tokio_trace_core::dispatcher::with_default (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                   ->01.80% (7,364,608B) 0x2B48F4: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                     ->01.80% (7,364,608B) 0x2B4ABC: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                       ->01.80% (7,364,608B) 0x2B4C42: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         ->01.80% (7,364,608B) 0x2B8015: tokio::runtime::threadpool::builder::Builder::build::{{closure}} (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                           ->01.80% (7,364,608B) 0x2D8921: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                             ->01.80% (7,364,608B) 0x2D86FC: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                               ->01.80% (7,364,608B) 0x2DA56F: std::sys_common::backtrace::__rust_begin_short_backtrace (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                 ->01.80% (7,364,608B) 0x2D789A: _ZN3std9panicking3try7do_call17h20721f9dd88a1855E.llvm.11685027487046046852 (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                   ->01.80% (7,364,608B) 0x347D38: __rust_maybe_catch_panic (lib.rs:87)
|                                     ->01.80% (7,364,608B) 0x2D259E: <F as alloc::boxed::FnBox<A>>::call_box (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                       ->01.80% (7,364,608B) 0x34715C: std::sys::unix::thread::Thread::new::thread_start (boxed.rs:759)
|                                         ->01.80% (7,364,608B) 0x487D162: start_thread (pthread_create.c:486)
|                                           ->01.80% (7,364,608B) 0x49CADED: clone (clone.S:95)
|                                             
->41.75% (170,500,096B) 0x333E8A: bytes::bytes::Inner::reserve (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
| ->41.75% (170,500,096B) 0x184D6C: hyper::proto::h1::io::Buffered<T,B>::read_from_io (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|   ->41.75% (170,500,096B) 0x18585E: hyper::proto::h1::io::Buffered<T,B>::parse (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|     ->41.75% (170,500,096B) 0x166312: hyper::proto::h1::conn::Conn<I,B,T>::read_head (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|       ->41.75% (170,500,096B) 0x16C6F3: hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_catch (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|         ->41.75% (170,500,096B) 0x1A9B8F: <hyper::server::conn::upgrades::UpgradeableConnection<I,S,E> as futures::future::Future>::poll (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|           ->41.75% (170,500,096B) 0x1875B1: <hyper::server::conn::spawn_all::NewSvcTask<I,N,S,E,W> as futures::future::Future>::poll (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|             ->41.75% (170,500,096B) 0x2D3CB0: futures::task_impl::std::set (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|               ->41.75% (170,500,096B) 0x2D48A4: futures::task_impl::Spawn<T>::poll_future_notify (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                 ->41.75% (170,500,096B) 0x2D78DF: _ZN3std9panicking3try7do_call17hf61819960f116aefE.llvm.11685027487046046852 (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                   ->41.75% (170,500,096B) 0x347D38: __rust_maybe_catch_panic (lib.rs:87)
|                     ->41.75% (170,500,096B) 0x2D7E7E: tokio_threadpool::task::Task::run (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                       ->41.75% (170,500,096B) 0x2D0463: tokio_threadpool::worker::Worker::run_task (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         ->39.95% (163,135,488B) 0x2CF95C: tokio_threadpool::worker::Worker::run (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         | ->39.95% (163,135,488B) 0x2B3705: tokio_trace_core::dispatcher::with_default (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |   ->39.95% (163,135,488B) 0x2B48F4: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |     ->39.95% (163,135,488B) 0x2B4ABC: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |       ->39.95% (163,135,488B) 0x2B4C42: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |         ->39.95% (163,135,488B) 0x2B8015: tokio::runtime::threadpool::builder::Builder::build::{{closure}} (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |           ->39.95% (163,135,488B) 0x2D8921: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |             ->39.95% (163,135,488B) 0x2D86FC: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |               ->39.95% (163,135,488B) 0x2DA56F: std::sys_common::backtrace::__rust_begin_short_backtrace (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |                 ->39.95% (163,135,488B) 0x2D789A: _ZN3std9panicking3try7do_call17h20721f9dd88a1855E.llvm.11685027487046046852 (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |                   ->39.95% (163,135,488B) 0x347D38: __rust_maybe_catch_panic (lib.rs:87)
|                         |                     ->39.95% (163,135,488B) 0x2D259E: <F as alloc::boxed::FnBox<A>>::call_box (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                         |                       ->39.95% (163,135,488B) 0x34715C: std::sys::unix::thread::Thread::new::thread_start (boxed.rs:759)
|                         |                         ->39.95% (163,135,488B) 0x487D162: start_thread (pthread_create.c:486)
|                         |                           ->39.95% (163,135,488B) 0x49CADED: clone (clone.S:95)
|                         |                             
|                         ->01.80% (7,364,608B) 0x2CFB67: tokio_threadpool::worker::Worker::run (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                           ->01.80% (7,364,608B) 0x2B3705: tokio_trace_core::dispatcher::with_default (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                             ->01.80% (7,364,608B) 0x2B48F4: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                               ->01.80% (7,364,608B) 0x2B4ABC: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                 ->01.80% (7,364,608B) 0x2B4C42: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                   ->01.80% (7,364,608B) 0x2B8015: tokio::runtime::threadpool::builder::Builder::build::{{closure}} (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                     ->01.80% (7,364,608B) 0x2D8921: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                       ->01.80% (7,364,608B) 0x2D86FC: std::thread::local::LocalKey<T>::with (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                         ->01.80% (7,364,608B) 0x2DA56F: std::sys_common::backtrace::__rust_begin_short_backtrace (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                           ->01.80% (7,364,608B) 0x2D789A: _ZN3std9panicking3try7do_call17h20721f9dd88a1855E.llvm.11685027487046046852 (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                             ->01.80% (7,364,608B) 0x347D38: __rust_maybe_catch_panic (lib.rs:87)
|                                               ->01.80% (7,364,608B) 0x2D259E: <F as alloc::boxed::FnBox<A>>::call_box (in /home/pvl/workspace/repo/prasannavl/local/rusty/target/release/rusty)
|                                                 ->01.80% (7,364,608B) 0x34715C: std::sys::unix::thread::Thread::new::thread_start (boxed.rs:759)
|                                                   ->01.80% (7,364,608B) 0x487D162: start_thread (pthread_create.c:486)
|                                                     ->01.80% (7,364,608B) 0x49CADED: clone (clone.S:95)

Full massif profile [release build]: massif.release.out.zip

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Reactions: 3
  • Comments: 51 (17 by maintainers)

Most upvoted comments

I don’t think the issue should be closed without a solution.

As stated before, all updates are in the issue.

I have not been able to reproduce it myself. It’s difficult to pinpoint that this is actually leaked memory; it can also be a case of the allocator not releasing memory back to the OS. Does the behavior exist if you use jemalloc instead of the system allocator?
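For anyone wanting to try that experiment, swapping the global allocator is a small change. A sketch, assuming the `jemallocator` crate is added as a dependency (the attribute and type names below follow that crate’s documentation):

```rust
// Cargo.toml:  jemallocator = "0.3"
use jemallocator::Jemalloc;

// Route all heap allocations through jemalloc instead of the system allocator.
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;
```

If RSS behaves differently under jemalloc, that points at allocator retention rather than a leak in hyper itself.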

I’m inclined to close this (really this time) unless someone would like to dig more into this themselves, as I can’t trigger it.

@seanmonstar I ran into this issue with Tonic 0.7.2 using hyper 0.14.19 under tokio 1.19.2. The behavior was exactly the same with the system allocator and with jemalloc: the more load (requests/second) you put on the server, the more memory it allocates, and that memory is not freed. Decreased load is handled without further allocation, but once the load increases above the former level, additional memory gets allocated.

Running @brianbruggeman’s test case (which wrongly treats any RSS increase as a memory leak) multiple times against the same process, the process RSS didn’t exceed a certain size:

----------------------------------------
Settings:
----------------------------------------
Iterating 10 times
Loading 127.0.0.1:3030
Running for 3s each iteration
Using 100 workers
----------------------------------------
===============================
Iteration: 1 of 10
-------------------------------
Requests/sec: 48387.9238
Latency (worst case): 0.0072 secs
-------------------------------
Starting memory: 11448 KB
Ending memory: 11236 KB
Memory leaked: -212 KB
===============================
Iteration: 2 of 10
-------------------------------
Requests/sec: 69314.8785
Latency (worst case): 0.0041 secs
-------------------------------
Starting memory: 11236 KB
Ending memory: 11312 KB
Memory leaked: 76 KB
===============================
Iteration: 3 of 10
-------------------------------
Requests/sec: 69045.5330
Latency (worst case): 0.0041 secs
-------------------------------
Starting memory: 11312 KB
Ending memory: 11040 KB
Memory leaked: -272 KB
===============================
Iteration: 4 of 10
-------------------------------
Requests/sec: 65886.4996
Latency (worst case): 0.0042 secs
-------------------------------
Starting memory: 11040 KB
Ending memory: 10864 KB
Memory leaked: -176 KB
===============================
Iteration: 5 of 10
-------------------------------
Requests/sec: 64415.6591
Latency (worst case): 0.0042 secs
-------------------------------
Starting memory: 10864 KB
Ending memory: 11004 KB
Memory leaked: 140 KB
===============================
Iteration: 6 of 10
-------------------------------
Requests/sec: 58950.3058
Latency (worst case): 0.0054 secs
-------------------------------
Starting memory: 11004 KB
Ending memory: 10244 KB
Memory leaked: -760 KB
===============================
Iteration: 7 of 10
-------------------------------
Requests/sec: 43750.1205
Latency (worst case): 0.0095 secs
-------------------------------
Starting memory: 10244 KB
Ending memory: 10420 KB
Memory leaked: 176 KB
===============================
Iteration: 8 of 10
-------------------------------
Requests/sec: 60437.2641
Latency (worst case): 0.0047 secs
-------------------------------
Starting memory: 10420 KB
Ending memory: 10560 KB
Memory leaked: 140 KB
===============================
Iteration: 9 of 10
-------------------------------
Requests/sec: 57798.8734
Latency (worst case): 0.0049 secs
-------------------------------
Starting memory: 10560 KB
Ending memory: 10276 KB
Memory leaked: -284 KB
===============================
Iteration: 10 of 10
-------------------------------
Requests/sec: 46854.6004
Latency (worst case): 0.0106 secs
-------------------------------
Starting memory: 10276 KB
Ending memory: 10612 KB
Memory leaked: 336 KB
===============================

----------------------------------------
Settings:
----------------------------------------
Iterating 10 times
Loading 127.0.0.1:3030
Running for 3s each iteration
Using 100 workers
----------------------------------------
===============================
Iteration: 1 of 10
-------------------------------
Requests/sec: 69239.5438
Latency (worst case): 0.0041 secs
-------------------------------
Starting memory: 10612 KB
Ending memory: 10936 KB
Memory leaked: 324 KB
===============================
Iteration: 2 of 10
-------------------------------
Requests/sec: 68014.9964
Latency (worst case): 0.0042 secs
-------------------------------
Starting memory: 10936 KB
Ending memory: 10812 KB
Memory leaked: -124 KB
===============================
Iteration: 3 of 10
-------------------------------
Requests/sec: 65241.9599
Latency (worst case): 0.0044 secs
-------------------------------
Starting memory: 10812 KB
Ending memory: 11176 KB
Memory leaked: 364 KB
===============================
Iteration: 4 of 10
-------------------------------
Requests/sec: 64965.6894
Latency (worst case): 0.0044 secs
-------------------------------
Starting memory: 11176 KB
Ending memory: 11488 KB
Memory leaked: 312 KB
===============================
Iteration: 5 of 10
-------------------------------
Requests/sec: 64656.9231
Latency (worst case): 0.0044 secs
-------------------------------
Starting memory: 11488 KB
Ending memory: 11156 KB
Memory leaked: -332 KB
===============================
Iteration: 6 of 10
-------------------------------
Requests/sec: 62959.1025
Latency (worst case): 0.0045 secs
-------------------------------
Starting memory: 11156 KB
Ending memory: 11476 KB
Memory leaked: 320 KB
===============================
Iteration: 7 of 10
-------------------------------
Requests/sec: 63990.3273
Latency (worst case): 0.0043 secs
-------------------------------
Starting memory: 11476 KB
Ending memory: 11748 KB
Memory leaked: 272 KB
===============================
Iteration: 8 of 10
-------------------------------
Requests/sec: 61384.1161
Latency (worst case): 0.0055 secs
-------------------------------
Starting memory: 11748 KB
Ending memory: 11764 KB
Memory leaked: 16 KB
===============================
Iteration: 9 of 10
-------------------------------
Requests/sec: 65772.4001
Latency (worst case): 0.0041 secs
-------------------------------
Starting memory: 11764 KB
Ending memory: 11460 KB
Memory leaked: -304 KB
===============================
Iteration: 10 of 10
-------------------------------
Requests/sec: 60818.1767
Latency (worst case): 0.0052 secs
-------------------------------
Starting memory: 11460 KB
Ending memory: 11552 KB
Memory leaked: 92 KB
===============================

----------------------------------------
Settings:
----------------------------------------
Iterating 10 times
Loading 127.0.0.1:3030
Running for 3s each iteration
Using 100 workers
----------------------------------------
===============================
Iteration: 1 of 10
-------------------------------
Requests/sec: 69513.3845
Latency (worst case): 0.0043 secs
-------------------------------
Starting memory: 11552 KB
Ending memory: 11312 KB
Memory leaked: -240 KB
===============================
Iteration: 2 of 10
-------------------------------
Requests/sec: 68651.8908
Latency (worst case): 0.0041 secs
-------------------------------
Starting memory: 11312 KB
Ending memory: 11940 KB
Memory leaked: 628 KB
===============================
Iteration: 3 of 10
-------------------------------
Requests/sec: 66253.8193
Latency (worst case): 0.0043 secs
-------------------------------
Starting memory: 11940 KB
Ending memory: 11496 KB
Memory leaked: -444 KB
===============================
Iteration: 4 of 10
-------------------------------
Requests/sec: 63073.4260
Latency (worst case): 0.0047 secs
-------------------------------
Starting memory: 11496 KB
Ending memory: 11412 KB
Memory leaked: -84 KB
===============================
Iteration: 5 of 10
-------------------------------
Requests/sec: 41630.8405
Latency (worst case): 0.0098 secs
-------------------------------
Starting memory: 11412 KB
Ending memory: 11516 KB
Memory leaked: 104 KB
===============================
Iteration: 6 of 10
-------------------------------
Requests/sec: 64138.8079
Latency (worst case): 0.0044 secs
-------------------------------
Starting memory: 11516 KB
Ending memory: 11800 KB
Memory leaked: 284 KB
===============================
Iteration: 7 of 10
-------------------------------
Requests/sec: 65103.9589
Latency (worst case): 0.0043 secs
-------------------------------
Starting memory: 11800 KB
Ending memory: 11800 KB
Memory leaked: 0 KB
===============================
Iteration: 8 of 10
-------------------------------
Requests/sec: 64948.5499
Latency (worst case): 0.0044 secs
-------------------------------
Starting memory: 11800 KB
Ending memory: 12272 KB
Memory leaked: 472 KB
===============================
Iteration: 9 of 10
-------------------------------
Requests/sec: 64483.5093
Latency (worst case): 0.0043 secs
-------------------------------
Starting memory: 12272 KB
Ending memory: 12400 KB
Memory leaked: 128 KB
===============================
Iteration: 10 of 10
-------------------------------
Requests/sec: 65063.2982
Latency (worst case): 0.0043 secs
-------------------------------
Starting memory: 12400 KB
Ending memory: 11956 KB
Memory leaked: -444 KB
===============================

----------------------------------------
Settings:
----------------------------------------
Iterating 10 times
Loading 127.0.0.1:3030
Running for 3s each iteration
Using 100 workers
----------------------------------------
===============================
Iteration: 1 of 10
-------------------------------
Requests/sec: 71430.5112
Latency (worst case): 0.0039 secs
-------------------------------
Starting memory: 11956 KB
Ending memory: 11356 KB
Memory leaked: -600 KB
===============================
Iteration: 2 of 10
-------------------------------
Requests/sec: 68816.3673
Latency (worst case): 0.0042 secs
-------------------------------
Starting memory: 11356 KB
Ending memory: 11292 KB
Memory leaked: -64 KB
===============================
Iteration: 3 of 10
-------------------------------
Requests/sec: 67381.1404
Latency (worst case): 0.0043 secs
-------------------------------
Starting memory: 11292 KB
Ending memory: 11500 KB
Memory leaked: 208 KB
===============================
Iteration: 4 of 10
-------------------------------
Requests/sec: 65377.7233
Latency (worst case): 0.0045 secs
-------------------------------
Starting memory: 11500 KB
Ending memory: 11492 KB
Memory leaked: -8 KB
===============================
Iteration: 5 of 10
-------------------------------
Requests/sec: 61778.3874
Latency (worst case): 0.0049 secs
-------------------------------
Starting memory: 11492 KB
Ending memory: 11068 KB
Memory leaked: -424 KB
===============================
Iteration: 6 of 10
-------------------------------
Requests/sec: 61428.5839
Latency (worst case): 0.0047 secs
-------------------------------
Starting memory: 11068 KB
Ending memory: 11372 KB
Memory leaked: 304 KB
===============================
Iteration: 7 of 10
-------------------------------
Requests/sec: 61004.0751
Latency (worst case): 0.0048 secs
-------------------------------
Starting memory: 11372 KB
Ending memory: 11972 KB
Memory leaked: 600 KB
===============================
Iteration: 8 of 10
-------------------------------
Requests/sec: 60177.3556
Latency (worst case): 0.0049 secs
-------------------------------
Starting memory: 11972 KB
Ending memory: 11360 KB
Memory leaked: -612 KB
===============================
Iteration: 9 of 10
-------------------------------
Requests/sec: 56961.2634
Latency (worst case): 0.0053 secs
-------------------------------
Starting memory: 11360 KB
Ending memory: 11600 KB
Memory leaked: 240 KB
===============================
Iteration: 10 of 10
-------------------------------
Requests/sec: 61224.8200
Latency (worst case): 0.0049 secs
-------------------------------
Starting memory: 11600 KB
Ending memory: 11540 KB
Memory leaked: -60 KB
===============================

----------------------------------------
Settings:
----------------------------------------
Iterating 10 times
Loading 127.0.0.1:3030
Running for 3s each iteration
Using 100 workers
----------------------------------------
===============================
Iteration: 1 of 10
-------------------------------
Requests/sec: 67056.4463
Latency (worst case): 0.0043 secs
-------------------------------
Starting memory: 11540 KB
Ending memory: 11964 KB
Memory leaked: 424 KB
===============================
Iteration: 2 of 10
-------------------------------
Requests/sec: 66026.3226
Latency (worst case): 0.0042 secs
-------------------------------
Starting memory: 11964 KB
Ending memory: 12140 KB
Memory leaked: 176 KB
===============================
Iteration: 3 of 10
-------------------------------
Requests/sec: 66627.3795
Latency (worst case): 0.0044 secs
-------------------------------
Starting memory: 12140 KB
Ending memory: 12108 KB
Memory leaked: -32 KB
===============================
Iteration: 4 of 10
-------------------------------
Requests/sec: 66494.5235
Latency (worst case): 0.0041 secs
-------------------------------
Starting memory: 12108 KB
Ending memory: 11924 KB
Memory leaked: -184 KB
===============================
Iteration: 5 of 10
-------------------------------
Requests/sec: 55494.9673
Latency (worst case): 0.0061 secs
-------------------------------
Starting memory: 11924 KB
Ending memory: 11860 KB
Memory leaked: -64 KB
===============================
Iteration: 6 of 10
-------------------------------
Requests/sec: 46438.2939
Latency (worst case): 0.0071 secs
-------------------------------
Starting memory: 11860 KB
Ending memory: 12332 KB
Memory leaked: 472 KB
===============================
Iteration: 7 of 10
-------------------------------
Requests/sec: 65939.5318
Latency (worst case): 0.0041 secs
-------------------------------
Starting memory: 12332 KB
Ending memory: 12436 KB
Memory leaked: 104 KB
===============================
Iteration: 8 of 10
-------------------------------
Requests/sec: 64543.1936
Latency (worst case): 0.0044 secs
-------------------------------
Starting memory: 12436 KB
Ending memory: 12412 KB
Memory leaked: -24 KB
===============================
Iteration: 9 of 10
-------------------------------
Requests/sec: 65390.4177
Latency (worst case): 0.0041 secs
-------------------------------
Starting memory: 12412 KB
Ending memory: 12596 KB
Memory leaked: 184 KB
===============================
Iteration: 10 of 10
-------------------------------
Requests/sec: 64624.9731
Latency (worst case): 0.0042 secs
-------------------------------
Starting memory: 12596 KB
Ending memory: 11808 KB
Memory leaked: -788 KB
===============================

----------------------------------------
Settings:
----------------------------------------
Iterating 10 times
Loading 127.0.0.1:3030
Running for 3s each iteration
Using 100 workers
----------------------------------------
===============================
Iteration: 1 of 10
-------------------------------
Requests/sec: 72250.3533
Latency (worst case): 0.0038 secs
-------------------------------
Starting memory: 11808 KB
Ending memory: 11624 KB
Memory leaked: -184 KB
===============================
Iteration: 2 of 10
-------------------------------
Requests/sec: 70754.5880
Latency (worst case): 0.0038 secs
-------------------------------
Starting memory: 11624 KB
Ending memory: 11512 KB
Memory leaked: -112 KB
===============================
Iteration: 3 of 10
-------------------------------
Requests/sec: 65921.8890
Latency (worst case): 0.0046 secs
-------------------------------
Starting memory: 11512 KB
Ending memory: 11608 KB
Memory leaked: 96 KB
===============================
Iteration: 4 of 10
-------------------------------
Requests/sec: 64142.0923
Latency (worst case): 0.0043 secs
-------------------------------
Starting memory: 11608 KB
Ending memory: 11408 KB
Memory leaked: -200 KB
===============================
Iteration: 5 of 10
-------------------------------
Requests/sec: 63555.2797
Latency (worst case): 0.0044 secs
-------------------------------
Starting memory: 11408 KB
Ending memory: 11760 KB
Memory leaked: 352 KB
===============================
Iteration: 6 of 10
-------------------------------
Requests/sec: 57089.7501
Latency (worst case): 0.0058 secs
-------------------------------
Starting memory: 11760 KB
Ending memory: 12104 KB
Memory leaked: 344 KB
===============================
Iteration: 7 of 10
-------------------------------
Requests/sec: 48886.2883
Latency (worst case): 0.0069 secs
-------------------------------
Starting memory: 12104 KB
Ending memory: 11560 KB
Memory leaked: -544 KB
===============================
Iteration: 8 of 10
-------------------------------
Requests/sec: 48431.9411
Latency (worst case): 0.0065 secs
-------------------------------
Starting memory: 11560 KB
Ending memory: 12204 KB
Memory leaked: 644 KB
===============================
Iteration: 9 of 10
-------------------------------
Requests/sec: 56847.6384
Latency (worst case): 0.0058 secs
-------------------------------
Starting memory: 12204 KB
Ending memory: 11816 KB
Memory leaked: -388 KB
===============================
Iteration: 10 of 10
-------------------------------
Requests/sec: 63177.3175
Latency (worst case): 0.0047 secs
-------------------------------
Starting memory: 11816 KB
Ending memory: 11936 KB
Memory leaked: 120 KB
===============================

Furthermore, when attaching a tool like Heaptrack to warp’s process (heaptrack warp-test), you can see there are no actual leaks: of 121.25 GB allocated, only 2.6 MB is reported as leaked, and when you check the locations, you can see they are Buffers (tests were run using the default allocator on Linux 5.6.12-arch1-1).

I was having the same issue and switched to the MiMalloc allocator. It releases memory instantly - from 2GB under load back to 50MB when idle, in contrast to the default allocator, which peaks at 3GB and stays at 1.7GB when idle. MiMalloc not only consumes less memory but also performs better in my case; I benchmarked both on my Linux machine and in a Docker container with resources limited to 1 CPU and 1GB. I think I’m going to stick with MiMalloc for now.
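For reference, switching a Rust binary to MiMalloc is only a couple of lines. This is a minimal sketch (not taken from the thread above), assuming the mimalloc crate has been added to Cargo.toml; it is essentially an allocator configuration fragment rather than application logic:

```rust
// Cargo.toml: mimalloc = "0.1"
use mimalloc::MiMalloc;

// Route every heap allocation in this process through MiMalloc
// instead of the system/default allocator.
#[global_allocator]
static GLOBAL: MiMalloc = MiMalloc;

fn main() {
    // All allocations below (and in any library code) now use MiMalloc.
    let v: Vec<u8> = vec![0; 1024];
    println!("allocated {} bytes via MiMalloc", v.len());
}
```

The same pattern works for jemalloc via the tikv-jemallocator crate, so it is easy to A/B the allocators against the benchmark above.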

Did anyone ever resolve this?

If it does appear that the application is explicitly holding onto allocated memory for too long, jemalloc also has profiling features you can use to see what sections of code are performing those allocations: https://github.com/jemalloc/jemalloc/wiki/Use-Case%3A-Leak-Checking

I can reproduce similar behavior with the system allocator as well as mimalloc; but if we’re going to fix this, we’ll need to get a lot more detailed about the specific behavior that exacerbates the problem.

I’m curious whether this can be reproduced in an application that uses Tokio but not Hyper. Hyper spawns a task for every connection (basically). If we write a program that spawns tens of thousands of tasks concurrently and then sits idle for some amount of time, does the OS reclaim memory, or does the process’s RSS remain at its peak?

I also think it’s more of a tokio issue than a hyper one, and it will be very interesting to try what you’ve suggested. I noticed that it happens to me only when I’m doing some expensive operation, like serializing a huge payload before sending it back to the client. My case is that on each request, I fetch a huge payload from some other server (with reqwest). If I don’t do anything with the result and just send back "hello" to the client, for example, the memory stays fine. If I try to deserialize the payload with serde_json, the memory stacks up little by little with every request. Something like this:

pub async fn get_data_route(Extension(client): Extension<Arc<MyClient>>) -> impl IntoResponse {
    // Fetching a huge payload
    let res = client
        .body(...)
        .send()
        .await
        .unwrap();

    // Without this part the memory seems to be stable
    let data = res.json::<Foo>().await.unwrap();

    (StatusCode::OK, Json("hello world"))
}

Or it could be an issue with hyper here: https://github.com/hyperium/hyper/blob/master/src/body/to_bytes.rs#L46, since that is what the json method calls into.

It could also be that there’s a general problem with spawning tasks; I’m wondering how it behaves with tokio_uring.

Hello there!

I have the same problem with async-graphql + warp. On Linux, memory doesn’t return to the system. But if I launch the program on Windows, memory returns to normal (docker-desktop on Windows doesn’t return memory either). jemalloc doesn’t solve the problem.