serving: TFS doesn't handle grpc request deadline/timeout properly

Bug Report

System information

Linux 4.4.121-k8s #1 SMP Sun Mar 11 19:39:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
I observe the same behaviour on all of 1.3, 1.7 and 1.8.

Describe the problem

Even though I set a 30s deadline/timeout on every gRPC request, TensorFlow Serving doesn’t respect it: it still queues and runs every request to completion.

In my Go gRPC client, all requests have already returned with panic: rpc error: code = DeadlineExceeded desc = context deadline exceeded, but the TFS model server keeps showing incoming requests for a long time afterwards, and the request duration (processing time) keeps increasing from the usual 30s or so to 1m, 2m, 3m … until it has finished all requests.

What I expect is that the TFS model server ends a request and releases its resources immediately once the request has exceeded its deadline or the client has gone away.

Exact Steps to Reproduce

Use any gRPC client to send heavy requests to the TFS model server continuously, for example 1 request per second for 1 minute, set a short deadline/timeout on each request, and then observe whether the TFS model server respects the deadline/timeout.

About this issue

State: closed
Created 5 years ago
Comments: 31 (8 by maintainers)

Most upvoted comments

Closing due to 14 days of inactivity - please do re-open if you manage to get a stack trace for this 😃

misterpeddy on Aug 27, 2019

Hi Inshi,

We haven’t heard from you for 7+ days now. We will close this for now assuming you no longer need help on this issue. If you do need help, please feel free to reopen this issue.

Thanks

chanshah on Jun 21, 2019

we need thread stacks for all threads of the modelserver to help debug background activity (after RPC is terminated/cancelled from the client side).

afaik there is no debian package for pstack and lsstack – so installing these will be non-trivial (assuming they actually work). so please do not use these tools.

i’d recommend installing the elfutils package (apt install elfutils) on your system (where the modelserver is running) and then running /usr/bin/eu-stack -p <pid-of-modelserver> to dump the thread stacks of all threads.

ideally you want to collect stack traces (read: run eu-stack) after the RPC is done, but while the CPU usage is still high (on the server). this will help us see what (unnecessary) background activity is still happening.
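
if it helps, the collection step can be scripted. the following is a minimal Go sketch, assuming eu-stack is on the PATH and the caller is allowed to attach to the modelserver process; the output file names, the number of snapshots and the 5s spacing are arbitrary choices for the example.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"time"
)

// euStackCmd builds the eu-stack invocation for one process ID.
func euStackCmd(pid int) *exec.Cmd {
	return exec.Command("eu-stack", "-p", strconv.Itoa(pid))
}

// snapshot writes one dump of all thread stacks of pid to path.
func snapshot(pid int, path string) error {
	out, err := euStackCmd(pid).CombinedOutput()
	if err != nil {
		return fmt.Errorf("eu-stack failed: %w", err)
	}
	return os.WriteFile(path, out, 0o644)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: stackdump <pid-of-modelserver>")
		return
	}
	pid, err := strconv.Atoi(os.Args[1])
	if err != nil {
		fmt.Println("invalid pid:", err)
		return
	}
	// Take a few snapshots while CPU usage is still high after the RPCs ended.
	for i := 0; i < 3; i++ {
		path := fmt.Sprintf("modelserver-stack-%d.txt", i)
		if err := snapshot(pid, path); err != nil {
			fmt.Println(err)
			return
		}
		time.Sleep(5 * time.Second)
	}
}
```

several snapshots a few seconds apart make it easier to tell threads that are genuinely busy from threads that just happened to be running at one instant.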

netfs on Aug 14, 2019

Thanks Inshi. Just as an update, we are able to reproduce the time-out part. Folks are now collecting the necessary stack traces and investigating further.

minglotus-6 on Jun 13, 2019