serving: Timeout issue on long requests

/area networking

Hi, I’m trying to use Knative functions to make streaming HTTP requests. It works well, but I have an issue with long requests: after 5 minutes, queue-proxy returns a timeout error and my request is closed.

I’ve tried some solutions, but they led to other issues. First, I modified the ConfigMap named “config-defaults” (in the knative-serving namespace):

max-revision-timeout-seconds: '21600' # 6 hours
revision-timeout-seconds: '21600' # 6 hours
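
For reference, a minimal sketch of how these two keys would sit in the full ConfigMap manifest. The keys, ConfigMap name, and namespace come from the description above; the 6-hour value is simply the one tried in this issue, not a recommendation.

```yaml
# Sketch only: cluster-wide timeout defaults for Knative Serving.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-defaults
  namespace: knative-serving
data:
  # Default request timeout applied to revisions that do not set their own.
  revision-timeout-seconds: "21600"      # 6 hours
  # Upper bound a revision may request via timeoutSeconds.
  max-revision-timeout-seconds: "21600"  # 6 hours
```

Note that ConfigMap values are strings, so the numbers need to be quoted.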

It works well, BUT when the pod is terminated, it stays stuck in Terminating status. I discovered that Knative copies max-revision-timeout-seconds to terminationGracePeriodSeconds. So I tried to patch my Knative ksvc YAML to override the terminationGracePeriodSeconds parameter (normally it is in a PodSpec), but it seems impossible to change it through Knative.

Can you tell me how to set up the configuration for my use case, please?

Nicolas

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 7
  • Comments: 23 (9 by maintainers)

Most upvoted comments

In that case, since it’s operator-specific, let’s move the discussion over there… I opened https://github.com/knative/operator/issues/1295 with your question.
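
For readers who install Knative through the operator: settings for the Serving ConfigMaps are normally declared on the KnativeServing resource rather than edited in place, since the operator may reconcile manual changes away. A minimal sketch, assuming the `operator.knative.dev/v1beta1` API version and a resource named `knative-serving` (both are assumptions to check against your installation):

```yaml
# Sketch only: timeout defaults declared through the Knative operator.
apiVersion: operator.knative.dev/v1beta1   # assumption: may differ on older operator releases
kind: KnativeServing
metadata:
  name: knative-serving                    # assumed resource name
  namespace: knative-serving
spec:
  config:
    # "defaults" maps to the config-defaults ConfigMap.
    defaults:
      revision-timeout-seconds: "21600"
      max-revision-timeout-seconds: "21600"
```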

You might also want to look into the async component.

cc @AngeloDanducci

Hi, thank you for your answers.

I have continued to investigate and would like to share what I see.

  • If I request a function that sends me (as an HTTP response) a stream that takes more than 15 minutes: it works.
  • If I request a function that receives (in an HTTP request) a stream that I send, and sends me back a result after 15 minutes, the connection is cut (depending on the timeoutSeconds value), even if I’m sending data (uploading) during this time.
  • If I send and receive streams at the same time, it works, but sometimes I have other reliability problems with envoy-istio (upstream_reset_after_response_started{protocol_error})
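
The timeoutSeconds value mentioned in the second case is set per revision on the Knative Service. A minimal sketch of where it goes; the service name, namespace, image, and the one-hour value are hypothetical placeholders, and the value cannot exceed max-revision-timeout-seconds from config-defaults:

```yaml
# Sketch only: per-revision request timeout on a Knative Service.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: streaming-fn            # hypothetical name
  namespace: default            # hypothetical namespace
spec:
  template:
    spec:
      # Request timeout for this revision, in seconds.
      timeoutSeconds: 3600      # hypothetical value (1 hour)
      containers:
        - image: example.com/streaming-fn:latest   # hypothetical image
```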

Maybe Knative only inspects the response stream to detect a timeout? Do you know how the second case can work in Knative?

Thanks @skonto, your solution works for me.