kyma: Knative Services exposed via Istio return 503 after being scaled down to 0
Description
Sometimes a Knative service exposed through the Istio Envoy ingress returns 503 upstream connect error or disconnect/reset before headers. reset reason: connection failure. This happens intermittently, even when the function pod is up and running.
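To confirm the correlation with scale-to-zero, it helps to watch the function's pods while sending requests. A minimal sketch, assuming the function is named sample, lives in the serverless-system namespace (as in the logs below), and that its revision pods carry the standard serving.knative.dev/service label:
# Watch the revision pods; after the scale-to-zero grace period they
# terminate, and the next request is routed through the Knative activator.
kubectl get pods -n serverless-system -l serving.knative.dev/service=sample -w
# Check the Knative service's URL and readiness.
kubectl get ksvc -n serverless-system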
Expected result
The function should respond with HTTP 200 and its return value ('Hello World').
Actual result
* Rebuilt URL to: http://sample.serverless-system.34.76.171.216.xip.io/
* Trying 34.76.171.216...
* TCP_NODELAY set
* Connected to sample.serverless-system.34.76.171.216.xip.io (34.76.171.216) port 80 (#0)
> GET / HTTP/1.1
> Host: sample.serverless-system.34.76.171.216.xip.io
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< location: https://sample.serverless-system.34.76.171.216.xip.io/
< date: Mon, 19 Aug 2019 13:13:07 GMT
< server: istio-envoy
< content-length: 0
<
* Connection #0 to host sample.serverless-system.34.76.171.216.xip.io left intact
* Issue another request to this URL: 'https://sample.serverless-system.34.76.171.216.xip.io/'
* Trying 34.76.171.216...
* TCP_NODELAY set
* Connected to sample.serverless-system.34.76.171.216.xip.io (34.76.171.216) port 443 (#1)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=34.76.171.216.xip.io
* start date: Aug 6 13:12:30 2019 GMT
* expire date: Sep 5 13:12:30 2019 GMT
* issuer: CN=34.76.171.216.xip.io
* SSL certificate verify result: self signed certificate (18), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fe05a00d600)
> GET / HTTP/2
> Host: sample.serverless-system.34.76.171.216.xip.io
> User-Agent: curl/7.54.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 503
< content-length: 91
< content-type: text/plain
< date: Mon, 19 Aug 2019 13:13:07 GMT
< server: istio-envoy
<
* Connection #1 to host sample.serverless-system.34.76.171.216.xip.io left intact
upstream connect error or disconnect/reset before headers. reset reason: connection failure%
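Because the failure is intermittent, a simple request loop makes it easier to catch. A minimal sketch against the same xip.io host as above (-k skips verification of the self-signed certificate):
# Print only the status code for each request; a mix of 200s and 503s
# reproduces the intermittent connection failure.
for i in $(seq 1 50); do
  curl -sk -o /dev/null -w "%{http_code}\n" https://sample.serverless-system.34.76.171.216.xip.io/
  sleep 1
done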
Workaround
Restart the ingress-gateway pod:
kubectl delete pod -l app=istio-ingressgateway -n istio-system
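Deleting the pod leaves a short gap until the replacement is ready. On clusters with kubectl 1.15+, a rolling restart of the gateway deployment (assuming the default istio-ingressgateway deployment name) achieves the same reset with less disruption:
kubectl rollout restart deployment/istio-ingressgateway -n istio-system
kubectl rollout status deployment/istio-ingressgateway -n istio-system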
Steps to reproduce
Deploy the function-controller in the serverless-controller namespace.
Apply the following Function manifest:
kubectl apply -f - <<EOF
apiVersion: serverless.kyma-project.io/v1alpha1
kind: Function
metadata:
  name: sample
  labels:
    foo: bar
spec:
  function: |
    module.exports = {
      main: function(event, context) {
        return 'Hello World'
      }
    }
  functionContentType: "plaintext"
  size: "L"
  runtime: "nodejs8"
EOF
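Once the Function reports ready, the failure can be checked with the same request as in the logs above; this assumes the Function ends up in the serverless-system namespace, as the hostname suggests:
curl -kv https://sample.serverless-system.34.76.171.216.xip.io/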
About this issue
- State: closed
- Created 5 years ago
- Comments: 17 (9 by maintainers)
We are also seeing the 503s, with the activator complaining about
error roundtripping http://x.x.x.x:80/healthz: context deadline exceeded
At this point we see no errors in the istio-ingressgateway, and the only other error is
leaderelection.go:360] Failed to update lock: Operation cannot be fulfilled on leases.coordination.k8s.io "autoscaler-bucket-00-of-01": the object has been modified; please apply your changes to the latest version and try again
in the autoscaler. Any leads on how we can dig further?

It appears that these may be known issues in Knative Serving. It looks like the 0.8.x releases fix a lot of 503-related connection issues:
https://github.com/knative/serving/issues/4752
https://github.com/knative/serving/issues/4281
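For digging further before upgrading, a minimal sketch of where to look, assuming the Knative Serving components run in the standard knative-serving namespace:
# The activator proxies requests for scaled-to-zero revisions, so
# gateway-to-activator connection failures usually surface here first.
kubectl logs -n knative-serving deploy/activator --tail=100
# The autoscaler drives scale-to-zero; the lease-update error above is
# typically transient leader-election noise rather than the root cause.
kubectl logs -n knative-serving deploy/autoscaler --tail=100
# Inspect the contested lease directly.
kubectl get lease autoscaler-bucket-00-of-01 -n knative-serving -o yaml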