kserve: kfserving 0.5.0-rc2 standalone - segmentation violation
/kind bug
What steps did you take and what happened:
- Deploy kfserving-0.5.0-rc2 standalone
- Check that the controller is started and running
NAME READY STATUS
kfserving-controller-manager-0 2/2 Running
- Create a namespace for sklearn
kubectl create namespace myns
- Create the InferenceService resource, changing the namespace to match myns (see the manifest sketch below)
- The InferenceService never goes Ready
- Check the controller-manager pod status: it is cycling between Error and CrashLoopBackOff
NAME READY STATUS
kfserving-controller-manager-0 1/2 CrashLoopBackOff
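For reference, a minimal sketch of the InferenceService manifest being applied, reconstructed from the controller log entries further below (the name, apiVersion, storageUri, and protocolVersion come from those log lines; the exact sample file used may differ):
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
  namespace: myns
spec:
  predictor:
    sklearn:
      protocolVersion: v2
      storageUri: gs://seldon-models/sklearn/iris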
Logs from the controller-manager pod:
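The logs below come from the manager container. Assuming the kfserving-system namespace and the container names shown in the pod description further below, they can be retrieved with a command like:
kubectl logs kfserving-controller-manager-0 -n kfserving-system -c manager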
2021/02/02 15:20:54 http: panic serving 172.17.0.1:21148: runtime error: invalid memory address or nil pointer dereference
goroutine 346 [running]:
net/http.(*conn).serve.func1(0xc00079e320)
/usr/local/go/src/net/http/server.go:1772 +0x139
panic(0x1759f60, 0x28a7210)
/usr/local/go/src/runtime/panic.go:975 +0x3e3
github.com/kubeflow/kfserving/pkg/apis/serving/v1alpha2.(*InferenceService).ConvertFrom(0xc000610000, 0x1bccf60, 0xc00042fc00, 0x1bccf60, 0xc00042fc00)
/go/src/github.com/kubeflow/kfserving/pkg/apis/serving/v1alpha2/inferenceservice_conversion.go:219 +0x1183
sigs.k8s.io/controller-runtime/pkg/webhook/conversion.(*Webhook).convertObject(0xc0009557f0, 0x1baa0e0, 0xc00042fc00, 0x1baa020, 0xc000610000, 0x1baa020, 0xc000610000)
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/webhook/conversion/conversion.go:142 +0x7bc
sigs.k8s.io/controller-runtime/pkg/webhook/conversion.(*Webhook).handleConvertRequest(0xc0009557f0, 0xc0003bc200, 0xc0009e83c0, 0x0, 0x0)
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/webhook/conversion/conversion.go:107 +0x1f8
sigs.k8s.io/controller-runtime/pkg/webhook/conversion.(*Webhook).ServeHTTP(0xc0009557f0, 0x7f315b7e3530, 0xc0000a0050, 0xc000892200)
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/webhook/conversion/conversion.go:74 +0x10b
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerInFlight.func1(0x7f315b7e3530, 0xc0000a0050, 0xc000892200)
/go/pkg/mod/github.com/prometheus/client_golang@v1.6.0/prometheus/promhttp/instrument_server.go:40 +0xab
net/http.HandlerFunc.ServeHTTP(0xc000513320, 0x7f315b7e3530, 0xc0000a0050, 0xc000892200)
/usr/local/go/src/net/http/server.go:2012 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1(0x1bdb560, 0xc000130000, 0xc000892200)
/go/pkg/mod/github.com/prometheus/client_golang@v1.6.0/prometheus/promhttp/instrument_server.go:100 +0xda
net/http.HandlerFunc.ServeHTTP(0xc0005134d0, 0x1bdb560, 0xc000130000, 0xc000892200)
/usr/local/go/src/net/http/server.go:2012 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2(0x1bdb560, 0xc000130000, 0xc000892200)
/go/pkg/mod/github.com/prometheus/client_golang@v1.6.0/prometheus/promhttp/instrument_server.go:76 +0xb2
net/http.HandlerFunc.ServeHTTP(0xc0005135c0, 0x1bdb560, 0xc000130000, 0xc000892200)
/usr/local/go/src/net/http/server.go:2012 +0x44
net/http.(*ServeMux).ServeHTTP(0xc0007bfd80, 0x1bdb560, 0xc000130000, 0xc000892200)
/usr/local/go/src/net/http/server.go:2387 +0x1a5
net/http.serverHandler.ServeHTTP(0xc0004ec0e0, 0x1bdb560, 0xc000130000, 0xc000892200)
/usr/local/go/src/net/http/server.go:2807 +0xa3
net/http.(*conn).serve(0xc00079e320, 0x1bdf560, 0xc0003bc040)
/usr/local/go/src/net/http/server.go:1895 +0x86c
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:2933 +0x35c
{"level":"info","ts":1612279254.1017942,"logger":"v1beta1Controllers.InferenceService","msg":"Reconciling inference service","apiVersion":"serving.kubeflow.org/v1beta1","isvc":"sklearn-iris"}
{"level":"info","ts":1612279254.202461,"logger":"PredictorReconciler","msg":"Reconciling Predictor","PredictorSpec":{"sklearn":{"storageUri":"gs://seldon-models/sklearn/iris","protocolVersion":"v2","name":"","resources":{}}}}
E0202 15:20:54.202734 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 251 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1759f60, 0x28a7210)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/runtime/runtime.go:48 +0x82
panic(0x1759f60, 0x28a7210)
/usr/local/go/src/runtime/panic.go:969 +0x166
github.com/kubeflow/kfserving/pkg/apis/serving/v1beta1.(*SKLearnSpec).getContainerV2(0xc00021a9a0, 0xc000a0c080, 0xc, 0x0, 0x0, 0xc000a0c09a, 0x6, 0xc0005668a0, 0x53, 0xc000a28450, ...)
/go/src/github.com/kubeflow/kfserving/pkg/apis/serving/v1beta1/predictor_sklearn.go:118 +0x44c
github.com/kubeflow/kfserving/pkg/apis/serving/v1beta1.(*SKLearnSpec).GetContainer(0xc00021a9a0, 0xc000a0c080, 0xc, 0x0, 0x0, 0xc000a0c09a, 0x6, 0xc0005668a0, 0x53, 0xc000a28450, ...)
/go/src/github.com/kubeflow/kfserving/pkg/apis/serving/v1beta1/predictor_sklearn.go:71 +0x120
github.com/kubeflow/kfserving/pkg/controller/v1beta1/inferenceservice/components.(*Predictor).Reconcile(0xc0003bcb80, 0xc0004e3800, 0x13, 0x1beab80)
/go/src/github.com/kubeflow/kfserving/pkg/controller/v1beta1/inferenceservice/components/predictor.go:84 +0x4fe
github.com/kubeflow/kfserving/pkg/controller/v1beta1/inferenceservice.(*InferenceServiceReconciler).Reconcile(0xc0007bfa00, 0xc0003a95ea, 0x6, 0xc0003a95c0, 0xc, 0xc0006ce2d0, 0xc0004b61b8, 0xc0004b61b0, 0xc0004b61b0)
/go/src/github.com/kubeflow/kfserving/pkg/controller/v1beta1/inferenceservice/controller.go:132 +0x45a
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00095f950, 0x17bc5c0, 0xc0009fe4e0, 0xc000583e00)
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/internal/controller/controller.go:233 +0x161
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00095f950, 0x203000)
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/internal/controller/controller.go:209 +0xae
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc00095f950)
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/internal/controller/controller.go:188 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00061c0b0)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00061c0b0, 0x1ba0880, 0xc0006721e0, 0x1, 0xc000044f60)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00061c0b0, 0x3b9aca00, 0x0, 0x1a11201, 0xc000044f60)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc00061c0b0, 0x3b9aca00, 0xc000044f60)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/internal/controller/controller.go:170 +0x411
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x13f094c]
goroutine 251 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/runtime/runtime.go:55 +0x105
panic(0x1759f60, 0x28a7210)
/usr/local/go/src/runtime/panic.go:969 +0x166
github.com/kubeflow/kfserving/pkg/apis/serving/v1beta1.(*SKLearnSpec).getContainerV2(0xc00021a9a0, 0xc000a0c080, 0xc, 0x0, 0x0, 0xc000a0c09a, 0x6, 0xc0005668a0, 0x53, 0xc000a28450, ...)
/go/src/github.com/kubeflow/kfserving/pkg/apis/serving/v1beta1/predictor_sklearn.go:118 +0x44c
github.com/kubeflow/kfserving/pkg/apis/serving/v1beta1.(*SKLearnSpec).GetContainer(0xc00021a9a0, 0xc000a0c080, 0xc, 0x0, 0x0, 0xc000a0c09a, 0x6, 0xc0005668a0, 0x53, 0xc000a28450, ...)
/go/src/github.com/kubeflow/kfserving/pkg/apis/serving/v1beta1/predictor_sklearn.go:71 +0x120
github.com/kubeflow/kfserving/pkg/controller/v1beta1/inferenceservice/components.(*Predictor).Reconcile(0xc0003bcb80, 0xc0004e3800, 0x13, 0x1beab80)
/go/src/github.com/kubeflow/kfserving/pkg/controller/v1beta1/inferenceservice/components/predictor.go:84 +0x4fe
github.com/kubeflow/kfserving/pkg/controller/v1beta1/inferenceservice.(*InferenceServiceReconciler).Reconcile(0xc0007bfa00, 0xc0003a95ea, 0x6, 0xc0003a95c0, 0xc, 0xc0006ce2d0, 0xc0004b61b8, 0xc0004b61b0, 0xc0004b61b0)
/go/src/github.com/kubeflow/kfserving/pkg/controller/v1beta1/inferenceservice/controller.go:132 +0x45a
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00095f950, 0x17bc5c0, 0xc0009fe4e0, 0xc000583e00)
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/internal/controller/controller.go:233 +0x161
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00095f950, 0x203000)
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/internal/controller/controller.go:209 +0xae
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc00095f950)
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/internal/controller/controller.go:188 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00061c0b0)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00061c0b0, 0x1ba0880, 0xc0006721e0, 0x1, 0xc000044f60)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00061c0b0, 0x3b9aca00, 0x0, 0x1a11201, 0xc000044f60)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc00061c0b0, 0x3b9aca00, 0xc000044f60)
/go/pkg/mod/k8s.io/apimachinery@v0.18.8/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/pkg/mod/github.com/zchee/sigs.k8s-controller-runtime@v0.6.1-0.20200623114430-46812d3a0a50/pkg/internal/controller/controller.go:170 +0x411
- Deleting the InferenceService sklearn-iris brings the controller back to Running.
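For example, assuming the resource name and namespace used above:
kubectl delete inferenceservice sklearn-iris -n myns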
What did you expect to happen:
Submitting the sklearn sample with 0.5.0-rc2 should succeed and should not crash the controller-manager pod.
Anything else you would like to add:
kubectl describe output for the controller-manager pod:
Name: kfserving-controller-manager-0
Namespace: kfserving-system
Priority: 0
Node: minikube/192.168.49.2
Start Time: Tue, 02 Feb 2021 09:57:28 -0500
Labels: control-plane=kfserving-controller-manager
controller-revision-hash=kfserving-controller-manager-855f55cffb
controller-tools.k8s.io=1.0
statefulset.kubernetes.io/pod-name=kfserving-controller-manager-0
Annotations: <none>
Status: Running
IP: 172.17.0.4
IPs:
IP: 172.17.0.4
Controlled By: StatefulSet/kfserving-controller-manager
Containers:
kube-rbac-proxy:
Container ID: docker://2f9c84cd1de7c419dc064aeb5372752ec5b905d52355b926cac9755f0116bc64
Image: gcr.io/kubebuilder/kube-rbac-proxy:v0.4.0
Image ID: docker-pullable://gcr.io/kubebuilder/kube-rbac-proxy@sha256:297896d96b827bbcb1abd696da1b2d81cab88359ac34cce0e8281f266b4e08de
Port: 8443/TCP
Host Port: 0/TCP
Args:
--secure-listen-address=0.0.0.0:8443
--upstream=http://127.0.0.1:8080/
--logtostderr=true
--v=10
State: Running
Started: Tue, 02 Feb 2021 09:57:29 -0500
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-tlwwp (ro)
manager:
Container ID: docker://1741ffc3299df8eea03f949f73f4b9095727bb408ba64e12f41d57cc415ed8c7
Image: gcr.io/kfserving/kfserving-controller:v0.5.0-rc2
Image ID: docker-pullable://gcr.io/kfserving/kfserving-controller@sha256:7fecc482c6d2bf7edb68d7ef56eac6f2b9bb359ce2667d6454e47024c340afcf
Port: 9443/TCP
Host Port: 0/TCP
Command:
/manager
Args:
--metrics-addr=127.0.0.1:8080
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Tue, 02 Feb 2021 10:19:42 -0500
Finished: Tue, 02 Feb 2021 10:20:54 -0500
Ready: False
Restart Count: 9
Limits:
cpu: 100m
memory: 300Mi
Requests:
cpu: 100m
memory: 200Mi
Environment:
POD_NAMESPACE: kfserving-system (v1:metadata.namespace)
SECRET_NAME: kfserving-webhook-server-cert
Mounts:
/tmp/k8s-webhook-server/serving-certs from cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-tlwwp (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
cert:
Type: Secret (a volume populated by a Secret)
SecretName: kfserving-webhook-server-cert
Optional: false
default-token-tlwwp:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-tlwwp
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned kfserving-system/kfserving-controller-manager-0 to minikube
Normal Pulled 25m kubelet, minikube Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.4.0" already present on machine
Normal Created 25m kubelet, minikube Created container kube-rbac-proxy
Normal Started 25m kubelet, minikube Started container kube-rbac-proxy
Normal Pulling 23m (x4 over 25m) kubelet, minikube Pulling image "gcr.io/kfserving/kfserving-controller:v0.5.0-rc2"
Normal Pulled 23m (x4 over 25m) kubelet, minikube Successfully pulled image "gcr.io/kfserving/kfserving-controller:v0.5.0-rc2"
Normal Created 23m (x4 over 25m) kubelet, minikube Created container manager
Normal Started 23m (x4 over 25m) kubelet, minikube Started container manager
Warning BackOff 5m7s (x88 over 24m) kubelet, minikube Back-off restarting failed container
Environment:
- Istio Version: 1.7.6
- Knative Version: 0.20.0
- KFServing Version: 0.5.0-rc2
- Kubeflow version: N/A
- Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
- Minikube version: 1.16.0
- Kubernetes version (use kubectl version): 1.17.4
- OS (e.g. from /etc/os-release):
@adriangonz Can you help take a look?