ray: [KubeRay][Dashboard] Port-forwarding dashboard fails.
What happened + What you expected to happen
Port-forwarding the Ray dashboard fails with KubeRay in at least some Kubernetes setups. I’ve observed the problem on GKE but not on a local KinD cluster.
Start a KubeRay Ray Cluster and attempt to port-forward the Ray head’s dashboard port:
kubectl -n ray port-forward service/example-cluster-ray-head 8265:8265
Forwarding from 127.0.0.1:8265 -> 8265
Forwarding from [::1]:8265 -> 8265
Handling connection for 8265
E0317 22:21:58.501309 135973 portforward.go:406] an error occurred forwarding 8265 -> 8265: error forwarding port 8265 to pod 755dce9c462676627f07602de97f7bf9c52ab727336cda0d7d02f0b566c5d292, uid : failed to execute portforward in network namespace "/var/run/netns/cni-76ec8b4c-cc4c-49cb-aa89-f87ec7f5f9e4": failed to dial 8265: dial tcp4 127.0.0.1:8265: connect: connection refused
E0317 22:21:58.501671 135973 portforward.go:234] lost connection to pod
Context: https://discuss.ray.io/t/ray-job-submit-errors-on-kubernetes/5449/4
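The `connection refused` on `127.0.0.1:8265` in the log above suggests that nothing is listening on the dashboard port inside the head pod's network namespace, rather than a problem with `kubectl port-forward` itself. A minimal sketch for checking this from inside the head pod follows. It uses only the Python standard library; the function name `port_is_listening` is hypothetical, and the defaults (host `127.0.0.1`, port `8265`) are taken from the error message, not from any Ray API.

```python
import socket


def port_is_listening(host: str = "127.0.0.1", port: int = 8265,
                      timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for debugging; not part of Ray or KubeRay.
    """
    try:
        # create_connection raises OSError (including ConnectionRefusedError
        # and timeouts) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    status = "listening" if port_is_listening() else "not listening"
    print(f"127.0.0.1:8265 is {status}")
```

If this reports "not listening" when run inside the head pod (for example through `kubectl exec`), the dashboard process is probably not running, and port-forwarding cannot succeed regardless of the cluster's networking setup.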
Versions / Dependencies
Ray master, KubeRay master, GKE and potentially other cloud providers.
Reproduction script
Start a KubeRay Ray Cluster and attempt to port-forward the Ray head’s dashboard port:
kubectl -n ray port-forward service/example-cluster-ray-head 8265:8265
Issue Severity
High: It blocks me from completing my task.
About this issue
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 19 (13 by maintainers)
Yep, this helped us solve the issue on our k8s deployment. In requirements.txt we were only including the link to the whl, without the "ray[default] @" prefix in front of it.
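To illustrate the difference the comment describes: a bare wheel URL in requirements.txt installs Ray without any extras, while the `name[extras] @ url` form (the PEP 508 direct-reference syntax) also requests the `default` extra, which is what pulls in the dashboard's dependencies. The sketch below is a rough check for a single requirements line. The function name `requests_ray_default_extra` and the `example.com` wheel URLs are placeholders, not the actual wheel location, and the parsing is deliberately simplified (it ignores comments, environment markers, and pip options).

```python
import re

# Simplified match for the start of a requirement line: a project name
# optionally followed by "[extra1,extra2]". This is an approximation of
# PEP 508, good enough for illustration only.
_NAME_AND_EXTRAS = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9._-]+)\s*(?:\[(?P<extras>[^\]]*)\])?"
)


def requests_ray_default_extra(line: str) -> bool:
    """Return True if the line names the `ray` project with the `default` extra.

    Hypothetical helper; a bare wheel URL returns False because it carries
    no project name or extras.
    """
    match = _NAME_AND_EXTRAS.match(line)
    if match is None or match.group("name").lower() != "ray":
        return False
    extras = {e.strip().lower() for e in (match.group("extras") or "").split(",")}
    return "default" in extras


if __name__ == "__main__":
    # Placeholder URLs, not real wheel locations.
    candidates = [
        "https://example.com/ray-0.0.0-py3-none-any.whl",
        "ray[default] @ https://example.com/ray-0.0.0-py3-none-any.whl",
    ]
    for candidate in candidates:
        print(requests_ray_default_extra(candidate), candidate)
```

For real projects, the third-party `packaging` library offers a full PEP 508 parser; the regex here is only meant to make the two forms easy to tell apart.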