kubernetes: Pod details missing in the `Allocate` DPAPI
Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature
What happened:
Pod details are not passed in the `Allocate` RPC, and because of this network device plugins are not supported.
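To make the gap concrete, here is a minimal sketch in Go. The types are simplified stand-ins, not the generated DPAPI types: `AllocateRequest` models a request that carries device IDs only, while `ExtendedAllocateRequest`, its `PodUID` field, and the `bindDevices` helper are hypothetical names used purely for illustration. The point is that a network device plugin has nothing to key its per-pod setup on unless some pod identity arrives with the request.

```go
package main

import (
	"errors"
	"fmt"
)

// AllocateRequest is a simplified stand-in for the device plugin Allocate
// request: it carries device IDs only and says nothing about the pod.
type AllocateRequest struct {
	DevicesIDs []string
}

// ExtendedAllocateRequest is HYPOTHETICAL. It adds the pod identity that
// this issue asks for; PodUID is not a field of the real API.
type ExtendedAllocateRequest struct {
	AllocateRequest
	PodUID string
}

// bindDevices records which pod owns each allocated device. Without a pod
// UID the plugin cannot make that association, so it returns an error.
func bindDevices(owners map[string]string, req ExtendedAllocateRequest) error {
	if req.PodUID == "" {
		return errors.New("no pod UID in request: cannot associate devices with a pod")
	}
	for _, id := range req.DevicesIDs {
		owners[id] = req.PodUID
	}
	return nil
}

func main() {
	owners := map[string]string{}

	// Request shaped like today's API: device IDs only.
	bare := ExtendedAllocateRequest{AllocateRequest: AllocateRequest{DevicesIDs: []string{"vf0"}}}
	fmt.Println("without pod UID:", bindDevices(owners, bare))

	// Request shaped like the proposed extension.
	withPod := bare
	withPod.PodUID = "pod-1234"
	fmt.Println("with pod UID:", bindDevices(owners, withPod), owners)
}
```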
Since when has support for network device plugins been discussed? Right from the very beginning: Infiniband/RDMA is mentioned in the very first use case of the device plugin proposal.
When did we first notice this gap and raise a request? (~1.8) While reviewing the very first PR that added device plugin support, we raised this gap in review comments. Response that we got: the request was not incorporated because of limited time before code freeze, and we were advised to discuss it further in 1.9.
What did we do next to explain the requirement? (~1.9)
We then tried to highlight the need with write-ups, a Solarflare device plugin PoC, and by opening a PR.
Response that we got: CNI/sig-network should handle networking, and the maintainers are open to extending the DPAPI once there is a good user story that really proves the need for the extension. sig-network should also be involved in this.
What did we do to prove the use case and involve sig-network? (~1.10)
We got involved with the network plumbing working group. This group has been working on a multi-network solution, the Kubernetes Network Custom Resource Definition De-facto Standard. After some analysis we realized that device plugins are a perfect complement to this CRD-based de-facto standard for certain very important use cases like SR-IOV and RDMA (Infiniband). We prepared a proposal with this idea and shared it with sig-network folks, who supported it. With Tim we discussed the sources of complexity in the solution:
- Unix socket involved between CNI and DP
  - This was added to the proposal because CNI expects a CNI plugin to be a static binary. All agreed to explore gRPC-based communication between CNI and the plugin.
- Annotations involved
  - Annotations were suggested because of strong push-back against passing the pod UID.

Meanwhile, during all this time, there have been demands to pass details to the DP from non-networking use cases as well, for example FPGA use cases and the monitoring proposal.

Tim said that he will discuss with Vish to understand his concerns about passing the pod UID and to see whether it is possible to pass it. We are still waiting for his feedback on this.
Through this issue we want to better understand the concerns about passing the pod UID to the device plugin.
@kubernetes/sig-network-api-reviews @kubernetes/sig-network-bugs @kubernetes/sig-network-feature-requests @thockin @vishh @jiayingz @jeremyeder @dcbw @bowei @derekwaynecarr @dchen1107
/sig node /sig network /area hw-accelerators
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 2
- Comments: 17 (10 by maintainers)
Just to join the list of supporters: we would also be very interested in this API enhancement (Nokia).
I think CNI definitely has its place in the ecosystem. However, we also need to acknowledge that the project currently does not account for userspace networking. Network resources are also basically the only remaining resources that are not taken into account when making scheduling decisions.
A Device Plugin based approach would solve both of these issues! Workloads (in our case real, already existing NFVI radio applications) requiring special, finite, high-performance networking solutions cannot be fully satisfied by the CNI approach. They need all of their resources aligned to the same NUMA node (networks included), and they need to know if the resources are even available on a node with said constraints.
When a Pod requires “normal”, infinitely available, kernel-space networking, CNI can take care of those needs by itself. When a Pod requires “special”, finite resources, possibly to set up networking in its own space, CNI needs help. Either we provide this help by totally redoing the CNI project, or we simply supplement it with something else -> for example with Device Plugins.
I think everyone agrees that the latter approach is more feasible 😃
Thank you @vikaschoudhary16 for opening this issue. We would welcome this feature as it allows us to get closer to network device plugins without hacks. Right now we obtain pod UID using the checkpoint file hack and even though it works, it feels dangerous and is not the most elegant solution.
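For readers unfamiliar with the checkpoint workaround, the following Go sketch shows the general idea: read the kubelet's device manager checkpoint and look up which pod a device was handed to. Everything about the file layout here is an assumption. The field names (`Data`, `PodDeviceEntries`, `PodUID`, `ResourceName`, `DeviceIDs`) and the flat `DeviceIDs` list reflect one older layout of this kubelet-internal file and are not a stable API, the sample JSON is made up, and `podUIDForDevice` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// podDevicesEntry models one allocation record in the checkpoint.
// ASSUMPTION: these field names and the flat DeviceIDs list follow an
// older checkpoint layout; the file is kubelet-internal and may differ.
type podDevicesEntry struct {
	PodUID        string
	ContainerName string
	ResourceName  string
	DeviceIDs     []string
}

// checkpoint models only the part of the file this sketch needs.
type checkpoint struct {
	Data struct {
		PodDeviceEntries []podDevicesEntry
	}
}

// podUIDForDevice (hypothetical helper) returns the UID of the pod that
// the given device of the given resource was allocated to.
func podUIDForDevice(raw []byte, resource, deviceID string) (string, error) {
	var cp checkpoint
	if err := json.Unmarshal(raw, &cp); err != nil {
		return "", fmt.Errorf("parsing checkpoint: %w", err)
	}
	for _, e := range cp.Data.PodDeviceEntries {
		if e.ResourceName != resource {
			continue
		}
		for _, id := range e.DeviceIDs {
			if id == deviceID {
				return e.PodUID, nil
			}
		}
	}
	return "", errors.New("device not found in checkpoint")
}

func main() {
	// Made-up sample content standing in for the real checkpoint file.
	sample := []byte(`{"Data":{"PodDeviceEntries":[{"PodUID":"pod-1234",
		"ContainerName":"app","ResourceName":"example.com/sriov",
		"DeviceIDs":["vf0","vf1"]}]},"Checksum":0}`)

	uid, err := podUIDForDevice(sample, "example.com/sriov", "vf1")
	fmt.Println(uid, err)
}
```

A lookup like this depends on an internal file whose format can change between releases, which is part of why the workaround feels fragile compared with receiving the pod UID in the request itself.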
Is there any progress in discussions outside this issue?