Dpu support
DPU support¶
With the emergence of Data Processing Units (DPUs), NIC vendors can now offer greater hardware acceleration capability, flexibility and security.
It is desirable to leverage DPU in OVN-kubernetes to accelerate networking and secure the network control plane.
A DPU consists of: - Industry-standard, high-performance, software-programmable multi-core CPU - High-performance network interface - Flexible and programmable acceleration engines
Similarly to Smart-NICs, a DPU follows the kernel switchdev model. In this model, every VF/PF net-device on the host has a corresponding representor net-device existing on the embedded CPU.
Any vendor that manufactures a DPU which supports the above model should work with current design.
Design document can be found here.
OVN-Kubernetes in a DPU-Accelerated Environment¶
The ovn-kubernetes deployment will have two parts one on the host and another on the DPU side.
These aforementioned parts are expected to be deployed also on two different Kubernetes clusters, one for the host and another for the DPUs.
Always check the dependencies on the Requirements page
Host Cluster¶
OVN-Kubernetes control plane related component¶
- ovn-cluster-manager
OVN-Kubernetes components on a Standard Host (Non-DPU)¶
- local-nb-ovsdb
- local-sb-ovsdb
- run-ovn-northd
- ovnkube-controller-with-node
- ovn-controller
- ovs-metrics
OVN-Kubernetes component on a DPU-Enabled Host¶
- ovn-node
For detailed configuration of gateway interfaces in DPU host mode, see DPU Gateway Interface Configuration.
DPU Cluster¶
OVN-Kubernetes components¶
- local-nb-ovsdb
- local-sb-ovsdb
- run-ovn-northd
- ovnkube-controller-with-node
- ovn-controller
- ovs-metrics
DPU health monitoring¶
OVN-Kubernetes uses a custom Kubernetes Lease in the ovn-kubernetes namespace to track the health of the DPU side of a trusted deployment.
The DPU host creates the lease and sets an owner reference to the Kubernetes Node, while ovnkube running on the DPU renews the lease on a regular interval.
Two ovnkube-node options control this behavior:
- --dpu-node-lease-renew-interval (seconds, default 10). Set to 0 to disable the health check.
- --dpu-node-lease-duration (seconds, default 40).
If the lease expires, the DPU host CNI server fails ADD requests immediately with DPU Not Ready and the STATUS command returns a CNI error with code 50 (The plugin is not available).
This causes the container runtime to report NetworkReady=false, preventing new workloads from landing on the affected host until the DPU becomes healthy again.