* sfnettest example now uses latency profile by default rather than recommended
* Removed trailing slash on `$REGISTRY_BASE` causing invalid duplicate slashes
* Added built-in syntax validation to user-modifiable commands
* Documented containers of Onload Device Plugin pod
* Updated README to reflect recent `setPreload` changes
* Symlinks to existing docs in `docs/`
DEVELOPING.md (8 additions & 7 deletions)
@@ -13,38 +13,39 @@ and [Kubebuilder](https://kubebuilder.io/).

 The Onload Operator and Onload Device Plugin consume Onload container images (`onload-user` and either `onload-source` or `onload-module`). You may wish to pre-populate your cluster's container image registry, either with the [official images provided](README.md#provided-images) or [your own builds](README.md#build).

-## Build and deploy Onload Operator from source
+## Build and deploy from source

 Configure a development registry and configure cluster for [insecure registries](README.md#insecure-registries)
-if required. Specify the base of the following images:
+if required. Specify the following image locations:
> Replacing `kubectl apply` with `kubectl kustomize` will output a complete YAML manifest file which can be copied to a
+> network that does not have access to this repository.
+
 ### Onload Device Plugin

 The Onload Device Plugin implements the [Kubernetes Device Plugin API](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/)
 to expose a [Kubernetes Resource](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/)
 named `amd.com/onload`.

-It is distributed as the container image `onload-device-plugin` and is deployed and configured entirely by
-the Onload Operator. Its image location is configured as an environment variable within the Onload Operator deployment
-([see above](#local-onload-operator-images-in-restricted-networks)) and its ImagePullPolicy as part of
-[Onload Custom Resource (CR)](#onload-custom-resource-cr) along with its other customisation properties.
+It is distributed as the container image `onload-device-plugin`. The image location is configured as an environment
+variable within the Onload Operator deployment ([see above](#local-onload-operator-images-in-restricted-networks)) and
+its ImagePullPolicy as part of [Onload Custom Resource (CR)](#onload-custom-resource-cr), along with its other
+customisation properties.
+
+The Onload Operator manages an Onload Device Plugin DaemonSet which deploys, to each node selected for acceleration,
+a pod consisting of 3 containers:
+
+* Init (`init` container, `onload-user` image)
+-- for copying Onload files to host filesystem and Onload Worker volume.
+-- for Kubernetes Device Plugin API; privileged access to Kubernetes API.

 ### Onload Custom Resource (CR)

@@ -270,7 +286,8 @@ spec:
 amd.com/onload: 1
 ```

-All applications started within the pod environment will be accelerated due to the `LD_PRELOAD` environment variable.
+All applications started within the pod environment will be accelerated due to the `LD_PRELOAD` environment variable
+unless `setPreload: false` is configured in Onload CR.

 ### Resource `amd.com/onload`

@@ -298,6 +315,17 @@ Binary mounts (if `mountOnload` is true, by default in `/opt/onload/usr/bin/`)
 If you wish to customise where files are mounted in the container's filesystem this can be configured with the fields
 of `spec.devicePlugin` in an Onload CR.

+> [!IMPORTANT]
+> Kubernetes Device Plugin only affects initial pod scheduling
+>
+> Kubernetes Device Plugin is designed to configure pods once only, at creation time. If the Onload CR is re-applied to
+> the cluster with settings that would change pod environment -- for example, changing the value of `setPreload` --
+> then running pods must be recreated before using these changes.
+>
+> Additionally, Kubernetes does not evict pods when node resources are removed; pods do not automatically have a formal
+> dependency on Onload Device Plugin or Onload Module. This has the advantage that minor Onload Operator behaviour
+> does not affect the workloads its components pre-configured.
+
 ### Example client-server with sfnettest

 Please see [config/samples/sfnettest](config/samples/sfnettest).
@@ -348,6 +376,10 @@ Currently the script produces ConfigMaps with a fixed naming structure,
 for example if you want to create a ConfigMap from a profile called
 `name.opf` the generated name will be `onload-name-profile`.

+## Troubleshooting
+
+Please see dedicated [troubleshooting guide](docs/troubleshooting.md).
+
 ## Build

 ### Onload Module pre-built images
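
The fixed ConfigMap naming structure mentioned in the hunk context above (`name.opf` becomes `onload-name-profile`) can be sketched as a small helper. This is only an illustration of the stated naming rule, not the script's actual implementation: the function name `configmap_name_for_profile` is hypothetical, and stripping any leading directory from the path is an assumption.

```python
from pathlib import PurePosixPath


def configmap_name_for_profile(profile_path: str) -> str:
    """Return the ConfigMap name for an Onload profile file.

    Hypothetical helper illustrating the documented rule:
    `name.opf` -> `onload-name-profile`.
    """
    # Drop any directory part (an assumption) and the final extension.
    stem = PurePosixPath(profile_path).stem
    return f"onload-{stem}-profile"


if __name__ == "__main__":
    print(configmap_name_for_profile("name.opf"))  # onload-name-profile
```

Because the name is derived only from the profile file name, two profiles with the same base name in different directories would map to the same ConfigMap name under this sketch.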
@@ -373,6 +405,9 @@ Please see [DEVELOPING](DEVELOPING.md) documentation.
 Developing Onload Operator does not require building these images as official images are available.

 If you wish to build these images, please follow ['Distributing as container image' in Onload repository's DEVELOPING](https://github.com/Xilinx-CNS/onload/blob/master/DEVELOPING.md#distributing-as-container-image).
+This includes building debug versions. All Onload images in use must be consistent, in exact commit and build
+parameters. For example, a debug build of `onload-user` must be used with a debug build of `onload-module`. Build
+parameter specification is provided in the sample Onload CRs for the in-cluster build method.
-Here, we run a small utility, [`sfnt-pingpong` from sfnettest](https://github.com/Xilinx-CNS/cns-sfnettest), in a client and server pair to demonstrate Onload acceleration.
+Here, we run a small utility, [`sfnt-pingpong` from sfnettest](https://github.com/Xilinx-CNS/cns-sfnettest), in a
+client and server pair to demonstrate Onload acceleration.
+
+The sfnettest image is solely focused on its performance utils and thus has a micro shell environment; in-depth network
+inspection should be performed using dedicated software or Onload tools available on the node's host filesystem.

 ## Deploy

-The example will require customisation to your environment. By default, this will deploy two pods running on nodes named `compute-0` and `compute-1`:
+The following example [client-server.yaml](client-server.yaml) will require customisation to your environment. The
+manifest utilises two separately deployed resources which are recommended as part of a full Onload Operator deployment:
+
+* A [Multus network](../../../docs/nad.md)
+-- connects the pods to hardware that supports acceleration.
+* An [Onload profile](../../../README.md#using-onload-profiles)
+-- sets environment variables for the pod which are then consumed by userland Onload running in the container(s).