Synopsys NPX VPX Linux Kernel Drivers Porting Guide
You can use the Buildroot environment to build the required toolchain and Linux system image for the supported platform. These instructions are written based on an ARC HS host processor, with specific instructions for adapting to an ARM host where required.
Linux host environment (desktop or server):
- Linux Ubuntu 20.04 distro or equivalent
Information about packages necessary for Buildroot is available at: https://buildroot.org/downloads/manual/manual.html#requirement
To setup the Buildroot environment, download the Buildroot sources and switch to the stable 2023.11 tag:
git clone https://github.com/buildroot/buildroot.git
cd ./buildroot
git checkout 2023.11
The Synopsys NPX and VPX Linux drivers are available as part of the MetaWare MX toolchain in the form of diff patches for the Linux 5.15 and 6.6 kernel source trees, as well as from the Linux kernel source repository on GitHub:
https://github.com/foss-for-synopsys-dwc-arc-processors/snps-accel-linux
To get the Linux kernel source and apply the patch with the drivers:
git clone https://github.com/torvalds/linux.git
cd linux
git checkout v6.6
cp ~/0001-snps-accel-6.6.patch ./
patch -p1 -i 0001-snps-accel-6.6.patch
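To verify that the patch applied cleanly, check that the driver directories now exist (paths as in the source-tree layout shown below):
ls drivers/misc/snps_accel drivers/remoteproc/snps_accel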
To get the Linux kernel tree with the drivers:
git clone https://github.com/foss-for-synopsys-dwc-arc-processors/snps-accel-linux.git
The default branch contains only a README file with brief instructions. Select the required branch manually to access a specific kernel source tree with the NPX and VPX drivers.
Stable branches:
- snps_accel-v5.15 - the Linux 5.15 kernel source tree with the drivers
- snps_accel-v6.6 - the Linux 6.6 kernel source tree with the drivers
Switch to the required branch:
git switch snps_accel-v5.15
The NPX and VPX drivers' location in the Linux kernel source tree is shown below.
[Linux kernel source tree top directory]
|-- drivers
|   |-- misc
|   |   |-- snps_accel            -> ARCSync and accelerator drivers source dir
|   |       |-- Kconfig
|   |       |-- Makefile
|   |       |-- snps_accel_drv.c  -> accel helper driver API
|   |       |-- snps_accel_drv.h  -> accel helper driver internal structures
|   |       |-- snps_accel_mem.c  -> accel helper driver memory management
|   |       |-- snps_accel_mem.h  -> accel helper driver mm internal header
|   |       |-- snps_arcsync.c    -> ARCSync control driver
|   |-- remoteproc
|       |-- snps_accel
|           |-- accel_rproc.c     -> VPX/NPX driver in the remoteproc framework
|           |-- accel_rproc.h     -> VPX/NPX rproc internal header
|           |-- npx_config.c      -> NPX Cluster Network setup
|-- include
|   |-- linux
|   |   |-- snps_arcsync.h        -> ARCSync driver header to share with drivers
|   |-- uapi
|       |-- misc
|           |-- snps_accel.h      -> accel helper driver header for user-space applications
|-- arch
    |-- arc
        |-- configs
        |   |-- haps_hs_npp_defconfig -> Example of the kernel defconfig for the NPU prototyping platform with ARC host (NPP)
        |-- boot
            |-- dts
                |-- zebu_hs_npp.dts         -> Example of the DTS file for the NPP running on ZeBu
                |-- haps_hs_npp.dts         -> Example of the DTS file for the NPP running on HAPS
                |-- haps_hs_npx6_8k_vpx.dts -> Example of an extended configuration for the NPP on HAPS
The NPX and VPX driver stack implements several drivers to cover the requirements of a user-space runtime, such as the Synopsys NN Runtime for NPX, for privileged-function access. It includes the following drivers:
- snps_accel_rproc - a driver in the remoteproc framework for setting up VPX and NPX processors, and for uploading and starting processor firmware. It enables use of the common remoteproc controls in /sys/class/remoteproc.
- snps_arcsync - a platform driver for managing the instances of ARCSync IP present in a given SoC. It provides functions to control reset/start/stop/power_on/power_off for NPX and VPX processors and lets you receive notifications by means of ARCSync interrupts to the host. This driver doesn't provide an API for user-space clients.
- snps_accel_app - a platform driver for facilitating user-space runtime access to kernel-space objects such as the accelerator shared-memory region, ARCSync MMIO, notification interrupts, and DMA buffers.
To enable the accelerator drivers, run 'make menuconfig' and select the options shown below. The options appear only if the patch with the drivers is correctly applied.
SNPS_ACCEL_APP=y
SNPS_ACCEL_RPROC=y
It is possible to build the drivers as kernel modules. The NPX and VPX drivers require several extra nodes in the kernel device-tree source file with the description of accelerator resources. The drivers' DTS nodes and their configuration are described in detail below.
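If you choose the module route (=m instead of =y for the options above), the drivers can be loaded at runtime. A minimal sketch, assuming the module names follow the source file names shown earlier (the actual names are set by the drivers' Makefiles):
modprobe snps_arcsync
modprobe snps_accel_app
modprobe snps_accel_rproc
Loading snps_arcsync first is a sensible order, because the other two drivers resolve their snps,arcsync-ctrl references through it.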
If you have a kernel config for your SoC chip or board, a correct device-tree source file, a file with a prepared rootfs, and a cross toolchain, you can build a Linux image from a Linux source tree:
ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make -j8 CONFIG_INITRAMFS_SOURCE=../rootfs.cpio
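Since this guide is based on an ARC HS host, the equivalent invocation with an ARC GNU cross toolchain (assuming the arc-linux- tool prefix and the example defconfig from the source tree above) might look like:
ARCH=arc CROSS_COMPILE=arc-linux- make haps_hs_npp_defconfig
ARCH=arc CROSS_COMPILE=arc-linux- make -j8 CONFIG_INITRAMFS_SOURCE=../rootfs.cpio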
Another way is to use a build environment such as Buildroot or Yocto to simplify and automate the process of building a complete Linux system for an embedded target using cross-compilation.
The following steps describe how to add Synopsys accelerator drivers to a system Linux image.
In the Buildroot environment, load a default configuration and run make menuconfig to choose the desired Linux kernel version (6.6 is used in the example below):
make snps_archs38_haps_defconfig
make menuconfig
Set the following options:
BR2_LINUX_KERNEL_CUSTOM_VERSION=y
BR2_LINUX_KERNEL_CUSTOM_VERSION_VALUE="6.6"
If you have a patch with the drivers, put the diff patch file in the buildroot/linux/6.6 directory to allow Buildroot to apply this patch before building.
If you are using the Linux kernel tree with the integrated NPX and VPX drivers from GitHub, you can create a local.mk file in your Buildroot directory and specify the path to the directory with the kernel code:
echo "LINUX_OVERRIDE_SRCDIR=path/to/linux/kernel/src" > local.mk
Set the directory containing the accelerator firmware image(s) and extra host apps to copy to the root file system of the image:
BR2_ROOTFS_OVERLAY="$(O)/override"
Set post-build script:
One additional step is needed to copy the driver's public UAPI header to the toolchain sysroot directory, to enable the use of the application helper driver API header in user-space applications. To do this, create a post-build script in the Buildroot directory, for example:
echo '#!/bin/sh' > board/synopsys/npp/post-build.sh
echo 'cp $BUILD_DIR/linux-6.6/include/uapi/misc/snps_accel.h $STAGING_DIR/usr/include/misc' >> board/synopsys/npp/post-build.sh
chmod +x board/synopsys/npp/post-build.sh
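If the misc subdirectory does not yet exist in the sysroot, the copy may fail at build time. A slightly more defensive variant of the same script (same paths, with the target directory created first) is:
#!/bin/sh
mkdir -p "$STAGING_DIR/usr/include/misc"
cp "$BUILD_DIR"/linux-6.6/include/uapi/misc/snps_accel.h "$STAGING_DIR/usr/include/misc/"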
Then set the ROOTFS_POST_BUILD_SCRIPT option:
BR2_ROOTFS_POST_BUILD_SCRIPT="board/synopsys/npp/post-build.sh"
Ensure that you have a Linux config file for your host CPU (SoC), that this config is specified in the Buildroot config, and that the correct device-tree source file is used:
BR2_LINUX_KERNEL_DEFCONFIG="npu_platform_host_cpu_config"
If a previous build exists, clear the Linux kernel source directory so that you can apply patches in the next build:
make linux-dirclean
Run linux menuconfig and enable the accelerator drivers:
make linux-menuconfig
To enable accel drivers, select options:
SNPS_ACCEL_APP=y
SNPS_ACCEL_RPROC=y
After completing all the above changes, build the Linux image:
make
To bring up the NPX and VPX drivers, update the kernel DTS file with the proper accelerator resource description.
Each accelerator chip (SoC) with NPX and VPX cores must have its own snps_accel node with a "snps,accel" compatible label. The snps_accel_app driver creates an accelerator instance for each node.
Each accelerator node may have one or several remoteproc nodes with a description of the core or cores and the firmware that runs on these cores. Each remoteproc node must have a compatible label: "snps,npx-rproc" or "snps,vpx-rproc". Based on the properties in the remoteproc node, the snps_accel_rproc driver configures NPX or VPX groups and cores and starts execution of the firmware.
Each accelerator node may have one or several application nodes with a description of the resources accessible to the user-space drivers and user-space clients. Each application node must have a compatible label "snps,accel-app". For each node with this label, the snps_accel_app driver creates an application device instance and a character device with an index in /dev/snps/arcnet<N>/app<M>, where <M> is the device index and <N> is the index of the ARCnet (ARCSync).
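After boot, you can inspect the created device nodes from user space. For a system with one ARCnet and a single application node, the listing typically looks like this (the indices depend on the snps,arcnet-id property and the number of application nodes, both described below):
ls -l /dev/snps/arcnet0/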
For example, for a configuration with two different firmware instances for NPX cores, one firmware instance for a VPX core, and separate Linux host applications for NPX and VPX, the configuration of nodes in the DTS has the following high-level structure.
snps_accel ---
|---remoteproc_npx0
|---remoteproc_npx1
|---remoteproc_vpx
|---app_npx
|---app_vpx
In addition to the accelerator node snps_accel, the accelerator configuration in the device tree should contain two extra types of nodes: an ARCSync IP (control unit) node and an NPX configuration node.
The ARCSync node must contain a compatible "snps,arcsync" label. If the system has several ARCSync IP units, add an ARCSync node for each control unit. For each node with the "snps,arcsync" compatible label, the snps_arcsync driver creates an instance.
The snps_arcsync driver provides an internal API for the snps_accel_app and snps_accel_rproc drivers to control NPX and VPX cores through the ARCSync unit. Each remoteproc and application node must contain a property snps,arcsync-ctrl with a reference to the ARCSync node.
The NPX config node contains the address of the NPX AXI configuration interface MMIO and a set of cluster-network parameters used by the snps_accel_rproc driver to set up the NPX cluster network. The first remoteproc node for NPX processors should contain a property snps,npu-cfg with a reference to the NPX config node to perform the initial cluster-network setup.
...
#address-cells = <2>;
#size-cells = <2>;
...
snps_accel@0 {
    compatible = "snps,accel", "simple-bus";
    #address-cells = <1>;
    #size-cells = <1>;
    reg = <0x0 0x0000000 0x1 0x00000000>;
    ranges = <0x10000000 0x0 0x10000000 0x20000000>;
    ...
};
...
compatible
The top-level node with the accelerator description should contain a compatible property with the "snps,accel" and "simple-bus" labels.
- "snps,accel" - compatible string for driver matching
- "simple-bus" - bus functionality to manage child nodes
#address-cells and #size-cells
Within an accelerator node (snps_accel) and in child nodes, the device address (DA) must be used to describe shared-memory regions. The accelerator has shared memory in the 32-bit address space.
#address-cells = <1>;
#size-cells = <1>;
Set the number of address and size cells to 1 for convenience; this should be enough for a 32-bit space.
reg
The reg property describes the location of the entire accelerator shared memory in pairs of <address> and <size>. This property must contain the device address (DA), not the CPU address. The drivers don't map this entire shared-memory region; the value is used by the remoteproc and accel_app drivers to check that their shared-memory regions, specified in the remoteproc and application child nodes, are in the correct range.
ranges
The ranges property describes shared-memory DA->PA (device address to CPU address) translations. ranges is a list of address translations. Each entry in the ranges table is a tuple containing the child address, the parent address, and the size of the region in the child address space. The size of each field is determined by the child's #address-cells value, the parent's #address-cells value, and the child's #size-cells value.
The remoteproc driver reads the ranges property and calculates the offset between the device and the CPU address. This offset is needed to correctly locate the firmware loadable sections linked with the device address.
Usage examples for reg and ranges:
The system has a 64 MB shared-memory region between the accelerator and the host CPU. The accelerator and the host CPU view this memory at address 0xC0000000. Translation is one-to-one, with no offset. The description of this shared-memory region should be the following:
reg = <0x0 0xC0000000 0x0 0x4000000>;
ranges = <0xC0000000 0x0 0xC0000000 0x4000000>;
The system has a 64 MB shared-memory region between the accelerator and the host CPU. The accelerator address is 0xD0000000, but the host CPU sees the shared memory at 0xC0000000. There is a -0x10000000 device-to-CPU translation offset. The description of this shared-memory region should be the following:
reg = <0x0 0xD0000000 0x0 0x4000000>;
ranges = <0xD0000000 0x0 0xC0000000 0x4000000>;
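In the second example, the driver derives the DA-to-PA offset from the ranges entry as the parent address minus the child address: 0xC0000000 - 0xD0000000 = -0x10000000. A firmware section linked at device address 0xD0100000 is therefore loaded at physical address 0xC0100000.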
The default DTS files contain the following description:
reg = <0x0 0x00000000 0x0 0x80000000>;
ranges = <0x00000000 0x0 0x00000000 0x80000000>;
This default config tells the accelerator and remoteproc drivers to consider the first 2 GB of CPU address space as shared memory with one-to-one memory translation, and allows the drivers to use this region for the firmware code and runtime data.
remoteproc_npx0: remoteproc_npx@10000000 {
    compatible = "snps,npx-rproc";
    reg = <0x10000000 0x1000000>;
    firmware-name = "npx-app.elf";
    snps,npu-cfg = <&npu_cfg0>;
    snps,arcsync-ctrl = <&arcsync0>;
    snps,arcsync-cluster-id = <0x0>;
    snps,arcsync-core-id = <0x1>;
    snps,auto-boot;
};
compatible
Each remoteproc node for the accelerator must contain a compatible label with "snps,npx-rproc" or "snps,vpx-rproc". Use the "snps,vpx-rproc" compatible label if the remoteproc driver starts VPX cores. Use the "snps,npx-rproc" compatible label for NPX cores. For NPX cores, the driver performs additional setup of NPX groups and the NPX cluster network (if it is described in the DTS).
reg
The reg property describes shared-memory regions for firmware loadable sections. The remoteproc framework requires a static memory definition for firmware code. Memory for all loadable sections that the remoteproc driver loads must be specified in the reg property. The driver maps the regions specified in the reg property before the ELF sections load. If multiple shared-memory regions are needed for firmware, they can be added as a comma-separated array:
reg = <0x10000000 0x100000>,
<0x10200000 0x100000>,
<0x10300000 0x100000>;
firmware-name
The firmware-name property contains the firmware file name in the /lib/firmware directory. The driver looks for the firmware file with the name specified in this property. The firmware name can be changed later by writing to the remoteproc sysfs entry firmware:
$ echo new_firmware.elf > /sys/class/remoteproc/remoteproc0/firmware
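Note that the remoteproc framework accepts a new firmware name only while the core is offline, so a typical firmware reload sequence is:
echo stop > /sys/class/remoteproc/remoteproc0/state
echo new_firmware.elf > /sys/class/remoteproc/remoteproc0/firmware
echo start > /sys/class/remoteproc/remoteproc0/state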
snps,npu-cfg
The snps,npu-cfg property contains a reference (phandle) to an NPX config node. This property tells the driver to read the NPU cluster-network description from the NPX config node and perform the cluster-network setup. The NPU cluster-network setup should be done once. If the accelerator description contains several remoteproc nodes for NPX cores, only the first node should contain the snps,npu-cfg property with the NPX config node reference.
snps,cluster-restart
When present, this property instructs the driver to perform a full reset and power-down of the NPX cluster (including all groups and cores) when the control core (identified by the snps,npu-cfg property) is stopped. It also triggers a full cluster start sequence each time the control core is started. This is useful for platforms that require complete cluster shutdown and reinitialization for proper power management or restart behavior.
snps,arcsync-ctrl
The snps,arcsync-ctrl property contains a reference (phandle) to an ARCSync node. This property allows the remoteproc driver to get the ARCSync device reference from the arcsync driver and to address the specific ARCSync control unit. If several instances of ARCSync are present in the system, different accelerators can reference different ARCSync units with this property.
snps,arcsync-cluster-id
This property tells the driver the accelerator cluster identifier (cluster number) visible to the ARCSync interface. Each processor (cluster) in the system controlled by the ARCSync unit has a unique cluster identifier. The driver uses this identifier to calculate the core ID and to send control commands to the cores through the ARCSync driver. This property is not an array; only one cluster number can be specified.
For example, the NPP development platform has the following cluster identifiers: NPX - 0, VPX - 1, host CPU - 2.
snps,arcsync-core-id
The snps,arcsync-core-id property contains one or more core numbers. These numbers are core numbers inside the cluster. For example, for an NPX cluster that has one L2 core and four L1 cores, the snps,arcsync-core-id numbering is: L2 - 0, L10 - 1, L11 - 2, L12 - 3, L13 - 4.
The remoteproc driver uses the cluster-id and core-id to send control commands to a specific core through ARCSync driver function calls. The ARCSync driver calculates the effective core-id based on the cluster-id and the core-id received from the remoteproc driver.
If the firmware is single-core, specify only one number. Several core numbers can be specified only for firmware with SMP support, as the driver initializes all cores with one reset vector.
snps,auto-boot
This property tells the driver to upload the firmware and start the cores during driver startup. Without this option, the firmware can be started later by writing to the remoteproc sysfs entry state:
echo start > /sys/class/remoteproc/remoteproc0/state
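The current state can be read back from the same sysfs entry; a core started this way reports running, while a stopped core reports offline:
cat /sys/class/remoteproc/remoteproc0/state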
app_npx0: app_npx@20000000 {
    compatible = "snps,accel-app";
    reg = <0x20000000 0x10000000>;
    snps,arcsync-ctrl = <&arcsync0>;
    interrupts = <23>;
};
compatible
Each application node must contain an "snps,accel-app" compatible label. For each node with this label, the snps_accel_app driver creates an application device instance and a character device.
reg
The reg property describes a region in the accelerator shared memory reserved for the core firmware runtime (control structures, communication queues, buffers). The firmware and the user-space host application share this region. The driver reads only the first region from the reg property.
Usage notes:
The integrator is responsible for specifying the correct amount of shared memory for the firmware and the host user-space application. The size must match the amount of available shared memory and the requirements of the accelerator firmware and user-space application. A user-space application is allowed to map this region into its own address space; an incorrect value in this property can cause the user-space application to crash if it needs to access memory outside the allowed space.
snps,arcsync-ctrl
The snps,arcsync-ctrl property contains a reference (phandle) to an ARCSync node. This property allows the application driver to get an ARCSync device reference from the ARCSync driver, to address the specific ARCSync control unit, and to receive notifications about interrupts from this unit. If several ARCSync units are present in the system, different accelerators may reference different ARCSync units with this property.
interrupts
The interrupts property contains an interrupt specifier that describes the specific interrupt line. The interrupts property is a standard device-tree property; for a full technical description, see the ePAPR v1.1 specification.
The specifier in the interrupts property must match one of the IRQs in the ARCSync node. The driver doesn't install its own interrupt-service routine, but instead uses this number to request notifications from the ARCSync driver. The driver reads only the first interrupt specifier and can work with only one interrupt.
arcsync0: arcsync@d4000000 {
    compatible = "snps,arcsync";
    reg = <0x0 0xd4000000 0x0 0x1000000>;
    snps,arcnet-id = <0x0>;
    snps,host-cluster-id = <0x2>;
    interrupts = <23>, <22>;
};
compatible
The ARCSync node must contain an "snps,arcsync" compatible label. For each node with this label, the snps_arcsync driver creates a device instance.
reg
This property contains the address and size of the ARCSync MMIO region.
snps,arcnet-id
The logical ARCSync (ARCnet) identifier. This property is optional; the default value is 0. The logical ARCSync identifier is used by the application driver to create a character device in a specific ARCnet: /dev/snps/arcnet<N>/app<M>, where <N> is the index of the ARCnet (ARCSync) and <M> is the device index.
snps,host-cluster-id
This property describes the host processor (cluster) identifier as it is visible to the ARCSync interface. This value is needed for the ARCSync driver to calculate the offset in the ARCSync MMIO to manage host interrupts (for interrupt ACK).
interrupts
The interrupts property contains interrupt specifiers that describe the specific ARCSync interrupt lines. The ARCSync control unit may have several interrupt lines connected to the host CPU. The ARCSync driver reads all interrupt specifiers from the interrupts property and sets an interrupt-service routine for each interrupt.
interrupt-names
This optional property allows overriding the default naming of ARCSync interrupts. The default format is arcsyncN-irqM, where N is the ARCnet index and M is the ARCSync host IRQ index. If interrupt-names is specified, the name becomes arcsyncN-<irqname>, where <irqname> is the string provided in the property. This helps improve readability and traceability in /proc/interrupts.
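For example, a hypothetical naming for the two-interrupt ARCSync node shown above:
interrupts = <23>, <22>;
interrupt-names = "events", "errors";
With this description, the lines appear as arcsync0-events and arcsync0-errors in /proc/interrupts.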
The remoteproc driver for the NPX processor reads properties from the NPU config node and performs group setup and cluster-network setup. The driver finds this node by the phandle reference in the snps,npu-cfg property in the remoteproc node.
npu_cfg0: npu_cfg@d3000000 {
    reg = <0x0 0xd3000000 0x0 0xF4000>;
    snps,npu-slice-num = <4>;
    snps,npu-group-num = <1>;
    snps,cln-map-start = <0xE0000000>;
    snps,csm-size = <0x4000000>;
    snps,csm-banks-per-group = <8>;
    snps,stu-per-group = <2>;
    snps,cln-safety-lvl = <1>;
};
reg
This property contains the address and size of the NPX AXI configuration interface MMIO.
snps,npu-slice-num
Number of L1 slices in the NPX processor. This is the main property for the NPX remoteproc driver; the driver derives all other CLN setup options from it.
snps,npu-group-num
Number of groups in the NPX processor. If the number of groups is not specified, the driver derives it from the number of slices:
- 1 group if the number of slices <= 4
- 2 groups if 4 < the number of slices <= 8
- 4 groups if the number of slices > 8
snps,cln-map-start
The start offset (address) of DMI mappings in the cluster-network address space. The default value is 0xE0000000.
snps,csm-size
The size of the NPX Cluster Shared Memory (CSM). This property is optional; the default value is 64 MB (0x4000000).
snps,csm-banks-per-group
Number of L2 CSM banks in each group. This property is optional; the default value is 8.
snps,stu-per-group
Number of STU channels in each group. The default value is 2.
snps,cln-safety-lvl
This value tells the driver whether the NPX processor has a functional-safety extension. The default value is 1. If the L1 and L2 cores of the NPX processor don't have safety registers (no FS extension), use the snps,cln-safety-lvl option with a value of 0.
snps,skip-cln-setup
This property tells the driver to skip interconnect (CLN) setup during the initial bring-up of the NPX cluster, or during each start of the control core (identified by the snps,npu-cfg property) when the snps,cluster-restart property is present. It is intended for platforms that have default interconnect configurations pre-set in hardware.
Usage notes:
All these options affect the cluster-network configuration. In general, it is best to use the default setup values. The only mandatory parameters are reg and snps,npu-slice-num.
npu_cfg0: npu_cfg@d3000000 {
    reg = <0x0 0xd3000000 0x0 0xF4000>;
    snps,npu-slice-num = <4>;
};
The driver implements a standard CLN configuration. If this configuration is not suitable for a specific chip, some changes in the code may be needed. First, check the drivers/remoteproc/snps_accel/npx_config.c file. A good starting point is the definitions at the beginning of this file.
For example, some cluster-network properties, like group connections, are hardwired to specific ports according to the outgoing shuffle table. The driver implements the standard table, which is coded in the groups_map array. Change the values in this array if you need to change the connection map.
If the platform has the interconnect configuration pre-set in hardware, it is possible to skip interconnect setup in the driver by using the snps,skip-cln-setup option.
For some configurations, where the shared memory is part of system memory, it is necessary to exclude the accelerator shared-memory region from the system-memory pool. You can do so with the help of a reserved-memory node:
reserved-memory {
    #address-cells = <2>;
    #size-cells = <2>;
    ranges;

    reserved: buffer@10000000 {
        compatible = "removed-dma-pool";
        no-map;
        reg = <0x0 0x10000000 0x0 0x20000000>;
    };
};
This description tells the kernel memory manager that it should not use the region with PA from 0x10000000 to 0x30000000 to allocate system memory.
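You can confirm the carve-out at runtime: the region should not be listed as System RAM in /proc/iomem, and the kernel reports the reserved node at boot (exact log format varies by kernel version):
dmesg | grep -i "reserved mem"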
The accelerator helper driver supports dma-bufs and allocates contiguous buffers for them. The default Linux physical memory allocator (buddy allocator) does not allow getting large contiguous buffers. The contiguous memory allocator (CMA) lets you overcome this limitation.
The VPX and NPX drivers use the internal Linux kernel DMA framework to allocate buffers in system memory, and if CMA is enabled for DMA buffers (the Linux configuration option CONFIG_DMA_CMA is set), the driver requests buffers through the CMA.
To enable CMA, set the kernel configuration option CONFIG_DMA_CMA.
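A minimal set of related kernel options (CONFIG_DMA_CMA depends on CONFIG_CMA; exact dependencies vary by kernel version and architecture):
CONFIG_CMA=y
CONFIG_DMA_CMA=y
If no CMA region is described in the device tree, a default pool size can also be set on the kernel command line with cma=<size> (for example, cma=256M).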
Reserve some memory in the .dts file for the CMA pool. Example of a reserved-memory node in DTS:
reserved-memory {
    #address-cells = <2>;
    #size-cells = <2>;
    ranges;

    reserved: buffer@c0000000 {
        compatible = "shared-dma-pool";
        reusable;
        reg = <0x0 0xC0000000 0x0 0x10000000>;
        linux,cma-default;
    };
};
This example describes a simple configuration for the ARC host-based NPP:
- The accelerator has two L1 slices
- NPU config MMIO at 0xd3000000
- ARCSync MMIO at 0xd4000000
- ARCSync has two interrupt lines (23 and 22) connected to the host CPU
- The shared-memory region is 0x10000000 - 0x30000000
- One-to-one DA-PA translation
- The remoteproc driver starts three different firmware instances:
  - Firmware for the NPX L2 core (cluster-id 0x0, core-id 0x0) - Deployment_l2.elf, linked at base 0x10000000, size 0x01000000
  - Firmware for NPX L1 slice 0 (cluster-id 0x0, core-id 0x1) - Deployment_l1_0.elf, linked at base 0x11000000, size 0x01000000
  - Firmware for NPX L1 slice 1 (cluster-id 0x0, core-id 0x2) - Deployment_l1_1.elf, linked at base 0x12000000, size 0x01000000
- The application helper driver reserves the region 0x20000000-0x30000000 in the accelerator shared memory for access by the firmware runtime and the host user-space driver. The application helper driver receives notifications from interrupt line 23.
reserved-memory {                        <--- reserve the shared-memory area if needed,
    #address-cells = <2>;                     to exclude it from system RAM
    #size-cells = <2>;
    ranges;

    reserved: buffer@10000000 {
        compatible = "removed-dma-pool";
        no-map;
        reg = <0x0 0x10000000 0x0 0x20000000>;
    };
};
npu_cfg0: npu_cfg@d3000000 {             <--- Cluster Network config node
    reg = <0x0 0xd3000000 0x0 0xF4000>;
    snps,npu-slice-num = <2>;
    snps,skip-cln-setup;
};

arcsync0: arcsync@d4000000 {             <--- ARCSync node
    compatible = "snps,arcsync";
    reg = <0x0 0xd4000000 0x0 0x1000000>;
    snps,host-cluster-id = <0x2>;
    interrupts = <23>, <22>;
};
snps_accel@0 {                           <--- Top-level snps_accel node
    compatible = "snps,accel", "simple-bus";
    #address-cells = <1>;
    #size-cells = <1>;
    reg = <0x0 0x10000000 0x0 0x20000000>;
    ranges = <0x10000000 0x0 0x10000000 0x20000000>;

    remoteproc_npx_l2: remoteproc_npx@10000000 {   <--- Remoteproc node
        compatible = "snps,npx-rproc";
        reg = <0x10000000 0x1000000>;
        firmware-name = "Deployment_l2.elf";
        snps,npu-cfg = <&npu_cfg0>;
        snps,cluster-restart;
        snps,arcsync-ctrl = <&arcsync0>;
        snps,arcsync-cluster-id = <0x0>;
        snps,arcsync-core-id = <0x0>;
        snps,auto-boot;
    };

    remoteproc_npx_l1_0: remoteproc_npx@11000000 {
        compatible = "snps,npx-rproc";
        reg = <0x11000000 0x1000000>;
        firmware-name = "Deployment_l1_0.elf";
        snps,arcsync-ctrl = <&arcsync0>;
        snps,arcsync-cluster-id = <0x0>;
        snps,arcsync-core-id = <0x1>;
        snps,auto-boot;
    };

    remoteproc_npx_l1_1: remoteproc_npx@12000000 {
        compatible = "snps,npx-rproc";
        reg = <0x12000000 0x1000000>;
        firmware-name = "Deployment_l1_1.elf";
        snps,arcsync-ctrl = <&arcsync0>;
        snps,arcsync-cluster-id = <0x0>;
        snps,arcsync-core-id = <0x2>;
        snps,auto-boot;
    };

    app_npx: app_npx@20000000 {          <--- Application node
        compatible = "snps,accel-app";
        reg = <0x20000000 0x10000000>;
        snps,arcsync-ctrl = <&arcsync0>;
        interrupts = <23>;
    };
};
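With this description in place, a quick sanity check after boot (names as in the example above) might look like:
ls /sys/class/remoteproc/
cat /sys/class/remoteproc/remoteproc*/firmware
ls /dev/snps/arcnet0/
Three remoteproc instances and one application device should be present.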
This example describes a simple configuration for an ARC host-based NPP:
- The accelerator has two L1 slices and a VPX DSP
- NPU config MMIO at 0xd3000000
- ARCSync MMIO at 0xd4000000
- ARCSync has one interrupt line (24) connected to the host CPU
- The shared-memory region is 0x10000000 - 0x40000000
- One-to-one DA-PA translation
- The remoteproc driver starts three different firmware instances:
  - Firmware for the VPX core (cluster-id 0x1, core-id 0x0) - Deployment_vpx.elf, linked at base 0x18000000, size 0x01000000
  - Firmware for the NPX L2 core (cluster-id 0x0, core-id 0x0) - Deployment_l2.elf, linked at base 0x20000000, size 0x02000000
  - Firmware for the NPX L1 core (cluster-id 0x0, core-id 0x1) - Deployment_l1.elf, linked at base 0x10000000, size 0x02000000
- The application helper driver reserves the region 0x30000000-0x40000000 in the accelerator shared memory for access by the firmware runtime and the host user-space driver. The application helper driver receives notifications from interrupt line 24.
npu_cfg0: npu_cfg@d3000000 {
    reg = <0x0 0xd3000000 0x0 0xF4000>;
    snps,npu-slice-num = <2>;
};

arcsync0: arcsync@d4000000 {
    compatible = "snps,arcsync";
    reg = <0x0 0xd4000000 0x0 0x1000000>;
    snps,host-cluster-id = <0x2>;
    interrupts = <24>;
};

snps_accel@0 {
    compatible = "snps,accel", "simple-bus";
    #address-cells = <1>;
    #size-cells = <1>;
    reg = <0x0 0x10000000 0x0 0x30000000>;
    ranges = <0x10000000 0x0 0x10000000 0x30000000>;

    remoteproc_vpx0: remoteproc_vpx@18000000 {
        compatible = "snps,vpx-rproc";
        reg = <0x18000000 0x01000000>;
        firmware-name = "Deployment_vpx.elf";
        snps,arcsync-ctrl = <&arcsync0>;
        snps,arcsync-core-id = <0x0>;
        snps,arcsync-cluster-id = <0x1>;
        snps,auto-boot;
    };

    remoteproc_npx0: remoteproc_npx@20000000 {
        compatible = "snps,npx-rproc";
        reg = <0x20000000 0x2000000>;
        firmware-name = "Deployment_l2.elf";
        snps,npu-cfg = <&npu_cfg0>;
        snps,cluster-restart;
        snps,arcsync-ctrl = <&arcsync0>;
        snps,arcsync-core-id = <0x0>;
        snps,arcsync-cluster-id = <0x0>;
        snps,auto-boot;
    };

    remoteproc_npx1: remoteproc_npx@10000000 {
        compatible = "snps,npx-rproc";
        reg = <0x10000000 0x2000000>;
        firmware-name = "Deployment_l1.elf";
        snps,arcsync-ctrl = <&arcsync0>;
        snps,arcsync-core-id = <0x1>;
        snps,arcsync-cluster-id = <0x0>;
        snps,auto-boot;
    };

    app_npx2: app_npx@30000000 {
        compatible = "snps,accel-app";
        reg = <0x30000000 0x10000000>;
        snps,arcsync-ctrl = <&arcsync0>;
        interrupts = <24>;
    };
};