[PATCH V5 00/10] AMD XDNA driver

Lizhi Hou lizhi.hou at amd.com
Fri Oct 25 21:28:03 UTC 2024


On 10/25/24 10:55, Jeffrey Hugo wrote:
> On 10/21/2024 10:19 AM, Lizhi Hou wrote:
>> This patchset introduces a new Linux Kernel Driver, amdxdna for AMD NPUs.
>> The driver is based on Linux accel subsystem.
>>
>> NPU (Neural Processing Unit) is an AI inference accelerator integrated
>> into AMD client CPUs. NPU enables efficient execution of Machine Learning
>> applications like CNNs, LLMs, etc.  NPU is based on AMD XDNA
>> architecture [1].
>>
>> AMD NPU consists of the following components:
>>
>>    - Tiled array of AMD AI Engine processors.
>>    - Micro Controller which runs the NPU Firmware responsible for
>>      command processing, AIE array configuration, and execution management.
>>    - PCI EP for host control of the NPU device.
>>    - Interconnect for connecting the NPU components together.
>>    - SRAM for use by the NPU Firmware.
>>    - Address translation hardware for protected host memory access by
>>      the NPU.
>>
>> NPU supports multiple concurrent fully isolated contexts. Concurrent
>> contexts may be bound to AI Engine array spatially and/or temporally.
>>
>> The driver is licensed under GPL-2.0 except for UAPI header which is
>> licensed GPL-2.0 WITH Linux-syscall-note.
>>
>> User mode driver stack consists of XRT [2] and AMD AIE Plugin for IREE [3].
>>
>> The firmware for the NPU is distributed as a closed source binary, and has
>> already been pushed to the DRM firmware repository [4].
>>
>> [1]https://www.amd.com/en/technologies/xdna.html
>> [2]https://github.com/Xilinx/XRT
>> [3]https://github.com/nod-ai/iree-amd-aie
>> [4]https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu 
>>
>>
>> Changes since v4:
>> - Fix lockdep errors
>> - Use __u* structure for struct aie_error
>
> One nit, when you send the next version would you please either To: or 
> Cc: me on the entire series?  I only get pieces in my inbox which is 
> mildly annoying on my end.
Sure.
>
> Looks like we are getting close here.  One procedural question I have, 
> do you have commit permissions to drm-misc?
No, I do not have commit permissions yet.
>
> I applied the series to drm-misc-next and tried to build.  Got the 
> following errors -

Could you share the build command line, so that I can reproduce the 
errors and verify my fix?

I used "make M=drivers/accel/amdxdna" on drm-misc-next and could not 
reproduce the errors. It looks like the build robot did not complain 
about the patch either.

$ git branch
* drm-misc-next
$ make M=drivers/accel/amdxdna
   CC [M]  drivers/accel/amdxdna/aie2_ctx.o
   CC [M]  drivers/accel/amdxdna/aie2_error.o
   CC [M]  drivers/accel/amdxdna/aie2_message.o
   CC [M]  drivers/accel/amdxdna/aie2_pci.o
   CC [M]  drivers/accel/amdxdna/aie2_psp.o
   CC [M]  drivers/accel/amdxdna/aie2_smu.o
   CC [M]  drivers/accel/amdxdna/aie2_solver.o
   CC [M]  drivers/accel/amdxdna/amdxdna_ctx.o
   CC [M]  drivers/accel/amdxdna/amdxdna_gem.o
   CC [M]  drivers/accel/amdxdna/amdxdna_mailbox.o
   CC [M]  drivers/accel/amdxdna/amdxdna_mailbox_helper.o
   CC [M]  drivers/accel/amdxdna/amdxdna_pci_drv.o
   CC [M]  drivers/accel/amdxdna/amdxdna_sysfs.o
   CC [M]  drivers/accel/amdxdna/npu1_regs.o
   CC [M]  drivers/accel/amdxdna/npu2_regs.o
   CC [M]  drivers/accel/amdxdna/npu4_regs.o
   CC [M]  drivers/accel/amdxdna/npu5_regs.o
   LD [M]  drivers/accel/amdxdna/amdxdna.o
   MODPOST drivers/accel/amdxdna/Module.symvers
   CC [M]  drivers/accel/amdxdna/amdxdna.mod.o
   CC [M]  drivers/accel/amdxdna/.module-common.o
   LD [M]  drivers/accel/amdxdna/amdxdna.ko
$

>
>   CC [M]  drivers/accel/amdxdna/aie2_ctx.o
>   CC [M]  drivers/accel/amdxdna/aie2_error.o
>   CC [M]  drivers/accel/amdxdna/aie2_message.o
>   CC [M]  drivers/accel/amdxdna/aie2_pci.o
>   CC [M]  drivers/accel/amdxdna/aie2_psp.o
>   CC [M]  drivers/accel/amdxdna/aie2_smu.o
>   CC [M]  drivers/accel/amdxdna/aie2_solver.o
>   CC [M]  drivers/accel/amdxdna/amdxdna_ctx.o
>   CC [M]  drivers/accel/amdxdna/amdxdna_gem.o
>   CC [M]  drivers/accel/amdxdna/amdxdna_mailbox.o
>   CC [M]  drivers/accel/amdxdna/amdxdna_mailbox_helper.o
>   CC [M]  drivers/accel/amdxdna/amdxdna_pci_drv.o
>   CC [M]  drivers/accel/amdxdna/amdxdna_sysfs.o
>   CC [M]  drivers/accel/amdxdna/npu1_regs.o
>   CC [M]  drivers/accel/amdxdna/npu2_regs.o
>   CC [M]  drivers/accel/amdxdna/npu4_regs.o
>   CC [M]  drivers/accel/amdxdna/npu5_regs.o
>   AR      drivers/base/firmware_loader/built-in.a
>   AR      drivers/base/built-in.a
> In file included from drivers/accel/amdxdna/aie2_message.c:19:
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
>   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>       |                ^~~~~~~~~
> In file included from drivers/accel/amdxdna/amdxdna_gem.c:15:
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
>   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>       |                ^~~~~~~~~
> In file included from drivers/accel/amdxdna/aie2_psp.c:11:
> drivers/accel/amdxdna/aie2_psp.c: In function ‘psp_exec’:
> drivers/accel/amdxdna/aie2_psp.c:62:34: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
>    62 | FIELD_GET(PSP_STATUS_READY, ready),
>       |                                  ^~~~~~~~~
> ./include/linux/iopoll.h:47:21: note: in definition of macro ‘read_poll_timeout’
>    47 |                 if (cond) \
>       |                     ^~~~
> drivers/accel/amdxdna/aie2_psp.c:61:15: note: in expansion of macro ‘readx_poll_timeout’
>    61 |         ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
>       |               ^~~~~~~~~~~~~~~~~~
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
>   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>       |                        ^~~~~~~~~~
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
>   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>       |                        ^~~~~~~~~~
> In file included from drivers/accel/amdxdna/aie2_pci.c:22:
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
>   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>       |                ^~~~~~~~~
> In file included from drivers/accel/amdxdna/aie2_ctx.c:18:
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
>   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>       |                ^~~~~~~~~
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
>   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>       |                        ^~~~~~~~~~
> In file included from drivers/accel/amdxdna/amdxdna_ctx.c:16:
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
>   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>       |                ^~~~~~~~~
> cc1: all warnings being treated as errors
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
>   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>       |                        ^~~~~~~~~~
> drivers/accel/amdxdna/aie2_ctx.c: In function ‘aie2_hwctx_restart’:
> drivers/accel/amdxdna/aie2_ctx.c:114:9: error: too few arguments to function ‘drm_sched_start’
>   114 |         drm_sched_start(&hwctx->priv->sched);
>       |         ^~~~~~~~~~~~~~~
> In file included from ./include/trace/events/amdxdna.h:12,
>                  from drivers/accel/amdxdna/aie2_ctx.c:13:
> ./include/drm/gpu_scheduler.h:593:6: note: declared here
>   593 | void drm_sched_start(struct drm_gpu_scheduler *sched, int errno);
>       |      ^~~~~~~~~~~~~~~
> make[5]: *** [scripts/Makefile.build:229: drivers/accel/amdxdna/aie2_psp.o] Error 1
> make[5]: *** Waiting for unfinished jobs....
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
>   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>       |                        ^~~~~~~~~~
> In file included from drivers/accel/amdxdna/amdxdna_pci_drv.c:18:
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
>   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>       |                ^~~~~~~~~
> cc1: all warnings being treated as errors
> make[5]: *** [scripts/Makefile.build:229: drivers/accel/amdxdna/aie2_ctx.o] Error 1
> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
>   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>       |                        ^~~~~~~~~~
> drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘xdna_mailbox_send_msg’:
> drivers/accel/amdxdna/amdxdna_mailbox.c:444:26: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
>   444 |         header->sz_ver = FIELD_PREP(MSG_BODY_SZ, msg->send_size) |
>       |                          ^~~~~~~~~~
>
>
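
For reference, here is a minimal userspace sketch of what the two helpers 
named in the errors above do. In the kernel, FIELD_GET() and FIELD_PREP() 
are defined in <linux/bitfield.h>, so an implicit-declaration error like 
this usually means that header is not reaching the file in the failing 
configuration; that has not been confirmed for this build. Every SKETCH_* 
and sketch_* name and both mask values below are made up for illustration 
and are not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-ins for the kernel's FIELD_GET()/FIELD_PREP().
 * The real macros live in <linux/bitfield.h> and add compile-time checks.
 */
#define SKETCH_SHIFT(mask)           (__builtin_ffsll(mask) - 1)
#define SKETCH_FIELD_GET(mask, reg)  (((reg) & (mask)) >> SKETCH_SHIFT(mask))
#define SKETCH_FIELD_PREP(mask, val) \
	(((uint32_t)(val) << SKETCH_SHIFT(mask)) & (mask))

/* Hypothetical header layout, NOT the real AMDXDNA_CMD_* masks. */
#define SKETCH_CMD_STATE  0x0000000fu	/* bits 3:0   */
#define SKETCH_CMD_OPCODE 0x0fff0000u	/* bits 27:16 */

/* Extract the opcode field from a command header word. */
static inline uint32_t sketch_cmd_get_op(uint32_t header)
{
	return SKETCH_FIELD_GET(SKETCH_CMD_OPCODE, header);
}

/* Return the header with its state field replaced by 'state'. */
static inline uint32_t sketch_cmd_set_state(uint32_t header, uint32_t state)
{
	assert(state <= 0xfu);	/* must fit in the 4-bit sketch field */
	header &= ~SKETCH_CMD_STATE;
	header |= SKETCH_FIELD_PREP(SKETCH_CMD_STATE, state);
	return header;
}
```

With these made-up masks, sketch_cmd_get_op(0x00250003) yields 0x25 and 
sketch_cmd_set_state(0x00250003, 4) yields 0x00250004.
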
> You also have the following checkpatch issues -

Could you share the command you used?  I tried 'dim checkpatch' and it 
did not report the misspellings.


Thanks,

Lizhi

>
>
> WARNING: 'Disalbe' may be misspelled - perhaps 'Disable'?
> #1646: FILE: drivers/accel/amdxdna/amdxdna_mailbox.c:553:
> +       /* Disalbe an irq and wait. This might sleep. */
>            ^^^^^^^
>
> WARNING: 'splite' may be misspelled - perhaps 'split'?
> #1695: FILE: drivers/accel/amdxdna/amdxdna_mailbox.h:21:
> + * The mailbox will splite the sending data in to multiple firmware message if
>                      ^^^^^^
>
> WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
> #1875: FILE: drivers/accel/amdxdna/amdxdna_mailbox_helper.h:9:
> +#define TX_TIMEOUT 2000 /* miliseconds */
>                             ^^^^^^^^^^^
>
> WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
> #1876: FILE: drivers/accel/amdxdna/amdxdna_mailbox_helper.h:10:
> +#define RX_TIMEOUT 5000 /* miliseconds */
>                             ^^^^^^^^^^^
>
> total: 0 errors, 4 warnings, 0 checks, 1916 lines checked
>
> NOTE: For some of the reported defects, checkpatch may be able to
>       mechanically convert to the typical style using --fix or --fix-inplace.
>
> 0003-accel-amdxdna-Support-hardware-mailbox.patch has style problems, please review.
>
>
>
> 0007-accel-amdxdna-Add-command-execution.patch
> ----------------------------------------------
> WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
> #59: FILE: drivers/accel/amdxdna/aie2_ctx.c:27:
> +#define HWCTX_MAX_TIMEOUT      60000 /* miliseconds */
>                                          ^^^^^^^^^^^
>
> WARNING: 'reverve' may be misspelled - perhaps 'reserve'?
> #612: FILE: drivers/accel/amdxdna/aie2_ctx.c:779:
> +               XDNA_WARN(xdna, "Failed to reverve fence, ret %d", ret);
>                                            ^^^^^^^
>
> WARNING: 'Exectuion' may be misspelled - perhaps 'Execution'?
> #1899: FILE: drivers/accel/amdxdna/amdxdna_pci_drv.c:139:
> +       /* Exectuion */
>            ^^^^^^^^^
>
> WARNING: 'exectuion' may be misspelled - perhaps 'execution'?
> #2113: FILE: include/uapi/drm/amdxdna_accel.h:239:
> + * struct amdxdna_drm_wait_cmd - Wait exectuion command.
>                                        ^^^^^^^^^
>
> total: 0 errors, 10 warnings, 0 checks, 1983 lines checked
>
> NOTE: For some of the reported defects, checkpatch may be able to
>       mechanically convert to the typical style using --fix or --fix-inplace.
>
> 0007-accel-amdxdna-Add-command-execution.patch has style problems, please review.
>
>
> 0008-accel-amdxdna-Add-suspend-and-resume.patch
> -----------------------------------------------
> WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
> #163: FILE: drivers/accel/amdxdna/amdxdna_pci_drv.c:22:
> +#define AMDXDNA_AUTOSUSPEND_DELAY      5000 /* miliseconds */
>                                                 ^^^^^^^^^^^
>
> total: 0 errors, 1 warnings, 0 checks, 302 lines checked
>
> NOTE: For some of the reported defects, checkpatch may be able to
>       mechanically convert to the typical style using --fix or --fix-inplace.
>
> 0008-accel-amdxdna-Add-suspend-and-resume.patch has style problems, please review.
>
>
> -Jeff

