[PATCH V5 00/10] AMD XDNA driver
Lizhi Hou
lizhi.hou at amd.com
Tue Oct 29 15:24:37 UTC 2024
On 10/25/24 15:02, Jeffrey Hugo wrote:
> On 10/25/2024 3:28 PM, Lizhi Hou wrote:
>>
>> On 10/25/24 10:55, Jeffrey Hugo wrote:
>>> On 10/21/2024 10:19 AM, Lizhi Hou wrote:
>>>> This patchset introduces a new Linux Kernel Driver, amdxdna for AMD NPUs.
>>>> The driver is based on Linux accel subsystem.
>>>>
>>>> NPU (Neural Processing Unit) is an AI inference accelerator integrated
>>>> into AMD client CPUs. NPU enables efficient execution of Machine Learning
>>>> applications like CNNs, LLMs, etc. NPU is based on AMD XDNA
>>>> architecture [1].
>>>>
>>>> AMD NPU consists of the following components:
>>>>
>>>> - Tiled array of AMD AI Engine processors.
>>>> - Micro Controller which runs the NPU Firmware responsible for
>>>>   command processing, AIE array configuration, and execution
>>>>   management.
>>>> - PCI EP for host control of the NPU device.
>>>> - Interconnect for connecting the NPU components together.
>>>> - SRAM for use by the NPU Firmware.
>>>> - Address translation hardware for protected host memory access
>>>>   by the NPU.
>>>>
>>>> NPU supports multiple concurrent fully isolated contexts. Concurrent
>>>> contexts may be bound to AI Engine array spatially and/or temporally.
>>>>
>>>> The driver is licensed under GPL-2.0 except for UAPI header which is
>>>> licensed GPL-2.0 WITH Linux-syscall-note.
>>>>
>>>> User mode driver stack consists of XRT [2] and AMD AIE Plugin for
>>>> IREE [3].
>>>>
>>>> The firmware for the NPU is distributed as a closed source binary, and
>>>> has already been pushed to the DRM firmware repository [4].
>>>>
>>>> [1]https://www.amd.com/en/technologies/xdna.html
>>>> [2]https://github.com/Xilinx/XRT
>>>> [3]https://github.com/nod-ai/iree-amd-aie
>>>> [4]https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu
>>>>
>>>>
>>>> Changes since v4:
>>>> - Fix lockdep errors
>>>> - Use __u* structure for struct aie_error
>>>
>>> One nit, when you send the next version would you please either To:
>>> or Cc: me on the entire series? I only get pieces in my inbox which
>>> is mildly annoying on my end.
>> Sure.
>>>
>>> Looks like we are getting close here. One procedural question I
>>> have, do you have commit permissions to drm-misc?
>> No, I do not have commit permissions yet.
>
> You should apply for access. Assuming this series is ready before
> that goes through, I'll apply it.
>
>>> I applied the series to drm-misc-next and tried to build. Got the
>>> following errors -
>>
>> Could you share the build command line? So I can reproduce and verify
>> my fix.
>
> The command is simple:
> make -j20
>
> The system details, in case they somehow matter:
> Ubuntu 22.04 w/ 5.15 kernel
>
> $ lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description: Ubuntu 22.04.3 LTS
> Release: 22.04
> Codename: jammy
>
> $ uname -a
> Linux jhugo-lnx 5.15.0-89-generic #99-Ubuntu SMP Mon Oct 30 20:42:41 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
>
> The kernel config is probably the relevant piece. When I first built
> after applying the series, I was asked to choose what to do with the
> new config item. I selected =m.
> .config can be found at
> https://gist.github.com/quic-jhugo/4cc249b1e3ba127039fbc709a513a432
>
>>
>> I used "make M=drivers/accel/amdxdna" and could not reproduce the error
>> with drm-misc-next. It looks like the build robot did not complain about
>> the patch either.
>>
>> $ git branch
>> * drm-misc-next
>> $ make M=drivers/accel/amdxdna
>> CC [M] drivers/accel/amdxdna/aie2_ctx.o
>> CC [M] drivers/accel/amdxdna/aie2_error.o
>> CC [M] drivers/accel/amdxdna/aie2_message.o
>> CC [M] drivers/accel/amdxdna/aie2_pci.o
>> CC [M] drivers/accel/amdxdna/aie2_psp.o
>> CC [M] drivers/accel/amdxdna/aie2_smu.o
>> CC [M] drivers/accel/amdxdna/aie2_solver.o
>> CC [M] drivers/accel/amdxdna/amdxdna_ctx.o
>> CC [M] drivers/accel/amdxdna/amdxdna_gem.o
>> CC [M] drivers/accel/amdxdna/amdxdna_mailbox.o
>> CC [M] drivers/accel/amdxdna/amdxdna_mailbox_helper.o
>> CC [M] drivers/accel/amdxdna/amdxdna_pci_drv.o
>> CC [M] drivers/accel/amdxdna/amdxdna_sysfs.o
>> CC [M] drivers/accel/amdxdna/npu1_regs.o
>> CC [M] drivers/accel/amdxdna/npu2_regs.o
>> CC [M] drivers/accel/amdxdna/npu4_regs.o
>> CC [M] drivers/accel/amdxdna/npu5_regs.o
>> LD [M] drivers/accel/amdxdna/amdxdna.o
>> MODPOST drivers/accel/amdxdna/Module.symvers
>> CC [M] drivers/accel/amdxdna/amdxdna.mod.o
>> CC [M] drivers/accel/amdxdna/.module-common.o
>> LD [M] drivers/accel/amdxdna/amdxdna.ko
>> $
>>
>>>
>>> CC [M] drivers/accel/amdxdna/aie2_ctx.o
>>> CC [M] drivers/accel/amdxdna/aie2_error.o
>>> CC [M] drivers/accel/amdxdna/aie2_message.o
>>> CC [M] drivers/accel/amdxdna/aie2_pci.o
>>> CC [M] drivers/accel/amdxdna/aie2_psp.o
>>> CC [M] drivers/accel/amdxdna/aie2_smu.o
>>> CC [M] drivers/accel/amdxdna/aie2_solver.o
>>> CC [M] drivers/accel/amdxdna/amdxdna_ctx.o
>>> CC [M] drivers/accel/amdxdna/amdxdna_gem.o
>>> CC [M] drivers/accel/amdxdna/amdxdna_mailbox.o
>>> CC [M] drivers/accel/amdxdna/amdxdna_mailbox_helper.o
>>> CC [M] drivers/accel/amdxdna/amdxdna_pci_drv.o
>>> CC [M] drivers/accel/amdxdna/amdxdna_sysfs.o
>>> CC [M] drivers/accel/amdxdna/npu1_regs.o
>>> CC [M] drivers/accel/amdxdna/npu2_regs.o
>>> CC [M] drivers/accel/amdxdna/npu4_regs.o
>>> CC [M] drivers/accel/amdxdna/npu5_regs.o
>>> AR drivers/base/firmware_loader/built-in.a
>>> AR drivers/base/built-in.a
>>> In file included from drivers/accel/amdxdna/aie2_message.c:19:
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>>> declaration of function ‘FIELD_GET’
>>> [-Werror=implicit-function-declaration]
>>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>>> | ^~~~~~~~~
>>> In file included from drivers/accel/amdxdna/amdxdna_gem.c:15:
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>>> declaration of function ‘FIELD_GET’
>>> [-Werror=implicit-function-declaration]
>>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>>> | ^~~~~~~~~
>>> In file included from drivers/accel/amdxdna/aie2_psp.c:11:
>>> drivers/accel/amdxdna/aie2_psp.c: In function ‘psp_exec’:
>>> drivers/accel/amdxdna/aie2_psp.c:62:34: error: implicit declaration
>>> of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
>>> 62 | FIELD_GET(PSP_STATUS_READY, ready),
>>> | ^~~~~~~~~
>>> ./include/linux/iopoll.h:47:21: note: in definition of macro
>>> ‘read_poll_timeout’
>>> 47 | if (cond) \
>>> | ^~~~
>>> drivers/accel/amdxdna/aie2_psp.c:61:15: note: in expansion of macro
>>> ‘readx_poll_timeout’
>>> 61 | ret = readx_poll_timeout(readl, PSP_REG(psp,
>>> PSP_STATUS_REG), ready,
>>> | ^~~~~~~~~~~~~~~~~~
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function
>>> ‘amdxdna_cmd_set_state’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>>> declaration of function ‘FIELD_PREP’
>>> [-Werror=implicit-function-declaration]
>>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>>> | ^~~~~~~~~~
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function
>>> ‘amdxdna_cmd_set_state’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>>> declaration of function ‘FIELD_PREP’
>>> [-Werror=implicit-function-declaration]
>>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>>> | ^~~~~~~~~~
>>> In file included from drivers/accel/amdxdna/aie2_pci.c:22:
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>>> declaration of function ‘FIELD_GET’
>>> [-Werror=implicit-function-declaration]
>>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>>> | ^~~~~~~~~
>>> In file included from drivers/accel/amdxdna/aie2_ctx.c:18:
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>>> declaration of function ‘FIELD_GET’
>>> [-Werror=implicit-function-declaration]
>>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>>> | ^~~~~~~~~
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function
>>> ‘amdxdna_cmd_set_state’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>>> declaration of function ‘FIELD_PREP’
>>> [-Werror=implicit-function-declaration]
>>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>>> | ^~~~~~~~~~
>>> In file included from drivers/accel/amdxdna/amdxdna_ctx.c:16:
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>>> declaration of function ‘FIELD_GET’
>>> [-Werror=implicit-function-declaration]
>>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>>> | ^~~~~~~~~
>>> cc1: all warnings being treated as errors
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function
>>> ‘amdxdna_cmd_set_state’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>>> declaration of function ‘FIELD_PREP’
>>> [-Werror=implicit-function-declaration]
>>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>>> | ^~~~~~~~~~
>>> drivers/accel/amdxdna/aie2_ctx.c: In function ‘aie2_hwctx_restart’:
>>> drivers/accel/amdxdna/aie2_ctx.c:114:9: error: too few arguments to
>>> function ‘drm_sched_start’
>>> 114 | drm_sched_start(&hwctx->priv->sched);
>>> | ^~~~~~~~~~~~~~~
>>> In file included from ./include/trace/events/amdxdna.h:12,
>>> from drivers/accel/amdxdna/aie2_ctx.c:13:
>>> ./include/drm/gpu_scheduler.h:593:6: note: declared here
>>> 593 | void drm_sched_start(struct drm_gpu_scheduler *sched, int
>>> errno);
>>> | ^~~~~~~~~~~~~~~
>>> make[5]: *** [scripts/Makefile.build:229:
>>> drivers/accel/amdxdna/aie2_psp.o] Error 1
>>> make[5]: *** Waiting for unfinished jobs....
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function
>>> ‘amdxdna_cmd_set_state’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>>> declaration of function ‘FIELD_PREP’
>>> [-Werror=implicit-function-declaration]
>>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>>> | ^~~~~~~~~~
>>> In file included from drivers/accel/amdxdna/amdxdna_pci_drv.c:18:
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>>> declaration of function ‘FIELD_GET’
>>> [-Werror=implicit-function-declaration]
>>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>>> | ^~~~~~~~~
>>> cc1: all warnings being treated as errors
>>> make[5]: *** [scripts/Makefile.build:229:
>>> drivers/accel/amdxdna/aie2_ctx.o] Error 1
>>> drivers/accel/amdxdna/amdxdna_ctx.h: In function
>>> ‘amdxdna_cmd_set_state’:
>>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>>> declaration of function ‘FIELD_PREP’
>>> [-Werror=implicit-function-declaration]
>>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>>> | ^~~~~~~~~~
>>> drivers/accel/amdxdna/amdxdna_mailbox.c: In function
>>> ‘xdna_mailbox_send_msg’:
>>> drivers/accel/amdxdna/amdxdna_mailbox.c:444:26: error: implicit
>>> declaration of function ‘FIELD_PREP’
>>> [-Werror=implicit-function-declaration]
>>> 444 | header->sz_ver = FIELD_PREP(MSG_BODY_SZ,
>>> msg->send_size) |
>>> | ^~~~~~~~~~
>>>
>>>
>>> You also have the following checkpatch issues -
>>
>> Could you share the command you used? I tried to use 'dim
>> checkpatch' and it did not find the misspelling issue.
>
> ./scripts/checkpatch.pl --strict --codespell *.patch
>
> Note, --codespell requires some local setup. I believe the comments
> in the checkpatch.pl script are fairly straightforward. I use a copy
> of the database from the github that is rather recent. The Ubuntu
> distro package is really out of date and I don't think I looked to see
> if there is a python pip version. Grabbing the one file from the
> github repo seemed simple enough.
I was able to reproduce with your suggestions. Thanks a lot.
Lizhi
>
> -Jeff
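
For reference, FIELD_GET() and FIELD_PREP() are defined in
<linux/bitfield.h>, so the "implicit declaration" errors quoted above are
what one would expect if that header is not reached through some other
include under the failing config. The drm_sched_start() error is separate:
the declaration quoted in the log takes a second "int errno" argument.
Below is a minimal user-space sketch of what the two bitfield macros
compute. It is not the driver code: every sketch_* / SKETCH_* name is
hypothetical, the mask values are assumptions (the real ones live in
amdxdna_ctx.h), and clearing the field before setting it is also an
assumption about the helper.

```c
/*
 * User-space sketch of the kernel's FIELD_GET()/FIELD_PREP() semantics.
 * In kernel code the real macros come from <linux/bitfield.h>.
 * All names and mask values below are hypothetical, for illustration only.
 */
#include <assert.h>
#include <stdint.h>

/* Contiguous 32-bit mask covering bits h..l, similar to GENMASK(h, l). */
#define SKETCH_GENMASK(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Assumed field layout; NOT taken from the driver. */
#define SKETCH_CMD_STATE	SKETCH_GENMASK(3, 0)
#define SKETCH_CMD_OPCODE	SKETCH_GENMASK(27, 23)

/* Index of the lowest set bit of a mask. The mask must be non-zero. */
static inline unsigned int sketch_shift(uint32_t mask)
{
	unsigned int shift = 0;

	while (!(mask & 1u)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Like FIELD_PREP(mask, val): move val into the field's bit position. */
static inline uint32_t sketch_field_prep(uint32_t mask, uint32_t val)
{
	return (val << sketch_shift(mask)) & mask;
}

/* Like FIELD_GET(mask, reg): extract the field and shift it down to bit 0. */
static inline uint32_t sketch_field_get(uint32_t mask, uint32_t reg)
{
	return (reg & mask) >> sketch_shift(mask);
}

/* Mirrors the shape of amdxdna_cmd_get_op() seen in the error log. */
static inline uint32_t sketch_cmd_get_op(uint32_t header)
{
	return sketch_field_get(SKETCH_CMD_OPCODE, header);
}

/* Mirrors the shape of amdxdna_cmd_set_state(); returns the new header. */
static inline uint32_t sketch_cmd_set_state(uint32_t header, uint32_t s)
{
	header &= ~SKETCH_CMD_STATE;	/* assumed: clear the old state first */
	header |= sketch_field_prep(SKETCH_CMD_STATE, s);
	return header;
}
```

With the assumed layout, an opcode of 5 is stored in bits 27..23 of the
header, and sketch_cmd_get_op() returns 5 for that header.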