[PATCH V5 00/10] AMD XDNA driver
Jeffrey Hugo
quic_jhugo at quicinc.com
Fri Oct 25 22:02:13 UTC 2024
On 10/25/2024 3:28 PM, Lizhi Hou wrote:
>
> On 10/25/24 10:55, Jeffrey Hugo wrote:
>> On 10/21/2024 10:19 AM, Lizhi Hou wrote:
>>> This patchset introduces a new Linux Kernel Driver, amdxdna for AMD
>>> NPUs.
>>> The driver is based on Linux accel subsystem.
>>>
>>> NPU (Neural Processing Unit) is an AI inference accelerator integrated
>>> into AMD client CPUs. NPU enables efficient execution of Machine
>>> Learning
>>> applications like CNNs, LLMs, etc. NPU is based on AMD XDNA
>>> architecture [1].
>>>
>>> AMD NPU consists of the following components:
>>>
>>> - Tiled array of AMD AI Engine processors.
>>> - Micro Controller which runs the NPU Firmware responsible for
>>> command processing, AIE array configuration, and execution
>>> management.
>>> - PCI EP for host control of the NPU device.
>>> - Interconnect for connecting the NPU components together.
>>> - SRAM for use by the NPU Firmware.
>>> - Address translation hardware for protected host memory access by
>>> the
>>> NPU.
>>>
>>> NPU supports multiple concurrent fully isolated contexts. Concurrent
>>> contexts may be bound to AI Engine array spatially and/or temporally.
>>>
>>> The driver is licensed under GPL-2.0 except for UAPI header which is
>>> licensed GPL-2.0 WITH Linux-syscall-note.
>>>
>>> User mode driver stack consists of XRT [2] and AMD AIE Plugin for
>>> IREE [3].
>>>
>>> The firmware for the NPU is distributed as a closed source binary,
>>> and has
>>> already been pushed to the DRM firmware repository [4].
>>>
>>> [1]https://www.amd.com/en/technologies/xdna.html
>>> [2]https://github.com/Xilinx/XRT
>>> [3]https://github.com/nod-ai/iree-amd-aie
>>> [4]https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu
>>>
>>>
>>> Changes since v4:
>>> - Fix lockdep errors
>>> - Use __u* structure for struct aie_error
>>
>> One nit, when you send the next version would you please either To: or
>> Cc: me on the entire series? I only get pieces in my inbox which is
>> mildly annoying on my end.
> Sure.
>>
>> Looks like we are getting close here. One procedural question I have,
>> do you have commit permissions to drm-misc?
> No, I do not have commit permissions yet.
You should apply for access. Assuming this series is ready before that
goes through, I'll apply it.
>> I applied the series to drm-misc-next and tried to build. Got the
>> following errors -
>
> Could you share the build command line, so I can reproduce and verify my
> fix?
The command is simple:
make -j20
The system details, in case they somehow matter:
Ubuntu 22.04 w/ 5.15 kernel
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
$ uname -a
Linux jhugo-lnx 5.15.0-89-generic #99-Ubuntu SMP Mon Oct 30 20:42:41 UTC
2023 x86_64 x86_64 x86_64 GNU/Linux
The kernel config is probably the relevant piece. When I first built
after applying the series, I was asked to choose what to do with the new
config item. I selected =m.
.config can be found at
https://gist.github.com/quic-jhugo/4cc249b1e3ba127039fbc709a513a432
>
> I used "make M=drivers/accel/amdxdna" and did not reproduce the error
> with drm-misc-next. It looks like the build robot did not complain about
> the patch either.
>
> $ git branch
> * drm-misc-next
> $ make M=drivers/accel/amdxdna
> CC [M] drivers/accel/amdxdna/aie2_ctx.o
> CC [M] drivers/accel/amdxdna/aie2_error.o
> CC [M] drivers/accel/amdxdna/aie2_message.o
> CC [M] drivers/accel/amdxdna/aie2_pci.o
> CC [M] drivers/accel/amdxdna/aie2_psp.o
> CC [M] drivers/accel/amdxdna/aie2_smu.o
> CC [M] drivers/accel/amdxdna/aie2_solver.o
> CC [M] drivers/accel/amdxdna/amdxdna_ctx.o
> CC [M] drivers/accel/amdxdna/amdxdna_gem.o
> CC [M] drivers/accel/amdxdna/amdxdna_mailbox.o
> CC [M] drivers/accel/amdxdna/amdxdna_mailbox_helper.o
> CC [M] drivers/accel/amdxdna/amdxdna_pci_drv.o
> CC [M] drivers/accel/amdxdna/amdxdna_sysfs.o
> CC [M] drivers/accel/amdxdna/npu1_regs.o
> CC [M] drivers/accel/amdxdna/npu2_regs.o
> CC [M] drivers/accel/amdxdna/npu4_regs.o
> CC [M] drivers/accel/amdxdna/npu5_regs.o
> LD [M] drivers/accel/amdxdna/amdxdna.o
> MODPOST drivers/accel/amdxdna/Module.symvers
> CC [M] drivers/accel/amdxdna/amdxdna.mod.o
> CC [M] drivers/accel/amdxdna/.module-common.o
> LD [M] drivers/accel/amdxdna/amdxdna.ko
> $
>
>>
>> CC [M] drivers/accel/amdxdna/aie2_ctx.o
>> CC [M] drivers/accel/amdxdna/aie2_error.o
>> CC [M] drivers/accel/amdxdna/aie2_message.o
>> CC [M] drivers/accel/amdxdna/aie2_pci.o
>> CC [M] drivers/accel/amdxdna/aie2_psp.o
>> CC [M] drivers/accel/amdxdna/aie2_smu.o
>> CC [M] drivers/accel/amdxdna/aie2_solver.o
>> CC [M] drivers/accel/amdxdna/amdxdna_ctx.o
>> CC [M] drivers/accel/amdxdna/amdxdna_gem.o
>> CC [M] drivers/accel/amdxdna/amdxdna_mailbox.o
>> CC [M] drivers/accel/amdxdna/amdxdna_mailbox_helper.o
>> CC [M] drivers/accel/amdxdna/amdxdna_pci_drv.o
>> CC [M] drivers/accel/amdxdna/amdxdna_sysfs.o
>> CC [M] drivers/accel/amdxdna/npu1_regs.o
>> CC [M] drivers/accel/amdxdna/npu2_regs.o
>> CC [M] drivers/accel/amdxdna/npu4_regs.o
>> CC [M] drivers/accel/amdxdna/npu5_regs.o
>> AR drivers/base/firmware_loader/built-in.a
>> AR drivers/base/built-in.a
>> In file included from drivers/accel/amdxdna/aie2_message.c:19:
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>> declaration of function ‘FIELD_GET’
>> [-Werror=implicit-function-declaration]
>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>> | ^~~~~~~~~
>> In file included from drivers/accel/amdxdna/amdxdna_gem.c:15:
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>> declaration of function ‘FIELD_GET’
>> [-Werror=implicit-function-declaration]
>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>> | ^~~~~~~~~
>> In file included from drivers/accel/amdxdna/aie2_psp.c:11:
>> drivers/accel/amdxdna/aie2_psp.c: In function ‘psp_exec’:
>> drivers/accel/amdxdna/aie2_psp.c:62:34: error: implicit declaration of
>> function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
>> 62 | FIELD_GET(PSP_STATUS_READY, ready),
>> | ^~~~~~~~~
>> ./include/linux/iopoll.h:47:21: note: in definition of macro
>> ‘read_poll_timeout’
>> 47 | if (cond) \
>> | ^~~~
>> drivers/accel/amdxdna/aie2_psp.c:61:15: note: in expansion of macro
>> ‘readx_poll_timeout’
>> 61 | ret = readx_poll_timeout(readl, PSP_REG(psp,
>> PSP_STATUS_REG), ready,
>> | ^~~~~~~~~~~~~~~~~~
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>> declaration of function ‘FIELD_PREP’
>> [-Werror=implicit-function-declaration]
>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>> | ^~~~~~~~~~
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>> declaration of function ‘FIELD_PREP’
>> [-Werror=implicit-function-declaration]
>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>> | ^~~~~~~~~~
>> In file included from drivers/accel/amdxdna/aie2_pci.c:22:
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>> declaration of function ‘FIELD_GET’
>> [-Werror=implicit-function-declaration]
>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>> | ^~~~~~~~~
>> In file included from drivers/accel/amdxdna/aie2_ctx.c:18:
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>> declaration of function ‘FIELD_GET’
>> [-Werror=implicit-function-declaration]
>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>> | ^~~~~~~~~
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>> declaration of function ‘FIELD_PREP’
>> [-Werror=implicit-function-declaration]
>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>> | ^~~~~~~~~~
>> In file included from drivers/accel/amdxdna/amdxdna_ctx.c:16:
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>> declaration of function ‘FIELD_GET’
>> [-Werror=implicit-function-declaration]
>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>> | ^~~~~~~~~
>> cc1: all warnings being treated as errors
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>> declaration of function ‘FIELD_PREP’
>> [-Werror=implicit-function-declaration]
>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>> | ^~~~~~~~~~
>> drivers/accel/amdxdna/aie2_ctx.c: In function ‘aie2_hwctx_restart’:
>> drivers/accel/amdxdna/aie2_ctx.c:114:9: error: too few arguments to
>> function ‘drm_sched_start’
>> 114 | drm_sched_start(&hwctx->priv->sched);
>> | ^~~~~~~~~~~~~~~
>> In file included from ./include/trace/events/amdxdna.h:12,
>> from drivers/accel/amdxdna/aie2_ctx.c:13:
>> ./include/drm/gpu_scheduler.h:593:6: note: declared here
>> 593 | void drm_sched_start(struct drm_gpu_scheduler *sched, int errno);
>> | ^~~~~~~~~~~~~~~
>> make[5]: *** [scripts/Makefile.build:229:
>> drivers/accel/amdxdna/aie2_psp.o] Error 1
>> make[5]: *** Waiting for unfinished jobs....
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>> declaration of function ‘FIELD_PREP’
>> [-Werror=implicit-function-declaration]
>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>> | ^~~~~~~~~~
>> In file included from drivers/accel/amdxdna/amdxdna_pci_drv.c:18:
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit
>> declaration of function ‘FIELD_GET’
>> [-Werror=implicit-function-declaration]
>> 112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
>> | ^~~~~~~~~
>> cc1: all warnings being treated as errors
>> make[5]: *** [scripts/Makefile.build:229:
>> drivers/accel/amdxdna/aie2_ctx.o] Error 1
>> drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
>> drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit
>> declaration of function ‘FIELD_PREP’
>> [-Werror=implicit-function-declaration]
>> 121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
>> | ^~~~~~~~~~
>> drivers/accel/amdxdna/amdxdna_mailbox.c: In function
>> ‘xdna_mailbox_send_msg’:
>> drivers/accel/amdxdna/amdxdna_mailbox.c:444:26: error: implicit
>> declaration of function ‘FIELD_PREP’
>> [-Werror=implicit-function-declaration]
>> 444 | header->sz_ver = FIELD_PREP(MSG_BODY_SZ,
>> msg->send_size) |
>> | ^~~~~~~~~~
>>
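For reference, in the kernel FIELD_GET() and FIELD_PREP() are defined in
<linux/bitfield.h>, so the "implicit declaration" errors above are what
you would expect when that header is not included, either directly or
through another header. Whether it gets pulled in transitively can depend
on the config, which may be why not every build sees this (that part is a
guess). Below is a minimal userspace sketch of what the two macros do.
The DEMO_* names and the bit positions are hypothetical stand-ins, not the
driver's real AMDXDNA_CMD_* layout, and the macros are simplified
approximations of the kernel ones.

```c
#include <stdint.h>

/* Userspace stand-in for GENMASK(): bits h..l set, 32-bit only. */
#define DEMO_GENMASK(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Lowest set bit of the mask; multiplying or dividing by it shifts a value
 * into or out of the field without a separate shift constant. */
#define DEMO_FIELD_LSB(mask)       ((mask) & (~(mask) + 1))

/* Simplified approximations of FIELD_PREP() and FIELD_GET(). */
#define DEMO_FIELD_PREP(mask, val) \
	(((uint32_t)(val) * DEMO_FIELD_LSB(mask)) & (mask))
#define DEMO_FIELD_GET(mask, reg) \
	(((uint32_t)(reg) & (mask)) / DEMO_FIELD_LSB(mask))

/* Hypothetical header layout, chosen only for illustration. */
#define DEMO_CMD_STATE   DEMO_GENMASK(3, 0)
#define DEMO_CMD_OPCODE  DEMO_GENMASK(26, 16)

/* Extract the opcode field from a command header word. */
static inline uint32_t demo_cmd_get_op(uint32_t header)
{
	return DEMO_FIELD_GET(DEMO_CMD_OPCODE, header);
}

/* Clear the state field, then store the new state in it. */
static inline uint32_t demo_cmd_set_state(uint32_t header, uint32_t s)
{
	header &= ~DEMO_CMD_STATE;
	header |= DEMO_FIELD_PREP(DEMO_CMD_STATE, s);
	return header;
}
```

In the driver itself the fix would presumably be an explicit
#include <linux/bitfield.h> in the files that use these macros, rather
than anything like the stand-ins above.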
>>
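The drm_sched_start() error is a separate problem: the prototype quoted in
the log takes a second int argument, while the call at aie2_ctx.c:114
passes only the scheduler. Here is a small stubbed sketch of the shape of
the change. The demo_* types and functions are hypothetical userspace
stand-ins, and I am assuming the extra argument is an error code to report
for pending work, with 0 meaning "no error". That is my reading of the
prototype, not something the log confirms.

```c
/* Stub standing in for the kernel's struct drm_gpu_scheduler. */
struct demo_gpu_scheduler {
	int started;      /* set once the scheduler has been restarted */
	int fence_error;  /* error that would be reported for pending work */
};

/* Mirrors the two-argument shape of the prototype quoted in the log. */
static inline void demo_sched_start(struct demo_gpu_scheduler *sched, int err)
{
	sched->fence_error = err;
	sched->started = 1;
}

/* Hypothetical restart helper: the caller now has to supply the error
 * value explicitly; 0 is assumed to mean "restart without an error". */
static inline void demo_hwctx_restart(struct demo_gpu_scheduler *sched)
{
	demo_sched_start(sched, 0);
}
```

If that assumption holds, the call in aie2_hwctx_restart() would just grow
a second argument; please check the scheduler documentation in
drm-misc-next for the exact semantics.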
>> You also have the following checkpatch issues -
>
> Could you share the command you used? I tried 'dim checkpatch' and it
> did not find the misspelling issue.
./scripts/checkpatch.pl --strict --codespell *.patch
Note, --codespell requires some local setup. I believe the comments in
the checkpatch.pl script are fairly straightforward. I use a rather
recent copy of the database from the GitHub repo. The Ubuntu distro
package is really out of date, and I don't think I looked to see if there
is a Python pip version. Grabbing the one file from the GitHub repo
seemed simple enough.
-Jeff