[EXTERNAL] Re: Code Review Request for AMDGPU Hotplug Support

Shuotao Xu shuotaoxu at microsoft.com
Wed Apr 6 14:25:59 UTC 2022


Hi Andrey,

We just tried kernel 5.16 based on the amd-staging-drm-next branch of https://gitlab.freedesktop.org/agd5f/linux.git, and found that hotplug did not work out of the box for the ROCm compute stack.
We did not try the rendering stack since we currently are more focused on AI workloads.

We have also created a patch against the amd-staging-drm-next branch to enable hotplug for the ROCm stack, which was sent in a separate, later email with the same subject. I am also attaching the patch to this email, in case you would like to delete that later email.

Best regards,
Shuotao

From: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
Date: Wednesday, April 6, 2022 at 10:13 PM
To: Shuotao Xu <shuotaoxu at microsoft.com>, amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
Cc: Ziyue Yang <Ziyue.Yang at microsoft.com>, Lei Qu <Lei.Qu at microsoft.com>, Peng Cheng <pengc at microsoft.com>, Ran Shu <Ran.Shu at microsoft.com>
Subject: [EXTERNAL] Re: Code Review Request for AMDGPU Hotplug Support

It looks like you are using the 5.13 kernel for this work. FYI, we added
hot plug support for the graphics stack in the 5.14 kernel (see
https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.14-AMDGPU-Hot-Unplug).


I am not sure about the code part since it all touches the KFD driver (the KFD
team can comment on that), but I was wondering: if you try the 5.14
kernel, would things just work for you out of the box?

Andrey

On 2022-04-05 22:45, Shuotao Xu wrote:
> Dear AMD Colleagues,
>
> We are from Microsoft Research, and are working on GPU disaggregation
> technology.
>
> We have created a new pull request, "Add PCIe hotplug support for amdgpu" by
> xushuotao (Pull Request #131, RadeonOpenCompute/ROCK-Kernel-Driver,
> <https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/pull/131>), in
> ROCK-Kernel-Driver, which will enable PCIe hot-plug support for amdgpu.
>
> We believe that hot-plug support for GPU devices can open doors for
> many advanced applications in the data center in the next few years, and we
> would like to have some reviewers on this PR so we can continue further
> technical discussions around this feature.
>
> Would you please help review this PR?
>
> Thank you very much!
>
> Best regards,
>
> Shuotao Xu
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-amdkfd-Add-PCIe-Hotplug-Support-for-AMDGPU.patch
Type: application/octet-stream
Size: 4008 bytes
Desc: 0001-drm-amdkfd-Add-PCIe-Hotplug-Support-for-AMDGPU.patch
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20220406/6f8711de/attachment.obj>

