[PATCH v2 1/3] drm/amdgpu: Don't pin VRAM without DMABUF_MOVE_NOTIFY
Felix Kuehling
felix.kuehling at amd.com
Thu Apr 17 16:24:20 UTC 2025
On 2025-04-17 8:57, Christian König wrote:
> Am 17.04.25 um 15:33 schrieb Felix Kuehling:
>> Pinning of VRAM is for peer devices that don't support dynamic attachment
>> and move notifiers. But it requires that all such peer devices are able to
>> access VRAM via PCIe P2P. Any device without P2P access requires migration
>> to GTT, which fails if the memory is already pinned for another peer
>> device.
>>
>> Sharing between GPUs should not require pinning in VRAM. However, if
>> DMABUF_MOVE_NOTIFY is disabled in the kernel build, even DMABufs shared
>> between GPUs must be pinned, which can lead to failures and functional
>> regressions on systems where some peer GPUs are not P2P accessible.
>>
>> Disable VRAM pinning if move notifiers are disabled in the kernel build
>> to fix regressions when sharing BOs between GPUs.
>>
>> v2: ensure domains is not 0, add GTT if necessary
>>
>> Signed-off-by: Felix Kuehling <felix.kuehling at amd.com>
>> Tested-by: Hao (Claire) Zhou <hao.zhou at amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 19 ++++++++++++++-----
>> 1 file changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> index 667080cc9ae1..3c2c32a252d4 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> @@ -81,17 +81,26 @@ static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach)
>>
>> dma_resv_assert_held(dmabuf->resv);
>>
>> - /*
>> - * Try pinning into VRAM to allow P2P with RDMA NICs without ODP
>> + /* Try pinning into VRAM to allow P2P with RDMA NICs without ODP
>> * support if all attachments can do P2P. If any attachment can't do
>> * P2P just pin into GTT instead.
>> + *
>> + * To avoid with conflicting pinnings between GPUs and RDMA when move
>> + * notifiers are disabled, only allow pinning in VRAM when move
>> + * notiers are enabled.
>> */
>> - list_for_each_entry(attach, &dmabuf->attachments, node)
>> - if (!attach->peer2peer)
>> - domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
>> + if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) {
>> + domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
>> + } else {
>> + list_for_each_entry(attach, &dmabuf->attachments, node)
>> + if (!attach->peer2peer)
>> + domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
>> + }
>>
>> if (domains & AMDGPU_GEM_DOMAIN_VRAM)
>> bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>> + else if (!domains)
>> + domains = AMDGPU_GEM_DOMAIN_GTT;
> Please drop that.
>
> We should probably use allowed_domains instead of preferred_domains and return an error if none of them work.
I sent out an updated patch with that. One concern, though: With discardable VRAM BOs, allowed_domains may not include GTT. I'm not sure if such BOs would ever be exported.
Regards,
Felix
>
> Regards,
> Christian.
>
>>
>> return amdgpu_bo_pin(bo, domains);
>> }
More information about the amd-gfx
mailing list