[PATCH 00/20] Add KFD GPUVM support for dGPUs v4
Oded Gabbay
oded.gabbay at gmail.com
Wed Mar 21 07:52:00 UTC 2018
On Mon, Mar 19, 2018 at 9:05 PM, Felix Kuehling <felix.kuehling at amd.com> wrote:
> On 2018-03-19 12:39 PM, Christian König wrote:
>> So coming back to this series once more.
>>
>> Patch #1, #3 are Reviewed-by: Christian König <christian.koenig at amd.com>.
>>
>> Patch #2, #4 - #13 and #18-#19 are Acked-by: Christian König
>> <christian.koenig at amd.com>.
>>
>> Patch #14: What's the difference to setting vramlimit=$size_of_bar ?
>
> The debug_largebar option only affects KFD. Graphics can still use all
> memory.
>
>>
>> Patch #15 & #20: Why is that actually still needed? I thought we have
>> fixed all dependencies and can now use the "standard" way of attaching
>> fences to reservation objects to do this.
>
> Patch 15 adds a KFD-specific MMU notifier, because the graphics MMU
> notifier deals with amdgpu_cs command submission. We need a completely
> different treatment of MMU notifiers for KFD. We need to stop user mode
> queues.
>
> Patch 20 implements the user mode queue preemption mechanism, and the
> corresponding restore function that re-validates userptr BOs before
> restarting user mode queues.
>
> I think you're implying that the graphics MMU notifier would wait for
> the eviction fence and trigger a regular eviction in KFD. I haven't
> tried that. The MMU notifier and userptr eviction mechanism was
> implemented in KFD before we had the TTM evictions. We didn't go back
> and revisit it after that. There are a few major differences that make
> me want to keep the two types of evictions separate, though:
>
> * TTM evictions are the result of memory pressure (triggered by
> another process)
> o Most MMU notifiers are triggered by something in the same
> process (fork, mprotect, etc.)
> o Thus the restore delay after MMU notifiers can be much shorter
> * TTM evictions are done asynchronously in a delayed worker
> o MMU notifiers are synchronous, queues need to be stopped before
> returning
> * KFD's TTM eviction/restore code doesn't handle userptrs
> (get_user_pages, etc)
> o MMU notifier restore worker is specialized to handle just userptrs
>
>
>>
>> Saying so I still need to take a closer look at patch #20.
>>
>> Patch #16: Looks good to me in general, but I think it would be safer
>> if we grab a reference to the task structure. Otherwise grabbing pages
>> from a mm_struct sounds a bit scary to me.
>
> You're right. I've never seen it cause problems when testing process
> termination, probably because during process termination KFD cancels the
> delayed worker that calls this function. But I'll fix this and take a
> proper reference.
>
>>
>> Patch #17: I think it would be better to allocate the node when the
>> locks are not held and free it when we find that it isn't used, but no
>> big deal.
>
> OK. I'll change that.
>
> Thanks,
> Felix
>
>>
>> Regards,
>> Christian.
>>
>> Am 15.03.2018 um 22:27 schrieb Felix Kuehling:
>>> Rebased and integrated review feedback from v3:
>>> * Removed vm->vm_context field
>>> * Use uninterruptible waiting in initial PD validation to avoid
>>> ERESTARTSYS
>>> * Return number of successful map/unmap operations in failure cases
>>> * Facilitate partial retry after failed map/unmap
>>> * Added comments with parameter descriptions to new APIs
>>> * Defined AMDKFD_IOC_FREE_MEMORY_OF_GPU write-only
>>>
>>> This patch series also adds Userptr support in patches 15-20.
>>>
>>> Felix Kuehling (19):
>>> drm/amdgpu: Move KFD-specific fields into struct amdgpu_vm
>>> drm/amdgpu: Fix initial validation of PD BO for KFD VMs
>>> drm/amdgpu: Add helper to turn an existing VM into a compute VM
>>> drm/amdgpu: Add kfd2kgd interface to acquire an existing VM
>>> drm/amdkfd: Create KFD VMs on demand
>>> drm/amdkfd: Remove limit on number of GPUs
>>> drm/amdkfd: Aperture setup for dGPUs
>>> drm/amdkfd: Add per-process IDR for buffer handles
>>> drm/amdkfd: Allocate CWSR trap handler memory for dGPUs
>>> drm/amdkfd: Add TC flush on VMID deallocation for Hawaii
>>> drm/amdkfd: Add ioctls for GPUVM memory management
>>> drm/amdkfd: Kmap event page for dGPUs
>>> drm/amdkfd: Add module option for testing large-BAR functionality
>>> drm/amdgpu: Add MMU notifier type for KFD userptr
>>> drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads
>>> drm/amdgpu: GFP_NOIO while holding locks taken in MMU notifier
>>> drm/amdkfd: GFP_NOIO while holding locks taken in MMU notifier
>>> drm/amdkfd: Add quiesce_mm and resume_mm to kgd2kfd_calls
>>> drm/amdgpu: Add userptr support for KFD
>>>
>>> Oak Zeng (1):
>>> drm/amdkfd: Populate DRM render device minor
>>>
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 37 +-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 1 +
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 818
>>> ++++++++++++++++++---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 96 ++-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h | 11 +-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 30 +-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 70 ++
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 10 +
>>> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 532
>>> ++++++++++++++
>>> drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 3 +
>>> drivers/gpu/drm/amd/amdkfd/kfd_device.c | 40 +-
>>> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 22 +-
>>> drivers/gpu/drm/amd/amdkfd/kfd_events.c | 31 +-
>>> drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c | 59 +-
>>> drivers/gpu/drm/amd/amdkfd/kfd_module.c | 7 +
>>> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 2 +-
>>> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 2 +-
>>> drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 37 +
>>> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 41 ++
>>> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 314 +++++++-
>>> drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4 +
>>> drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 +
>>> drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 10 +
>>> include/uapi/linux/kfd_ioctl.h | 122 ++-
>>> 26 files changed, 2090 insertions(+), 213 deletions(-)
>>>
>>
>
Hi Felix,
I did a quick pass on the patch-set and didn't see anything scary.
Patches 1-14 are already applied to my -next tree. If I will send it
now to Dave I believe we would be ok from schedule POV.
I suggest we delay userptr support for the next kernel release because
you need to address what Christian said and I also want to take a
closer look at it.
What do you think ?
Thanks,
Oded
More information about the amd-gfx
mailing list