Support for amdgpu VM update via CPU on large-bar systems
Harish.Kasiviswanathan at amd.com
Wed May 10 18:50:42 UTC 2017
Thanks Christian for the review. One clarification required. Please see my comments and question inline.
From: Christian König [mailto:deathsimple at vodafone.de]
Sent: Wednesday, May 10, 2017 3:24 AM
To: Kasiviswanathan, Harish <Harish.Kasiviswanathan at amd.com>; amd-gfx at lists.freedesktop.org
Subject: Re: Support for amdgpu VM update via CPU on large-bar systems
for the next time please use git send-email to send patches to the mailing list.
Looks pretty good in general, but a few notes all over the place.
+static bool amdgpu_vm_is_large_bar(struct amdgpu_device *adev)
+ if (adev->mc.visible_vram_size > 0 &&
+ adev->mc.real_vram_size == adev->mc.visible_vram_size)
+ return true;
+ return false;
The visible_vram_size > 0 check looks superfluous. The coding style looks off, the second line of the "if" is to far right.
And in general that should rather look like "return adev->mc.real_vram_size == adev->mc.visible_vram_size;".
+ /* CPU update is only supported for large BAR system */
+ vm->is_vm_update_mode_cpu = (vm_update_mode &&
Mhm, it would be nice if we could enable this for testing even on systems with small BAR.
[HK] : I will add bit in the module parameter that can override large-bar requirement.
I would rather suggest to make the default value depend on if the BOX has a large or small BAR and let the user override the setting.
Additional to that we should limit this to 64bit systems.
+#define AMDGPU_VM_USE_CPU_UPDATE_FOR_GRAPHICS_MASK (1 << 0)
+#define AMDGPU_VM_USE_CPU_UPDATE_FOR_COMPUTE_MASK (1 << 1)
Not much of an issue, but the names are a bit long. Maybe use something like AMDGPU_VM_USE_CPU_FOR_GFX and AMDGPU_VM_USE_CPU_FOR_COMPUTE.
+ bool is_vm_update_mode_cpu : 1;
Please no bitmasks when we don't need them.
+ u64 flags;
unsigned shift = (adev->vm_manager.num_level - level) *
Reverse tree order.
+ flags = (vm->is_vm_update_mode_cpu) ?
+ AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED :
+ AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
- AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
- AMDGPU_GEM_CREATE_SHADOW |
+ flags |
I would rather write it like this:
flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | AMDGPU_GEM_CREATE_VRAM_CLEARED;
flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS | AMDGPU_GEM_CREATE_SHADOW;
+ amdgpu_gart_flush_gpu_tlb(params->adev, 0);
You do this for the HDP flush, don't you?
+ dev_warn(adev->dev, "Page table update using CPU failed. Fallback to SDMA\n");
NACK, if kmapping the BO fails we most likely are either out of memory or out of address space.
Falling back to the SDMA doesn't make the situation better, just return a proper error code here.
+ /* Wait for BO to be free. SDMA could be clearing it */
+ amdgpu_sync_resv(adev, &sync, parent->bo->tbo.resv,
Superfluous and a bit incorrect, amdgpu_bo_kmap already calls reservation_object_wait_timeout_rcu() but only for the exclusive fence.
You probably ran into problems because of pipelined evictions, so that should be fixed in amdgpu_bo_kmap() instead.
[HK]: From Compute perspective, all validations are synchronous. PT/PD BOs are validated when restored after evictions. However, the clearing of PT/PD BOs during creation are still done by GPU which mandates the waiting.
[HK]: I am not clear on how to modify amdgpu_bo_kmap(). The function currently waits for the exclusive fence and now optionally it has to wait for all the shared fences. So are you suggesting to modify kmap function to amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr, void *fence_owner). If fence_onwer != NULL it would wait for all the shared fences. Please clarify.
- pt_addr, count,
- amdgpu_vm_do_set_ptes(¶ms, last_pde,
- pt_addr, count, incr,
+ pt_addr, count,
+ params.func(¶ms, last_pde,
+ pt_addr, count, incr,
Might be a good idea to separate that into a cleanup patch.
Patch #3: Looks good to me and is Reviewed-by: Christian König <christian.koenig at amd.com>.
Please move that one to the beginning of the series and/or commit it directly to amd-staging-4.9.
+ /* The next three are used during VM update by CPU */
+ bool update_by_cpu;
We can distinguish that as well by looking at the function pointer, don't we?
+ "CPU update of VM failed. Fallback to SDMA\n");
+ /* Reset params for SDMA fallback path */
+ params.update_by_cpu = false;
+ params.pages_addr = NULL;
+ params.kptr = NULL;
+ params.func = NULL;
Again completely drop the SDMA fallback path.
Am 09.05.2017 um 20:34 schrieb Kasiviswanathan, Harish:
Please review the patch set that supports amdgpu VM update via CPU. This feature provides improved performance for compute (HSA) where mapping / unmapping is carried out (by Kernel) independent of command submissions (done directly by user space). This version doesn't support shadow copy of VM page tables for CPU based update.
amd-gfx mailing list
amd-gfx at lists.freedesktop.org
More information about the amd-gfx