Support for amdgpu VM update via CPU on large-bar systems
deathsimple at vodafone.de
Wed May 10 07:23:59 UTC 2017
for the next time please use git send-email to send patches to the
Looks pretty good in general, but a few notes all over the place.
> +static bool amdgpu_vm_is_large_bar(struct amdgpu_device *adev)
> + if (adev->mc.visible_vram_size > 0 &&
> + adev->mc.real_vram_size == adev->mc.visible_vram_size)
> + return true;
> + return false;
The visible_vram_size > 0 check looks superfluous. The coding style
looks off, the second line of the "if" is to far right.
And in general that should rather look like "return
adev->mc.real_vram_size == adev->mc.visible_vram_size;".
> + /* CPU update is only supported for large BAR system */
> + vm->is_vm_update_mode_cpu = (vm_update_mode &&
> + amdgpu_vm_is_large_bar(adev));
Mhm, it would be nice if we could enable this for testing even on
systems with small BAR.
I would rather suggest to make the default value depend on if the BOX
has a large or small BAR and let the user override the setting.
Additional to that we should limit this to 64bit systems.
> +#define AMDGPU_VM_USE_CPU_UPDATE_FOR_GRAPHICS_MASK (1 << 0)
> +#define AMDGPU_VM_USE_CPU_UPDATE_FOR_COMPUTE_MASK (1 << 1)
Not much of an issue, but the names are a bit long. Maybe use something
like AMDGPU_VM_USE_CPU_FOR_GFX and AMDGPU_VM_USE_CPU_FOR_COMPUTE.
> + bool is_vm_update_mode_cpu : 1;
Please no bitmasks when we don't need them.
> + u64 flags;
> unsigned shift = (adev->vm_manager.num_level - level) *
Reverse tree order.
> + flags = (vm->is_vm_update_mode_cpu) ?
> + AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED :
> + AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
> + AMDGPU_GEM_CREATE_SHADOW;
> - AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
> - AMDGPU_GEM_CREATE_SHADOW |
> + flags |
> AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
I would rather write it like this:
flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | AMDGPU_GEM_CREATE_VRAM_CLEARED;
flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS | AMDGPU_GEM_CREATE_SHADOW;
> + mb();
> + amdgpu_gart_flush_gpu_tlb(params->adev, 0);
You do this for the HDP flush, don't you?
> + dev_warn(adev->dev, "Page table update using CPU failed.
> Fallback to SDMA\n");
NACK, if kmapping the BO fails we most likely are either out of memory
or out of address space.
Falling back to the SDMA doesn't make the situation better, just return
a proper error code here.
> + /* Wait for BO to be free. SDMA could be clearing it */
> + amdgpu_sync_create(&sync);
> + amdgpu_sync_resv(adev, &sync, parent->bo->tbo.resv,
> + AMDGPU_FENCE_OWNER_VM);
> + amdgpu_sync_wait(&sync);
> + amdgpu_sync_free(&sync);
Superfluous and a bit incorrect, amdgpu_bo_kmap already calls
reservation_object_wait_timeout_rcu() but only for the exclusive fence.
You probably ran into problems because of pipelined evictions, so that
should be fixed in amdgpu_bo_kmap() instead.
> - amdgpu_vm_do_set_ptes(¶ms,
> - last_shadow,
> - pt_addr, count,
> - incr,
> - AMDGPU_PTE_VALID);
> - amdgpu_vm_do_set_ptes(¶ms, last_pde,
> - pt_addr, count, incr,
> - AMDGPU_PTE_VALID);
> + params.func(¶ms,
> + last_shadow,
> + pt_addr, count,
> + incr,
> + AMDGPU_PTE_VALID);
> + params.func(¶ms, last_pde,
> + pt_addr, count, incr,
> + AMDGPU_PTE_VALID);
Might be a good idea to separate that into a cleanup patch.
Patch #3: Looks good to me and is Reviewed-by: Christian König
<christian.koenig at amd.com>.
Please move that one to the beginning of the series and/or commit it
directly to amd-staging-4.9.
> + /* The next three are used during VM update by CPU */
> + bool update_by_cpu;
We can distinguish that as well by looking at the function pointer,
> + dev_warn(adev->dev,
> + "CPU update of VM failed. Fallback to SDMA\n");
> + /* Reset params for SDMA fallback path */
> + params.update_by_cpu = false;
> + params.pages_addr = NULL;
> + params.kptr = NULL;
> + params.func = NULL;
Again completely drop the SDMA fallback path.
Am 09.05.2017 um 20:34 schrieb Kasiviswanathan, Harish:
> Please review the patch set that supports amdgpu VM update via CPU. This feature provides improved performance for compute (HSA) where mapping / unmapping is carried out (by Kernel) independent of command submissions (done directly by user space). This version doesn't support shadow copy of VM page tables for CPU based update.
> Best Regards,
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the amd-gfx