[PATCH 4/4] drm/amdgpu: use separate status for buffer funcs availability
Christian König
ckoenig.leichtzumerken at gmail.com
Thu Mar 1 11:53:31 UTC 2018
Am 01.03.2018 um 11:55 schrieb Liu, Monk:
> It looks like aborting in such a case is the safe way; otherwise the fence_wait() outside will still fail
Good point, I've just sent a v2 of that patch which does exactly that.
Please review,
Christian.
>
> -----Original Message-----
> From: Christian König [mailto:ckoenig.leichtzumerken at gmail.com]
> Sent: March 1, 2018 18:50
> To: Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH 4/4] drm/amdgpu: use separate status for buffer funcs availability
>
> Am 01.03.2018 um 11:33 schrieb Liu, Monk:
>>>   int amdgpu_mmap(struct file *filp, struct vm_area_struct *vma)
>>> @@ -1684,6 +1680,7 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset,
>> amdgpu_ring_pad_ib(ring, &job->ibs[0]);
>> WARN_ON(job->ibs[0].length_dw > num_dw);
>> if (direct_submit) {
>> + WARN_ON(!ring->ready);
>> r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs,
>> NULL, fence);
>> job->fence = dma_fence_get(*fence);
>>
>> [ML] In the direct_submit case, if ring->ready is false, why do we continue and only emit a warning? Shouldn't we just abort, or fall back to the scheduler path?
> When we use direct submission the scheduler is turned off. So we could return an error, but using the scheduler probably results in a deadlock.
>
> Christian.
>
>>
>> /Monk
>>
>>
>> -----Original Message-----
>> From: Christian König [mailto:ckoenig.leichtzumerken at gmail.com]
>> Sent: March 1, 2018 18:23
>> To: amd-gfx at lists.freedesktop.org
>> Cc: Liu, Monk <Monk.Liu at amd.com>
>> Subject: [PATCH 4/4] drm/amdgpu: use separate status for buffer funcs
>> availability
>>
>> The ring status can change during GPU reset, but we still need to be able to schedule TTM buffer moves in the meantime.
>>
>> Otherwise we can run into problems because of aborted move/fill operations during GPU resets.
>>
>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 17 +++++++----------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  1 +
>> 2 files changed, 8 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index 2aa6823ef503..53b34b3b8232 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -213,9 +213,7 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
>> abo = ttm_to_amdgpu_bo(bo);
>> switch (bo->mem.mem_type) {
>> case TTM_PL_VRAM:
>> - if (adev->mman.buffer_funcs &&
>> - adev->mman.buffer_funcs_ring &&
>> - adev->mman.buffer_funcs_ring->ready == false) {
>> + if (!adev->mman.buffer_funcs_enabled) {
>> amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_CPU);
>> } else if (adev->gmc.visible_vram_size < adev->gmc.real_vram_size &&
>>   		!(abo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) {
>> @@ -331,7 +329,7 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
>> const uint64_t GTT_MAX_BYTES = (AMDGPU_GTT_MAX_TRANSFER_SIZE *
>> AMDGPU_GPU_PAGE_SIZE);
>>
>> - if (!ring->ready) {
>> + if (!adev->mman.buffer_funcs_enabled) {
>> DRM_ERROR("Trying to move memory with ring turned off.\n");
>> return -EINVAL;
>> }
>> @@ -577,12 +575,9 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict,
>> amdgpu_move_null(bo, new_mem);
>> return 0;
>> }
>> - if (adev->mman.buffer_funcs == NULL ||
>> - adev->mman.buffer_funcs_ring == NULL ||
>> - !adev->mman.buffer_funcs_ring->ready) {
>> - /* use memcpy */
>> +
>> + if (!adev->mman.buffer_funcs_enabled)
>> goto memcpy;
>> - }
>>
>> if (old_mem->mem_type == TTM_PL_VRAM &&
>>   	    new_mem->mem_type == TTM_PL_SYSTEM) {
>> @@ -1549,6 +1544,7 @@ void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, bool enable)
>> else
>> size = adev->gmc.visible_vram_size;
>> man->size = size >> PAGE_SHIFT;
>> + adev->mman.buffer_funcs_enabled = enable;
>> }
>>
>>   int amdgpu_mmap(struct file *filp, struct vm_area_struct *vma)
>> @@ -1684,6 +1680,7 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset,
>> amdgpu_ring_pad_ib(ring, &job->ibs[0]);
>> WARN_ON(job->ibs[0].length_dw > num_dw);
>> if (direct_submit) {
>> + WARN_ON(!ring->ready);
>> r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs,
>> NULL, fence);
>>   		job->fence = dma_fence_get(*fence);
>> @@ -1720,7 +1717,7 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
>> struct amdgpu_job *job;
>> int r;
>>
>> - if (!ring->ready) {
>> + if (!adev->mman.buffer_funcs_enabled) {
>> DRM_ERROR("Trying to clear memory with ring turned off.\n");
>> return -EINVAL;
>> }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
>> index b8117c6e51f1..6ea7de863041 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
>> @@ -53,6 +53,7 @@ struct amdgpu_mman {
>> /* buffer handling */
>> const struct amdgpu_buffer_funcs *buffer_funcs;
>> struct amdgpu_ring *buffer_funcs_ring;
>> + bool buffer_funcs_enabled;
>>
>> struct mutex gtt_window_lock;
>> /* Scheduler entity for buffer moves */
>> --
>> 2.14.1
>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx