[PATCH 00/13] shadow page table support
zhoucm1
david1.zhou at amd.com
Tue Jul 26 02:40:20 UTC 2016
On 2016年07月25日 18:31, Christian König wrote:
> First of all patches #10 and #11 look like bug fixes to existing code
> to me. So we should fix those problems before working on anything else.
>
> Patch #10 is Reviewed-by: Christian König <christian.koenig at amd.com>
>
> Patch #11:
>
>> list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
>> struct amd_sched_fence *s_fence = s_job->s_fence;
>> - struct fence *fence = sched->ops->run_job(s_job);
>> + struct fence *fence;
>>
>> + spin_unlock(&sched->job_list_lock);
>> + fence = sched->ops->run_job(s_job);
>> atomic_inc(&sched->hw_rq_count);
>> if (fence) {
>> s_fence->parent = fence_get(fence);
>> @@ -451,6 +453,7 @@ void amd_sched_job_recovery(struct
>> amd_gpu_scheduler *sched)
>> DRM_ERROR("Failed to run job!\n");
>> amd_sched_process_job(NULL, &s_fence->cb);
>> }
>> + spin_lock(&sched->job_list_lock);
>> }
>> spin_unlock(&sched->job_list_lock);
> The problem is that the job might complete while we dropped the lock.
>
> Please use list_for_each_entry_safe here and add a comment why the
> list could be modified in the meantime.
>
> With that fixed the patch is Reviewed-by: Christian König
> <christian.koenig at amd.com> as well.
OK, pushed above two.
>
> The remaining set looks very good to me as well, but I was rather
> thinking of a more general approach instead of making it VM PD/PT
> specific.
>
> For example we also need to backup/restore shaders when a hard GPU
> reset happens.
>
> So I would suggest the following:
> 1. We add an optional "shadow" flag so that when a BO in VRAM is
> allocated we also allocate a shadow BO in GART.
>
> 2. We have another "backup" flag that says on the next command
> submission the BO is backed up from VRAM to GART before that submission.
>
> 3. We set the shadow flag for VM PD/PT BOs and every time we modify
> them set the backup flag so they get backed up on next CS.
>
> 4. We add an IOCTL to allow setting the backup flag from userspace so
> that we can trigger another backup even after the first CS.
>
> What do you think?
Sounds very good, will try.
Thanks,
David Zhou
>
> Regards,
> Christian.
>
> Am 25.07.2016 um 09:22 schrieb Chunming Zhou:
>> Since we cannot make sure VRAM is safe after gpu reset, page table
>> backup
>> is neccessary, shadow page table is sense way to recovery page talbe
>> when
>> gpu reset happens.
>> We need to allocate GTT bo as the shadow of VRAM bo when creating
>> page table,
>> and make them same. After gpu reset, we will need to use SDMA to copy
>> GTT bo
>> content to VRAM bo, then page table will be recoveried.
>>
>> Chunming Zhou (13):
>> drm/amdgpu: add pd/pt bo shadow
>> drm/amdgpu: update shadow pt bo while update pt
>> drm/amdgpu: update pd shadow while updating pd
>> drm/amdgpu: implement amdgpu_vm_recover_page_table_from_shadow
>> drm/amdgpu: link all vm clients
>> drm/amdgpu: add vm_list_lock
>> drm/amd: add block entity function
>> drm/amdgpu: recover page tables after gpu reset
>> drm/amdgpu: add vm recover pt fence
>> drm/amd: reset hw count when reset job
>> drm/amd: fix deadlock of job_list_lock
>> drm/amd: wait neccessary dependency before running job
>> drm/amdgpu: fix sched deadoff
>>
>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 17 ++-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 12 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 30 ++++-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 5 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 +
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 161
>> ++++++++++++++++++++++++--
>> drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 35 +++++-
>> drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 3 +
>> 8 files changed, 250 insertions(+), 18 deletions(-)
>>
>
More information about the amd-gfx
mailing list