[PATCH V2 00/10] Reset improvements for GC10+

Christian König christian.koenig at amd.com
Fri May 23 13:27:15 UTC 2025


On 5/23/25 05:04, Alex Deucher wrote:
> On Thu, May 22, 2025 at 5:57 PM Alex Deucher <alexander.deucher at amd.com> wrote:
>>
>> This set improves per queue reset support for GC10+.
>> This uses vmid resets for GFX.  GFX resets all state
>> associated with a vmid and then continues where it
>> left off.  Since only the IB uses the vmid, only
>> the IB is reset and execution continues after the IB.
>> Tested on GC 10 and 11 chips with a game running and
>> then running hang tests.  The game pauses when the
>> hang happens, then continues after the queue reset.
> 
> After further investigation, this appears to work as expected, but
> only by chance.  The ring is reset, but any pipelined content in the
> ring after the job is lost.  We either need to limit the ring to one
> job or patch in the subsequent packets after resetting.

Yeah, I feared that this wouldn't work.

Any idea why the VMID based reset isn't working?

On the other hand we could just restart from the ring RPTR again.
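
To make the difference concrete, here is a rough toy model in plain C, not
driver code. Everything in it (toy_ring, TOY_NOP, toy_reset_discard,
toy_reset_restart) is a hypothetical name made up for illustration, and
overwriting the hung job with NOPs before restarting is an assumption about
what "restart from the RPTR" could look like, not something taken from the
series. It only shows that rewinding the ring drops the pipelined jobs, while
restarting at the hung job's RPTR with WPTR untouched keeps them.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_SIZE 16u          /* power of two, hypothetical size */
#define TOY_RING_MASK (TOY_RING_SIZE - 1u)
#define TOY_NOP       0xffffffffu  /* hypothetical "do nothing" packet */

struct toy_ring {
	uint32_t buf[TOY_RING_SIZE];
	uint32_t rptr; /* next entry the engine will consume */
	uint32_t wptr; /* next free entry for the driver */
};

/* Driver side: append one packet word to the ring. */
static void toy_ring_write(struct toy_ring *r, uint32_t v)
{
	assert(r->wptr - r->rptr < TOY_RING_SIZE); /* ring must not overflow */
	r->buf[r->wptr++ & TOY_RING_MASK] = v;
}

/* Engine side: consume one word, returns 0 when the ring is empty. */
static int toy_ring_read(struct toy_ring *r, uint32_t *out)
{
	if (r->rptr == r->wptr)
		return 0;
	*out = r->buf[r->rptr++ & TOY_RING_MASK];
	return 1;
}

/* Reset variant A: rewind both pointers. Pipelined content is lost. */
static void toy_reset_discard(struct toy_ring *r)
{
	r->rptr = 0;
	r->wptr = 0;
}

/*
 * Reset variant B (assumption): NOP out the hung job's words in
 * [hung_start, hung_end), then restart the engine at hung_start and
 * leave wptr alone so the jobs queued behind it still execute.
 */
static void toy_reset_restart(struct toy_ring *r, uint32_t hung_start,
			      uint32_t hung_end)
{
	for (uint32_t i = hung_start; i != hung_end; i++)
		r->buf[i & TOY_RING_MASK] = TOY_NOP;
	r->rptr = hung_start;
}

/* Run the engine until the ring is empty, count non-NOP words executed. */
static unsigned int toy_drain(struct toy_ring *r)
{
	unsigned int executed = 0;
	uint32_t v;

	while (toy_ring_read(r, &v))
		if (v != TOY_NOP)
			executed++;
	return executed;
}
```

With three one-word jobs queued and the first one hanging at index 0,
toy_reset_discard() followed by toy_drain() executes nothing, while
toy_reset_restart(&ring, 0, 1) followed by toy_drain() still executes the
two jobs that were pipelined behind the hung one.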

Regards,
Christian.

> 
> Alex
> 
>>
>> I tried this same approach on GC8 and 9, but it
>> was not as reliable as soft recovery.  I also compared
>> this to Christian's reset patches, but I was not
>> able to make them work as reliably as this series.
>>
>> Alex Deucher (9):
>>   Revert "drm/amd/amdgpu: add pipe1 hardware support"
>>   drm/amdgpu: add AMDGPU_QUEUE_RESET_TIMEOUT
>>   drm/amdgpu: set the exec flag on the IB fence
>>   drm/amdgpu/gfx11: adjust ring reset sequences
>>   drm/amdgpu/gfx11: drop soft recovery
>>   drm/amdgpu/gfx12: adjust ring reset sequences
>>   drm/amdgpu/gfx12: drop soft recovery
>>   drm/amdgpu/gfx10: adjust ring reset sequences
>>   drm/amdgpu/gfx10: drop soft recovery
>>
>> Christian König (1):
>>   drm/amdgpu: rework queue reset scheduler interaction
>>
>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  1 +
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 26 ++++++++--------
>>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 41 ++++++++-----------------
>>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  | 35 ++++++---------------
>>  drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c  | 35 ++++++---------------
>>  drivers/gpu/drm/amd/amdgpu/nvd.h        |  1 +
>>  7 files changed, 50 insertions(+), 92 deletions(-)
>>
>> --
>> 2.49.0
>>


