[PATCH] drm/amdgpu: Simplify amdgpu_lockup_timeout usage.

Christian König christian.koenig at amd.com
Tue Dec 19 17:47:41 UTC 2017


Yeah, that is a known issue which came to the fore again because Andrey's
patch is slightly buggy.

Please test and review the attached (only compile tested) fix for 
Andrey's patch.

I'm still working on finding the root cause, but so far I haven't had
time for that.

Regards,
Christian.

On 19.12.2017 at 18:26, Michel Dänzer wrote:
> On 2017-12-13 08:44 PM, Andrey Grodzovsky wrote:
>> With introduction of amdgpu_gpu_recovery we don't need any more
>> to rely on amdgpu_lockup_timeout == 0 for disabling GPU reset.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
> Since this change landed, I'm once again unable to finish a piglit run
> on my development machine, see the attached dmesg output (happens pretty
> quickly, after ~5% of piglit tests have run). I realized that with
> lockup_timeout != 0, the
>
> 	WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
>
> at the top of amdgpu_bo_gpu_offset has been triggering since the 4.15
> development cycle. See the bisection result below. Note that I'm not
> 100% sure this is the correct guilty commit, since it's probably been
> the most painful bisection I've ever done so far (14 skips, had to
> revert 4 commits causing other regressions). But I'm quite sure this
> regression happened in the
> 84d43463a2d09c28c9222fbb7d1082c078e2523a..3f3333f8a0e90ac26f84ed7b0aa344efce695c08
> range.
>
>
> 3f3333f8a0e90ac26f84ed7b0aa344efce695c08 is the first bad commit
> commit 3f3333f8a0e90ac26f84ed7b0aa344efce695c08
> Author: Christian König <christian.koenig at amd.com>
> Date:   Thu Aug 3 14:02:13 2017 +0200
>
>      drm/amdgpu: track evicted page tables v2
>
>      Instead of validating all page tables when one was evicted,
>      track which one needs a validation.
>
>      v2: simplify amdgpu_vm_ready as well
>
>      Signed-off-by: Christian König <christian.koenig at amd.com>
>      Reviewed-by: Alex Deucher <alexander.deucher at amd.com> (v1)
>      Reviewed-by: Chunming Zhou <david1.zhou at amd.com>
>
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-amdgpu-fix-test-for-shadow-page-tables.patch
Type: text/x-patch
Size: 1158 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20171219/1c53011d/attachment.bin>

