[PATCH] drm/ttm: partial revert "cleanup ttm_tt_(unbind|destroy)" v2

Felix Kuehling felix.kuehling at amd.com
Fri Aug 5 15:06:45 UTC 2016


For the record, Michel's patch "drm/ttm: Wait for a BO to become idle
before unbinding it from GTT" fixes our KFD problem as well.

Thanks,
  Felix

On 16-07-27 05:27 PM, Felix Kuehling wrote:
> We're also looking into a hang with a KFD unit test that allocates lots
> of memory and fragments it deliberately, without mapping it all at once.
> It's a new problem for us as we're rebasing on amd-staging-4.6.
> Something weird seems to be happening with evictions, but I haven't been
> able to figure it out.
>
> I was able to see that SDMA page table updates stop working at some
> point, though SDMA fences are still signaling. If I let the test run
> longer, SDMA and CP hang. I dumped the SDMA IBs and didn't see anything
> suspicious. My guess was that maybe the SDMA IBs or the ring are getting
> corrupted, or maybe the GART table entries for the IBs or ring are
> corrupted. But I haven't been able to prove that or track it down to a
> root cause. We're now trying to reimplement the test using libdrm-amdgpu
> APIs so we can bisect on the amd-staging-4.6 branch without KFD.
>
> Regards,
>   Felix
>
> On 16-07-26 10:26 PM, Michel Dänzer wrote:
>> On 22.07.2016 22:10, Christian König wrote:
>>> From: Christian König <christian.koenig at amd.com>
>>>
>>> We still need to unbind explicitly during a move.
>> This change fixed a hang for me when running the piglit test
>> max-texture-size with the radeon driver on Kaveri.
>>
>> However, there's still a similar hang left when letting the piglit test
>> tex3d-maxsize run concurrently with other tests (running tex3d-maxsize
>> alone doesn't hang, but fails due to running out of GPU memory; that's a
>> recent radeonsi regression). There are
>>
>>  [TTM] Buffer eviction failed
>>
>> messages in dmesg shortly before the hang.
>>
>> I haven't seen such hangs with older kernels. Any ideas offhand what the
>> problem could be? If not, I'll try bisecting.
>>
>>


