Raciness with page table shadows being swapped out

Nicolai Hähnle nhaehnle at gmail.com
Wed Dec 14 16:14:40 UTC 2016


On 14.12.2016 15:56, Christian König wrote:
> Am 14.12.2016 um 15:22 schrieb Nicolai Hähnle:
>> On 13.12.2016 10:48, Christian König wrote:
>>>>>> The attached patch has fixed these crashes for me so far, but it's
>>>>>> very heavy-handed: it collects all page table shadows and the page
>>>>>> directory shadow and adds them all to the reservations for the
>>>>>> callers
>>>>>> of amdgpu_vm_update_page_directory.
>>>>>
>>>>> That is most likely just a timing change, because the shadows should
>>>>> end up in the duplicates list anyway. So the patch shouldn't have any
>>>>> effect.
>>>>
>>>> Okay, so the reason for the remaining crash is still unclear at least
>>>> for me.
>>>
>>> Yeah, that's a really good question. Can you share the call stack of the
>>> problem once more?
>>
>> Pretty sure I found the root cause now. amdgpu_vm_validate_pt_bos
>> relies on the eviction counter to be able to skip the validation of
>> the page tables.
>>
>> However, moving the shadow page tables out from mem_type TT to SYSTEM
>> doesn't count as an eviction (it just unbinds the mapping in the GTT).
>>
>> Clearly, that's a problem.
>
> Nice catch!
>
>> The quick fix is to skip the num_evictions check in
>> amdgpu_vm_validate_pt_bos. That has worked for me so far.
>>
>> The next best thing is to add an unbind counter in addition to the
>> eviction counter that gets incremented whenever a BO is unbound (so it
>> counts a superset of what the eviction counter counts), and then check
>> that instead of the eviction counter.
>
> Well, that's too complicated; we should just make the eviction counter
> handle both events.
>
> That's also the original meaning of it, i.e. unbinding pages from the
> GART is some sort of eviction as well in this case.

I'll do that then. The counter does get exposed to user-space, where it
is used to implement the GL_NVX_gpu_memory_info extension, so this is a
change of behavior there. But that extension is pretty under-specified
anyway, and it does make sense to count getting kicked out of GART as an
eviction.
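
To make the failure mode concrete, here is a rough toy model of the skip
logic. It is only a sketch and not the driver code: every name in it
(toy_dev, toy_bo, toy_vm, toy_move, toy_validate, count_unbind) is made up
for illustration, and it assumes a single shadow BO and a single
device-wide counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in amdgpu. */
enum toy_placement { TOY_VRAM, TOY_TT, TOY_SYSTEM };

struct toy_dev {
	uint64_t num_evictions; /* device-wide eviction counter */
	bool count_unbind;      /* false: TT -> SYSTEM is not counted,
				 * true: it is counted as an eviction */
};

struct toy_bo {
	enum toy_placement place;
};

struct toy_vm {
	uint64_t seen_evictions; /* counter value at the last validation */
	struct toy_bo *shadow;   /* one page table shadow */
};

/* Move a BO and bump the counter according to the chosen policy. */
static void toy_move(struct toy_dev *dev, struct toy_bo *bo,
		     enum toy_placement to)
{
	bool evicted = bo->place == TOY_VRAM && to != TOY_VRAM;
	bool unbound = bo->place == TOY_TT && to == TOY_SYSTEM;

	if (evicted || (dev->count_unbind && unbound))
		dev->num_evictions++;
	bo->place = to;
}

/* Returns true if the shadow was revalidated, false if it was skipped. */
static bool toy_validate(struct toy_dev *dev, struct toy_vm *vm)
{
	assert(vm->shadow);

	/* Fast path: nothing was evicted since the last validation. */
	if (vm->seen_evictions == dev->num_evictions)
		return false;

	vm->shadow->place = TOY_TT; /* rebind the shadow */
	vm->seen_evictions = dev->num_evictions;
	return true;
}

/* Unbind the shadow, validate, and report whether it is bound again. */
static bool toy_shadow_bound_after_unbind(bool count_unbind)
{
	struct toy_dev dev = { .num_evictions = 0,
			       .count_unbind = count_unbind };
	struct toy_bo shadow = { .place = TOY_TT };
	struct toy_vm vm = { .seen_evictions = 0, .shadow = &shadow };

	toy_move(&dev, &shadow, TOY_SYSTEM);
	toy_validate(&dev, &vm);
	return shadow.place == TOY_TT;
}
```

With count_unbind set to false, the validation takes the fast path and the
shadow stays in SYSTEM, which is the stale state described above. With it
set to true, the counter changes on the unbind and the shadow gets rebound.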

Cheers,
Nicolai


>
> Regards,
> Christian.
>
>>
>> Cheers,
>> Nicolai
>
>

