Graceful page fault handling for Vega/Navi

Christian König ckoenig.leichtzumerken at gmail.com
Mon Sep 9 12:09:42 UTC 2019


On 05.09.19 at 00:52, Kuehling, Felix wrote:
> On 2019-09-04 11:02 a.m., Christian König wrote:
>> Hi everyone,
>>
>> this series is the next puzzle piece for recoverable page fault handling on Vega and Navi.
>>
>> It adds a new direct scheduler entity for VM updates which is then used to update page tables during a fault.
>>
>> In other words, previously an application doing an invalid memory access would just hang and/or repeat the invalid access over and over again. Now the handling is modified so that the invalid memory access is redirected to the dummy page.
>>
>> This needs the following prerequisites:
>> a) The firmware must be new enough to allow re-routing of page faults.
>> b) Fault retry must be enabled using the amdgpu.noretry=0 parameter.
>> c) Enough free VRAM to allocate page tables to point to the dummy page.
>>
>> The re-routing of page faults currently only works on Vega10, so Vega20 and Navi will still need some more time.
> Wait, we don't do the page fault rerouting on Vega20 yet? So we're
> getting the full brunt of the fault storm on the main interrupt ring?

It's implemented, but the Vega20 firmware fails to enable the
re-routing for some reason.

I haven't had time yet to talk to the firmware guys about why that happens.

> In that case, we should probably change the default setting to
> amdgpu.noretry=1 at least until that's done.
>
> Other than that the patch series looks reasonable to me. I commented on
> patches 4 and 9 separately.
>
> Patch 1 is Acked-by: Felix Kuehling <Felix.Kuehling at amd.com>
>
> With the issues addressed that I pointed out, the rest is
>
> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>

Thanks,
Christian.

>
> Regards,
>     Felix
>
>
>> Please review and/or comment,
>> Christian.
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


