Graceful page fault handling for Vega/Navi
Huang, Ray
Ray.Huang at amd.com
Wed Sep 4 23:03:28 UTC 2019
On Wed, Sep 04, 2019 at 05:02:21PM +0200, Christian König wrote:
> Hi everyone,
>
> this series is the next puzzle piece for recoverable page fault handling on Vega and Navi.
>
> It adds a new direct scheduler entity for VM updates which is then used to update page tables during a fault.
>
> In other words, previously an application doing an invalid memory access would just hang and/or repeat the invalid access over and over again. Now the handling is modified so that the invalid memory access is redirected to the dummy page.
>
> This needs the following prerequisites:
> a) The firmware must be new enough to allow re-routing of page faults.
> b) Fault retry must be enabled using the amdgpu.noretry=0 parameter.
On my side, I found that the "noretry" parameter does not work for VMID 0 VM faults.
If you see the same on your side, I'd like to look into it.
Thanks,
Ray
> c) Enough free VRAM to allocate page tables to point to the dummy page.
>
> The re-routing of page faults currently only works on Vega10, so Vega20 and Navi will still need some more time.
>
> Please review and/or comment,
> Christian.