[PATCH 2/9] drm/xe: Convert xe_gem_fault to use direct xe_pm_runtime calls

Rodrigo Vivi rodrigo.vivi at intel.com
Tue Mar 5 22:29:42 UTC 2024


On Tue, Mar 05, 2024 at 11:29:22AM +0000, Matthew Auld wrote:
> On 04/03/2024 18:21, Rodrigo Vivi wrote:
> > The gem page fault is one of the outer bound protections where
> > we want to ensure that the hardware is in D0 before proceeding
> > with memory access. Let's convert it towards the xe_pm_runtime
> > functions directly so we can then convert the mem_access to be
> > inner protection only and then Kill it for good.
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> 
> Not strictly related to this, but FYI there is:
> 
> https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1100
> https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1300
> 
> Which is on the GPU fault path. Looks like the VM is maybe nuked before the
> worker can process the fault from the GuC? In that case there is no RPM ref.

not related indeed, but a real issue. Part of that is the guc_ct
refactor that we still need to do on the g2h processing.

one idea crossing my mind is to perhaps queue the interrupt job in the
same ordered queue, right after the get_resume one... with this we could
handle the g2h, and perhaps even enable more irq cases and allow display
hotplug, for instance (rough sketch below). thoughts?
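
roughly what I mean, as a sketch only -- the names here (xe->ordered_wq,
rpm_resume_work, g2h_work) are all hypothetical, nothing like this exists
in the driver yet. An ordered workqueue runs at most one work at a time,
strictly in queueing order, so queueing the resume work first would
guarantee the g2h handler executes with the device already in D0:

/*
 * Sketch only: xe->ordered_wq (from alloc_ordered_workqueue()),
 * rpm_resume_work and g2h_work are hypothetical names, and the
 * INIT_WORK() setup is omitted.
 */
static void rpm_resume_work_fn(struct work_struct *w)
{
	struct xe_device *xe = container_of(w, struct xe_device,
					    rpm_resume_work);

	/* take the RPM ref; later works on this wq see the device awake */
	xe_pm_runtime_get(xe);
}

static void g2h_work_fn(struct work_struct *w)
{
	struct xe_device *xe = container_of(w, struct xe_device, g2h_work);

	/* process the G2H messages here, hardware guaranteed in D0 */

	/* drop the ref taken by rpm_resume_work_fn() */
	xe_pm_runtime_put(xe);
}

static void xe_queue_g2h(struct xe_device *xe)
{
	/* ordered wq: one work at a time, in queueing order */
	queue_work(xe->ordered_wq, &xe->rpm_resume_work);
	queue_work(xe->ordered_wq, &xe->g2h_work);
}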

also, for this patch itself, are you okay with it if lockdep is okay?

> 
> > ---
> >   drivers/gpu/drm/xe/xe_bo.c | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> > index 6603a0ea79c5..def68528cd40 100644
> > --- a/drivers/gpu/drm/xe/xe_bo.c
> > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > @@ -22,6 +22,7 @@
> >   #include "xe_gt.h"
> >   #include "xe_map.h"
> >   #include "xe_migrate.h"
> > +#include "xe_pm.h"
> >   #include "xe_preempt_fence.h"
> >   #include "xe_res_cursor.h"
> >   #include "xe_trace.h"
> > @@ -1144,7 +1145,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
> >   	int idx, r = 0;
> > 
> >   	if (needs_rpm)
> > -		xe_device_mem_access_get(xe);
> > +		xe_pm_runtime_get(xe);
> > 
> >   	ret = ttm_bo_vm_reserve(tbo, vmf);
> >   	if (ret)
> > @@ -1184,7 +1185,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
> >   	dma_resv_unlock(tbo->base.resv);
> >   out:
> >   	if (needs_rpm)
> > -		xe_device_mem_access_put(xe);
> > +		xe_pm_runtime_put(xe);
> > 
> >   	return ret;
> >   }
