[Intel-xe] [PATCH] drm/xe: Grab runtime PM over MMIO ioctl

Matthew Auld matthew.william.auld at gmail.com
Thu Jun 29 09:14:14 UTC 2023


On Thu, 29 Jun 2023 at 03:29, Lucas De Marchi <lucas.demarchi at intel.com> wrote:
>
> On Fri, Jun 16, 2023 at 01:10:49PM +0100, Matthew Auld wrote:
> >On Thu, 15 Jun 2023 at 21:20, Matt Roper <matthew.d.roper at intel.com> wrote:
> >>
> >> On Thu, Jun 15, 2023 at 09:42:03AM +0100, Matthew Auld wrote:
> >> > On Wed, 14 Jun 2023 at 21:04, Matt Roper <matthew.d.roper at intel.com> wrote:
> >> > >
> >> > > The MMIO ioctl (which is generally used from IGT's xe_reg tool) may be
> >> > > called while the device is in D3, resulting in invalid results (i.e.,
> >> > > all registers read back as 0xFFFFFFFF, writes are dropped).  Make sure
> >> > > we grab runtime PM wakeref before performing register reads/writes.
> >> > >
> >> > > Signed-off-by: Matt Roper <matthew.d.roper at intel.com>
> >> >
> >> > AFAIK we should use mem_access_get() for MMIO:
> >>
> >> I thought mem_access_get() was specifically for memory access (like we
> >> needed for a special workaround on some early steppings of PVC); is it
> >> intended to be used for all devices accesses, not just memory now?  If
> >> so we might want to rename it to something more accurate like
> >> 'mmio_access_get' (and also add some kerneldoc explaining when/why it
> >> would be used instead of the standard runtime PM).
> >
> >AFAIK Rodrigo is planning to add some docs to explain its usage. I was
> >also under the impression that it was more for stuff like VRAM memory
> >access, but it seems MMIO should also be included or really any device
> >level memory access.
> >
> >https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/281
>
> I don't see how mem_access_get() would be the appropriate call here.
> At the most we should have a dep in which when force_wake is taken, it
> get a mem access cookie. Here it is really about "wake up everything, we
> don't know crazy thing userspace is going to do".
>
> In that sense, why are we not getting the pm when taking the force wake
> rather than wrapping it here? Shouldn't a force_wake ensure the device
> is not pm suspended?

There was some discussion here:
https://patchwork.freedesktop.org/patch/542124/?series=119213&rev=1

For the simple cases, like in this patch which is at the ioctl level,
we could potentially add a convenience helper or something which does
as you say. But I'm not sure how feasible it would be to roll that out
to all force_wake users.

>
> Lucas De Marchi
>
> >
> >>
> >> Actually, I'm having trouble understanding what mem_access_get is trying
> >> to do.  It seems like it's trying to avoid using the runtime PM's
> >> refcounting and is turning it into a boolean on/off instead, which makes
> >> things a bit racy.  It's not obvious to me why we're trying to work
> >> around the refcount design?
> >
> >I don't know the full history here, so I'm not completely sure tbh.
> >
> >AFAICT the racy bit is in xe_pm_runtime_resume_if_suspended(), when it
> >calls pm_runtime_suspended(), which is a TOCTOU bug and also doesn't
> >work if we are currently in the process of suspending/resuming, but
> >the reason for having the check is to prevent deadlocking when
> >pm_runtime_get_sync() is called from our runtime pm callbacks. So
> >dropping that check fixes the races but then uncovers all those
> >deadlocks. Parking that to one side, I don't think there is any issue
> >with always calling pm_runtime_get_sync(), we just need to have some
> >way of asserting at various places in the driver that we have ref > 0
> >or perhaps that the device is PM active. But if we end up having our
> >own mem_access.ref then there is no strict need to call into PM every
> >time.
> >
> >>
> >>
> >> Matt
> >>
> >> > https://patchwork.freedesktop.org/patch/542124/?series=119213&rev=1
> >> >
> >> > But that requires first fixing the races with mem_access which is
> >> > quite involved, so perhaps using runtime_get() directly here is fine
> >> > for now. We can just replace the instance here with mem_access once
> >> > that's ready.
> >> >
> >> > Reviewed-by: Matthew Auld <matthew.auld at intel.com>
> >> >
> >> > > ---
> >> > >  drivers/gpu/drm/xe/xe_mmio.c | 3 +++
> >> > >  1 file changed, 3 insertions(+)
> >> > >
> >> > > diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> >> > > index f7a7f996b37f..5f4c8ae23771 100644
> >> > > --- a/drivers/gpu/drm/xe/xe_mmio.c
> >> > > +++ b/drivers/gpu/drm/xe/xe_mmio.c
> >> > > @@ -19,6 +19,7 @@
> >> > >  #include "xe_gt_mcr.h"
> >> > >  #include "xe_macros.h"
> >> > >  #include "xe_module.h"
> >> > > +#include "xe_pm.h"
> >> > >
> >> > >  #define XEHP_MTCFG_ADDR                XE_REG(0x101800)
> >> > >  #define TILE_COUNT             REG_GENMASK(15, 8)
> >> > > @@ -476,6 +477,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
> >> > >          */
> >> > >         reg = XE_REG(args->addr);
> >> > >
> >> > > +       xe_pm_runtime_get(xe);
> >> > >         xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
> >> > >
> >> > >         if (args->flags & DRM_XE_MMIO_WRITE) {
> >> > > @@ -519,6 +521,7 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
> >> > >
> >> > >  exit:
> >> > >         xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
> >> > > +       xe_pm_runtime_put(xe);
> >> > >
> >> > >         return ret;
> >> > >  }
> >> > > --
> >> > > 2.40.1
> >> > >
> >>
> >> --
> >> Matt Roper
> >> Graphics Software Engineer
> >> Linux GPU Platform Enablement
> >> Intel Corporation


More information about the Intel-xe mailing list