[Intel-xe] [PATCH] drm/xe: Fix locking in CT fast path
Matthew Brost
matthew.brost at intel.com
Tue Mar 21 14:34:29 UTC 2023
On Tue, Mar 21, 2023 at 01:25:51PM +0100, Maarten Lankhorst wrote:
> Hey,
>
> I'm afraid this is not allowed, you can't take a mutex in an irq handler, not even a trylock.
>
> From Documentation/locking/mutex-design.rst:
>
> The mutex subsystem checks and enforces the following rules:
> ...
> - Mutexes may not be used in hardware or software interrupt
> contexts such as tasklets and timers.
>
I wasn't aware of this, but the DOC makes it clear this isn't allowed.
> Lockdep will likely still splat too as a result.
>
Lockdep is happy, which is very odd since this clearly isn't allowed per
the DOC.
Anyways, I'm thinking your atomic fix is needed, but we likely also need
a follow-on to this patch as well, something like:
xe_device_mem_access_get_if_active();
do CT fast path...
xe_device_mem_access_put_async();
The key being we can't sleep but also can't power down access to the
VRAM when the CT fast path is executing.
Matt
> Cheers,
> ~Maarten
>
> On 2023-03-17 01:22, Matthew Brost wrote:
> > We can't sleep in the CT fast path but need to ensure we can access VRAM. Use
> > a trylock + reference counter check to ensure safe access to VRAM, if
> > either check fails, fall back to slow path.
> >
> > VLK-45296
> >
> > Signed-off-by: Matthew Brost<matthew.brost at intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_device.h | 9 ++++++++-
> > drivers/gpu/drm/xe/xe_guc_ct.c | 11 ++++++++++-
> > 2 files changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > index 25c5087f5aad..0cc4f52098a1 100644
> > --- a/drivers/gpu/drm/xe/xe_device.h
> > +++ b/drivers/gpu/drm/xe/xe_device.h
> > @@ -95,12 +95,19 @@ static inline void xe_device_assert_mem_access(struct xe_device *xe)
> > XE_WARN_ON(!xe->mem_access.ref);
> > }
> > +static inline bool __xe_device_mem_access_ongoing(struct xe_device *xe)
> > +{
> > + lockdep_assert_held(&xe->mem_access.lock);
> > +
> > + return xe->mem_access.ref;
> > +}
> > +
> > static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> > {
> > bool ret;
> > mutex_lock(&xe->mem_access.lock);
> > - ret = xe->mem_access.ref;
> > + ret = __xe_device_mem_access_ongoing(xe);
> > mutex_unlock(&xe->mem_access.lock);
> > return ret;
> > diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > index e5ed9022a0a2..bba0ef21c9e5 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > @@ -1030,9 +1030,15 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> > struct xe_device *xe = ct_to_xe(ct);
> > int len;
> > - if (!xe_device_in_fault_mode(xe) || !xe_device_mem_access_ongoing(xe))
> > + if (!xe_device_in_fault_mode(xe))
> > return;
> > + if (!mutex_trylock(&xe->mem_access.lock))
> > + return;
> > +
> > + if (!__xe_device_mem_access_ongoing(xe))
> > + goto unlock;
> > +
> > spin_lock(&ct->fast_lock);
> > do {
> > len = g2h_read(ct, ct->fast_msg, true);
> > @@ -1040,6 +1046,9 @@ void xe_guc_ct_fast_path(struct xe_guc_ct *ct)
> > g2h_fast_path(ct, ct->fast_msg, len);
> > } while (len > 0);
> > spin_unlock(&ct->fast_lock);
> > +
> > +unlock:
> > + mutex_unlock(&xe->mem_access.lock);
> > }
> > /* Returns less than zero on error, 0 on done, 1 on more available */