[PATCH 2/2] drm/xe: Add lock around devcoredump capture status
Matthew Brost
matthew.brost at intel.com
Thu Nov 21 17:24:37 UTC 2024
On Thu, Nov 21, 2024 at 08:56:19AM -0800, John Harrison wrote:
> On 11/20/2024 19:30, Matthew Brost wrote:
> > On Wed, Nov 20, 2024 at 03:22:57PM -0800, John.C.Harrison at Intel.com wrote:
> > > From: John Harrison <John.C.Harrison at Intel.com>
> > >
> > > There are now multiple places that can trigger a coredump. Some of
> > > which can happen in parallel. There is already a check against
> > > capturing multiple dumps sequentially, but without locking it is not
> > > guaranteed to work against concurrent dumps. And if two dumps do happen
> > > in parallel, they can end up doing Bad Things such as one call stack
> > > freeing the data the other call stack is still processing. Which leads
> > > to a crashed kernel. So add the lock.
> > >
> > > Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_devcoredump.c | 18 ++++++++++++++----
> > > drivers/gpu/drm/xe/xe_devcoredump_types.h | 2 ++
> > > 2 files changed, 16 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > index b5aebe8fb06d..05f3c9185ef3 100644
> > > --- a/drivers/gpu/drm/xe/xe_devcoredump.c
> > > +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > @@ -235,9 +235,13 @@ static void xe_devcoredump_free(void *data)
> > > /* To prevent stale data on next snapshot, clear everything */
> > > memset(&coredump->snapshot, 0, sizeof(coredump->snapshot));
> > > - coredump->captured = false;
> > > +
> > > drm_info(&coredump_to_xe(coredump)->drm,
> > > "Xe device coredump has been deleted.\n");
> > > +
> > > + spin_lock_irq(&coredump->lock);
> > > + coredump->captured = false;
> > > + spin_unlock_irq(&coredump->lock);
> > I don't think you need the IRQ version here, or in the other usage. I
> > don't believe either of the cases here runs in IRQ context; and if one
> > did, that case would use spin_lock (already in an IRQ context) while
> > the other would use spin_lock_irq (not in an IRQ context itself, but
> > racing with a case that can be). Certainly taking spin_lock_irq at
> > every use of the lock seems unnecessary.
> Yeah, that was a copy/paste that I was meaning to go back and update as
> necessary but evidently forgot to.
>
> >
> > Is there some follow on patches where this changes?
> There will be. Still more rework to be done for the GT targeted capture
> support.
>
> >
> > Adding the lock seems right though, and should enable coredumps on
> > multi-GT, or from paths which are not on the GT ordered WQ.
> Yeah, with the capture on GT reset I was hitting the race and got a very
> dead kernel a couple of times (because there were parallel captures for the
> same failure - one via GT reset and one via DRM timeout). Not sure if it is
> possible to hit with the current job specific captures but I don't see why
> not. If you had parallel independent hangs on both GTs of a multi-GT system
> then it seems like it would still go bang. Just much harder to hit.
>
Agree this is certainly possible to hit.
> >
> > Maybe also add helpers like...
> >
> > void xe_devcoredump_release(struct xe_devcoredump *coredump)
> > {
> > 	spin_lock(&coredump->lock);
> > 	coredump->captured = false;
> > 	spin_unlock(&coredump->lock);
> > }
> >
> > bool xe_devcoredump_acquire(struct xe_devcoredump *coredump)
> > {
> > 	bool acquired;
> >
> > 	spin_lock(&coredump->lock);
> > 	acquired = !coredump->captured;
> > 	coredump->captured = true;
> > 	spin_unlock(&coredump->lock);
> >
> > 	return acquired;
> > }
> >
> > IMO - this makes this locking semantic a bit more clear.
> Not convinced. There is exactly one place where each of these functions
> would be used. And acquire/release doesn't seem like the right terminology
> to me. Especially when the release is from a totally different call path.
> The 'acquire' is an internally triggered thing by the driver but the
> 'release' is in response to a sysfs write from the user. The intent is
> really just alloc and free but for a singleton entity so there can only ever
> be one instance at a time. However, the above code is not the actual
> alloc/free so calling it that would be wrong.
>
This comment is more about having a locking semantic that protects code
paths rather than data. I think having a helper makes this part more
palatable.
However, now that I've written this, maybe it's better to just use a
single large mutex that is held across xe_devcoredump,
xe_devcoredump_free, and xe_devcoredump_read. Looking at the code, I'm
pretty sure that xe_devcoredump_free and xe_devcoredump_read can race,
with xe_devcoredump_read possibly dereferencing a NULL pointer in
read.buffer. This mutex would also need to be primed with reclaim, so I
guess it couldn't be held across flush_work(&ss->work) or
cancel_work_sync(&coredump->snapshot.work), but would be held everywhere
else in the aforementioned functions. Roughly the idea sketched below.
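Untested sketch of the free side, just to illustrate what I mean (the
mutex is hypothetical, everything else follows the snippets quoted above;
the capture and read paths would take the same mutex around everything
except the work flush/cancel):

static void xe_devcoredump_free(void *data)
{
	struct xe_devcoredump *coredump = data;

	/* Don't hold the lock across the capture worker. */
	cancel_work_sync(&coredump->snapshot.work);

	mutex_lock(&coredump->lock);

	/* ...free the snapshot contents here, under the lock... */

	/* To prevent stale data on next snapshot, clear everything */
	memset(&coredump->snapshot, 0, sizeof(coredump->snapshot));
	coredump->captured = false;

	mutex_unlock(&coredump->lock);

	drm_info(&coredump_to_xe(coredump)->drm,
		 "Xe device coredump has been deleted.\n");
}

The reclaim priming would just be the usual lockdep dance at init time,
e.g. in xe_devcoredump_init():

	mutex_init(&xe->devcoredump.lock);

	if (IS_ENABLED(CONFIG_LOCKDEP)) {
		fs_reclaim_acquire(GFP_KERNEL);
		might_lock(&xe->devcoredump.lock);
		fs_reclaim_release(GFP_KERNEL);
	}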
Do you think that would work? If so, it seems cleaner and would adhere
to Sima's locking guidelines [1].
Matt
[1] https://blog.ffwll.ch/2022/07/locking-engineering.html
> The implementation is more about back-off-when-busy than regular mutex
> locking. It could potentially be done with a kref thing and inc-if-not-zero
> functionality, but that also seemed more confusing than just having a
> spinlock around a simple bool.
>
> John.
>
> >
> > > }
> > > static void devcoredump_snapshot(struct xe_devcoredump *coredump,
> > > @@ -310,14 +314,18 @@ void xe_devcoredump(struct xe_exec_queue *q, struct xe_sched_job *job, const cha
> > > struct xe_device *xe = gt_to_xe(q->gt);
> > > struct xe_devcoredump *coredump = &xe->devcoredump;
> > > va_list varg;
> > > + bool captured;
> > > - if (coredump->captured) {
> > > + spin_lock_irq(&coredump->lock);
> > > + captured = coredump->captured;
> > > + coredump->captured = true;
> > > + spin_unlock_irq(&coredump->lock);
> > > +
> > > + if (captured) {
> > Then here...
> >
> > 	if (!xe_devcoredump_acquire(coredump)) {
> > 		msg...
> > 		return;
> > 	}
> >
> > > drm_dbg(&xe->drm, "Multiple hangs are occurring, but only the first snapshot was taken\n");
> > > return;
> > > }
> > > - coredump->captured = true;
> > > -
> > > va_start(varg, fmt);
> > > coredump->snapshot.reason = kvasprintf(GFP_KERNEL, fmt, varg);
> > > va_end(varg);
> > > @@ -342,6 +350,8 @@ static void xe_driver_devcoredump_fini(void *arg)
> > > int xe_devcoredump_init(struct xe_device *xe)
> > > {
> > > + spin_lock_init(&xe->devcoredump.lock);
> > > +
> > > return devm_add_action_or_reset(xe->drm.dev, xe_driver_devcoredump_fini, &xe->drm);
> > > }
> > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > index e6234e887102..6b776abd8bf4 100644
> > > --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > @@ -80,6 +80,8 @@ struct xe_devcoredump_snapshot {
> > > * for reading the information.
> > > */
> > > struct xe_devcoredump {
> > > + /** @lock: protects access to 'captured' to prevent concurrent captures */
> > > + spinlock_t lock;
> > > /** @captured: The snapshot of the first hang has already been taken. */
> > Protected by @lock.
> >
> > Matt
> >
> > > bool captured;
> > > /** @snapshot: Snapshot is captured at time of the first crash */
> > > --
> > > 2.47.0
> > >
>