[PATCH] drm/etnaviv: lock MMU while dumping core

Lucas Stach l.stach at pengutronix.de
Fri May 24 16:22:55 UTC 2019


Am Freitag, den 24.05.2019, 16:31 +0100 schrieb Russell King - ARM Linux admin:
> On Wed, May 22, 2019 at 11:55:14AM +0200, Lucas Stach wrote:
> > The devcoredump needs to operate on a stable state of the MMU while
> > it is writing the MMU state to the coredump. The missing lock
> > allowed both the userspace submit, as well as the GPU job finish
> > paths to mutate the MMU state while a coredump is under way.
> 
> This locking does little to solve this problem.  We actually rely on the
> GPU being deadlocked at this point - we aren't expecting the GPU to make
> any progress.  The only thing that can realistically happen is for
> userspace to submit a new job, but adding this locking does little to
> avoid that.

The GPU is dead at the point where we do the dump. But there is nothing
stopping the workers that clean up jobs the GPU finished before it
stopped execution from manipulating the MMU mapping list, which happens
when buffers become unreferenced once their GPU operation has finished.
New mappings can also be added to the MMU by a userspace submit, even
while the GPU is blocked.

> > Fixes: a8c21a5451d8 ("drm/etnaviv: add initial etnaviv DRM driver")
> > Reported-by: David Jander <david at protonic.nl>
> > Signed-off-by: Lucas Stach <l.stach at pengutronix.de>
> > Tested-by: David Jander <david at protonic.nl>
> > ---
> >  drivers/gpu/drm/etnaviv/etnaviv_dump.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> > index 33854c94cb85..515515ef24f9 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> > @@ -125,6 +125,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
> >  		return;
> >  	etnaviv_dump_core = false;
> >  
> > +	mutex_lock(&gpu->mmu->lock);
> > +
> >  	mmu_size = etnaviv_iommu_dump_size(gpu->mmu);
> >  
> >  	/* We always dump registers, mmu, ring and end marker */
> > @@ -167,6 +169,7 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
> >  	iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
> >  			       PAGE_KERNEL);
> >  	if (!iter.start) {
> > +		mutex_unlock(&gpu->mmu->lock);
> >  		dev_warn(gpu->dev, "failed to allocate devcoredump file\n");
> >  		return;
> >  	}
> > @@ -234,6 +237,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
> >  					 obj->base.size);
> >  	}
> >  
> > +	mutex_unlock(&gpu->mmu->lock);
> > +
> 
> All that this lock does is prevent the lists from changing while we
> build up what we're going to write out.  At this point, you drop the
> lock, before we've written anything out to the coredump.  Things
> can now change, including the ring buffer.

At this point we have finished moving all the dump data into the
vmalloc'ed memory allocated earlier. Why would we need to hold the lock
after that? Nothing is going to touch the MMU mappings anymore at this
point.

> >  	etnaviv_core_dump_header(&iter, ETDUMP_BUF_END, iter.data);
> >  
> >  	dev_coredumpv(gpu->dev, iter.start, iter.data - iter.start, GFP_KERNEL);
> 
> Here we write out the data, which includes the command buffers, ring
> buffers, BOs, etc.  The data in the ring buffer can still change
> because the lock has been dropped.
> 
> However, all those objects should be pinned, so none of them should
> go away: things like the command buffers that have been submitted
> should be immutable at this point (if they aren't, it could well be
> a reason why the GPU has gone awol.)

We do not have a big etnaviv lock that would prevent cleanup or new
submit work from making progress while the GPU is busy or hung. All of
those operations can mutate the MMU state, even when the GPU is locked
up. The only objects guaranteed to be stable are those referenced by
the hanging job, which is also why I think we should dump only the
hanging job and the GPU state; but that's a much bigger patch. I've
kept this one small so it can be applied to the stable kernels without
conflicts.

> It is also not nice to hold the lock over the _big_ vmalloc() which
> may take some time.

At the time we detect the GPU lockup, we've already lost a lot of GPU
time by not executing pending jobs. I don't really care about blocking
a userspace submit a bit longer while the dump and recovery are under
way.

> Have you actually seen problems here, or is this just theoretical?

This fixes a real kernel-crashing issue: we overrun the vmalloc'ed
buffer when new BOs are added to the MMU between the time we walk the
mappings to compute the dump size and the time we actually move the BO
data into the dump. This was reported and tracked down by David.

Regards,
Lucas

