[PATCH] drm/msm: Fix hw_fence error path cleanup

Rob Clark robdclark at gmail.com
Thu Jul 13 20:28:11 UTC 2023


On Thu, Jul 13, 2023 at 1:03 PM Dmitry Baryshkov
<dmitry.baryshkov at linaro.org> wrote:
>
> On 13/07/2023 01:25, Rob Clark wrote:
> > From: Rob Clark <robdclark at chromium.org>
> >
> > In an error path where the submit is freed without the job being run,
> > the hw_fence pointer is simply a kzalloc'd block of memory.  In this
> > case we should just kfree() it, rather than trying to decrement its
> > reference count.  Fortunately we can tell that this is the case by
> > checking for a zero refcount, since if the job had run, the submit
> > would be holding a reference to the hw_fence.
> >
> > Fixes: f94e6a51e17c ("drm/msm: Pre-allocate hw_fence")
> > Signed-off-by: Rob Clark <robdclark at chromium.org>
> > ---
> >   drivers/gpu/drm/msm/msm_fence.c      |  6 ++++++
> >   drivers/gpu/drm/msm/msm_gem_submit.c | 14 +++++++++++++-
> >   2 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
> > index 96599ec3eb78..1a5d4f1c8b42 100644
> > --- a/drivers/gpu/drm/msm/msm_fence.c
> > +++ b/drivers/gpu/drm/msm/msm_fence.c
> > @@ -191,6 +191,12 @@ msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
> >
> >       f->fctx = fctx;
> >
> > +     /*
> > +      * Until this point, the fence was just some pre-allocated memory,
> > +      * no-one should have taken a reference to it yet.
> > +      */
> > +     WARN_ON(kref_read(&fence->refcount));
>
> Is it really correct to return a refcounted object with a refcount of 0
> (I'm looking at submit_create() / msm_fence_alloc())? Maybe it would be
> better to move dma_fence_get() to msm_fence_alloc()? But I don't
> immediately see which dma_fence_get() call should be moved.
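
For concreteness, moving the get into msm_fence_alloc() would look
something like the following sketch (hypothetical, not part of the
patch; the real msm_fence_alloc() body differs):

    struct dma_fence *msm_fence_alloc(void)
    {
            struct msm_fence *f = kzalloc(sizeof(*f), GFP_KERNEL);

            if (!f)
                    return ERR_PTR(-ENOMEM);

            /*
             * Hypothetical: take the initial reference at allocation
             * time.  But the embedded kref is still zero here, so
             * kref_get() would warn, and the later dma_fence_init()
             * would kref_init() the refcount back to 1 anyway, losing
             * this reference.
             */
            return dma_fence_get(&f->base);
    }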

The issue is that we can't really initialize the fence until
msm_job_run(), because that is the point where the order in which
fences will be signaled becomes known.  But we don't want to do any
allocations in msm_job_run(), because an allocation could trigger the
shrinker, and the shrinker could need to wait until jobs complete to
release memory, forming a deadlock.
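
For illustration, a simplified sketch of the resulting two-phase fence
lifecycle (function names follow the driver, but the bodies are
trimmed; the refcount comments describe generic kref / dma_fence
behavior rather than driver-specific code):

    /* submit creation (ioctl path): allocating here is safe */
    struct dma_fence *msm_fence_alloc(void)
    {
            struct msm_fence *f = kzalloc(sizeof(*f), GFP_KERNEL);

            if (!f)
                    return ERR_PTR(-ENOMEM);

            /*
             * kzalloc() leaves the embedded kref at zero, which is
             * exactly what the destroy path's kref_read() check keys
             * off of.
             */
            return &f->base;
    }

    /* msm_job_run() (scheduler path): no allocations, just init */
    void msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
    {
            struct msm_fence *f = to_msm_fence(fence);

            f->fctx = fctx;

            /*
             * dma_fence_init() kref_init()s the refcount to 1, so from
             * this point on dma_fence_put() is the correct way to drop
             * the fence.
             */
            dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock,
                           fctx->context, ++fctx->last_fence);
    }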

BR,
-R

> > +
> >       dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock,
> >                      fctx->context, ++fctx->last_fence);
> >   }
> > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> > index 3f1aa4de3b87..9d66498cdc04 100644
> > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > @@ -86,7 +86,19 @@ void __msm_gem_submit_destroy(struct kref *kref)
> >       }
> >
> >       dma_fence_put(submit->user_fence);
> > -     dma_fence_put(submit->hw_fence);
> > +
> > +     /*
> > +      * If the submit is freed before msm_job_run(), then hw_fence is
> > +      * just some pre-allocated memory, not a reference counted fence.
> > +      * Once the job runs and the hw_fence is initialized, it will
> > +      * have a refcount of at least one, since the submit holds a ref
> > +      * to the hw_fence.
> > +      */
> > +     if (kref_read(&submit->hw_fence->refcount) == 0) {
> > +             kfree(submit->hw_fence);
> > +     } else {
> > +             dma_fence_put(submit->hw_fence);
> > +     }
> >
> >       put_pid(submit->pid);
> >       msm_submitqueue_put(submit->queue);
>
> --
> With best wishes
> Dmitry
>

