[PATCH] drm/msm: Fix fence rollover issue

Rob Clark robdclark at gmail.com
Thu Jun 16 14:04:14 UTC 2022


On Thu, Jun 16, 2022 at 1:27 AM Dmitry Baryshkov
<dmitry.baryshkov at linaro.org> wrote:
>
> On 15/06/2022 19:24, Rob Clark wrote:
> > From: Rob Clark <robdclark at chromium.org>
> >
> > And while we are at it, let's start the fence counter close to the
> > rollover point so that if issues slip in, they are more obvious.
> >
> > Signed-off-by: Rob Clark <robdclark at chromium.org>
>
> Should it also have
>
> Fixes: fde5de6cb461 ("drm/msm: move fence code to it's own file")
>
> Or maybe
>
> Fixes: 5f3aee4ceb5b ("drm/msm: Handle fence rollover")

Arguably it fixes the first commit that added GPU support (and it
finishes up a couple of spots that the above commit missed).

I guess I could use the Fixes tag just to indicate how far back it
would be reasonable to backport to stable branches.

> Otherwise:
>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov at linaro.org>
>
>
> > ---
> >   drivers/gpu/drm/msm/msm_fence.c | 13 +++++++++++--
> >   1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
> > index 3df255402a33..a35a6746c7cd 100644
> > --- a/drivers/gpu/drm/msm/msm_fence.c
> > +++ b/drivers/gpu/drm/msm/msm_fence.c
> > @@ -28,6 +28,14 @@ msm_fence_context_alloc(struct drm_device *dev, volatile uint32_t *fenceptr,
> >       fctx->fenceptr = fenceptr;
> >       spin_lock_init(&fctx->spinlock);
> >
> > +     /*
> > +      * Start out close to the 32b fence rollover point, so we can
> > +      * catch bugs with fence comparisons.
> > +      */
> > +     fctx->last_fence = 0xffffff00;
> > +     fctx->completed_fence = fctx->last_fence;
> > +     *fctx->fenceptr = fctx->last_fence;
>
> This looks like a debugging hack. But probably it's fine to have it, as
> it wouldn't cause any side effects.

I was originally going to add a modparam or kconfig option to enable
this, but then thought: if there is a bug and things are going to go
wrong, it's best for that to happen ASAP rather than after 200-400 days
of uptime.. bugs in the latter case can be rather hard to reproduce ;-)

IIRC the kernel does something similar with jiffies so that the
rollover point is hit shortly after boot.
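
(For the record, that's INITIAL_JIFFIES in include/linux/jiffies.h,
which starts jiffies a few minutes before the wrap.)

To make the rollover handling concrete, here's a rough standalone
sketch of the two pieces.  fence_after_example and
EXAMPLE_INITIAL_FENCE are made-up names for illustration, not the
in-tree symbols:

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Same trick as fence_after() in the driver: the unsigned
     * subtraction wraps, and reinterpreting the result as signed
     * tells us which fence is "newer" even across the 32b rollover.
     */
    static inline bool fence_after_example(uint32_t a, uint32_t b)
    {
            return (int32_t)(a - b) > 0;
    }

    /*
     * Starting the counter this close to UINT32_MAX means it wraps
     * within the first few hundred submits instead of after years of
     * uptime, so any comparison that isn't rollover-safe blows up
     * almost immediately.
     */
    #define EXAMPLE_INITIAL_FENCE 0xffffff00u

With that, fence_after_example(0x00000010, EXAMPLE_INITIAL_FENCE)
still returns true even though the raw value is numerically smaller,
which is exactly what a naive max()-style comparison gets wrong.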

BR,
-R

> > +
> >       return fctx;
> >   }
> >
> > @@ -46,11 +54,12 @@ bool msm_fence_completed(struct msm_fence_context *fctx, uint32_t fence)
> >               (int32_t)(*fctx->fenceptr - fence) >= 0;
> >   }
> >
> > -/* called from workqueue */
> > +/* called from irq handler */
> >   void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence)
> >   {
> >       spin_lock(&fctx->spinlock);
> > -     fctx->completed_fence = max(fence, fctx->completed_fence);
> > +     if (fence_after(fence, fctx->completed_fence))
> > +             fctx->completed_fence = fence;
> >       spin_unlock(&fctx->spinlock);
> >   }
> >
>
>
> --
> With best wishes
> Dmitry

