[Intel-xe] [RFC PATCH v2 22/23] drm/i915: Handle dma fences in dirtyfb callback

Hogander, Jouni jouni.hogander at intel.com
Thu Jul 27 05:44:27 UTC 2023


Thank you Ville for your comments. See my inline responses below.

On Thu, 2023-07-13 at 23:08 +0300, Ville Syrjälä wrote:
> On Wed, May 10, 2023 at 03:11:51PM +0300, Jouni Högander wrote:
> > Take dma fences into account in the dirtyfb callback. If there are
> > no unsignaled dma fences, perform the flush immediately. If there
> > are unsignaled dma fences, perform an invalidate and add a callback
> > which will queue a flush when the fence gets signaled.
> > 
> > Signed-off-by: Jouni Högander <jouni.hogander at intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_fb.c | 55
> > +++++++++++++++++++++++--
> >  1 file changed, 52 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_fb.c
> > b/drivers/gpu/drm/i915/display/intel_fb.c
> > index fa4464d433b7..fc325f2299a4 100644
> > --- a/drivers/gpu/drm/i915/display/intel_fb.c
> > +++ b/drivers/gpu/drm/i915/display/intel_fb.c
> > @@ -8,6 +8,9 @@
> >  #include <drm/drm_framebuffer.h>
> >  #include <drm/drm_modeset_helper.h>
> >  
> > +#include <linux/dma-fence.h>
> > +#include <linux/dma-resv.h>
> > +
> >  #include "i915_drv.h"
> >  #include "intel_display.h"
> >  #include "intel_display_types.h"
> > @@ -1888,6 +1891,20 @@ static int
> > intel_user_framebuffer_create_handle(struct drm_framebuffer *fb,
> >  }
> >  
> >  #ifdef I915
> > +struct frontbuffer_fence_cb {
> > +       struct dma_fence_cb base;
> > +       struct intel_frontbuffer *front;
> > +};
> > +
> > +static void intel_user_framebuffer_fence_wake(struct dma_fence
> > *dma,
> > +                                             struct dma_fence_cb
> > *data)
> > +{
> > +       struct frontbuffer_fence_cb *cb = container_of(data,
> > typeof(*cb), base);
> > +
> > +       intel_frontbuffer_queue_flush(cb->front);
> > +       kfree(cb);
> > +}
> > +
> >  static int intel_user_framebuffer_dirty(struct drm_framebuffer
> > *fb,
> >                                         struct drm_file *file,
> >                                         unsigned int flags,
> > unsigned int color,
> > @@ -1895,11 +1912,43 @@ static int
> > intel_user_framebuffer_dirty(struct drm_framebuffer *fb,
> >                                         unsigned int num_clips)
> >  {
> >         struct drm_i915_gem_object *obj = intel_fb_obj(fb);
> > +       struct intel_frontbuffer *front = to_intel_frontbuffer(fb);
> > +       struct dma_resv_iter cursor;
> > +       struct dma_fence *fence;
> > +       int ret;
> > +
> > +       if (dma_resv_test_signaled(intel_bo_to_drm_bo(obj).resv,
> > dma_resv_usage_rw(false))) {
> > +               intel_bo_flush_if_display(obj);
> > +               intel_frontbuffer_flush(front, ORIGIN_DIRTYFB);
> > +               return 0;
> > +       }
> >  
> > -       intel_bo_flush_if_display(obj);
> > -       intel_frontbuffer_flush(to_intel_frontbuffer(fb),
> > ORIGIN_DIRTYFB);
> > +       intel_frontbuffer_invalidate(front, ORIGIN_DIRTYFB);
> >  
> > -       return 0;
> > +       dma_resv_iter_begin(&cursor, intel_bo_to_drm_bo(obj).resv,
> > +                           dma_resv_usage_rw(false));
> > +       dma_resv_for_each_fence_unlocked(&cursor, fence) {
> > +               struct frontbuffer_fence_cb *cb =
> > +                       kmalloc(sizeof(struct
> > frontbuffer_fence_cb), GFP_KERNEL);
> > +               if (!cb) {
> > +                       ret = -ENOMEM;
> > +                       break;
> > +               }
> > +               cb->front = front;
> > +
> > +               ret = dma_fence_add_callback(fence, &cb->base,
> > +                                           
> > intel_user_framebuffer_fence_wake);
> > +               if (ret) {
> > +                       intel_user_framebuffer_fence_wake(fence,
> > &cb->base);
> > +                       if (ret == -ENOENT)
> > +                               ret = 0;
> > +                       else
> > +                               break;
> > +               }
> > +       }
> > +       dma_resv_iter_end(&cursor);
> 
> AFAICS we could use dma_resv_get_singleton() here to get just a
> single callback once all the included fences have signalled. It
> might also reduce the amount of kmallocs() a bit, though
> dma_resv_get_singleton() does seem to end up doing multiple
> allocations as well, but perhaps it could be optimized further.
> 
> The other thing dma_resv_get_singleton() does is reference
> counting of the fences. But I'm not sure that's needed here.
> I.e. I'm not sure what the lifetime rules are.

I sent a new version that uses dma_resv_get_singleton().

> 
> 
> I was also pondering what kind of scenarios we might hit here that
> might be a bit problematic. This is what I came up with:
> 
> * scenario 1:
> 
>  flip(PLANE A):
>   -> FB A.bits=PLANE A
>  set fence(FB A):
>   -> FB A.fence = fence 1
>  dirtyfb(FB A):
>   -> fence 1 !signalled -> invalidate FB A.bits==PLANE A
>   -> fence 1 queue cb
>  flip(PLANE A):
>   -> FB A.bits = 0
>   -> FB B.bits = PLANE A
>  fence 1 cb -> flush FB A.bits=0
> 
>  In the end tracking is left in invalidated state, at least for
>  FBC AFAICS. Possible fix would be to clear FBC busy_bits on flip
> [1]?
>  DRRS is fine I think since every flip already clears busy_bits.
>  Not sure what PSR does.

Your suggestion below ([1]) is part of a new series here:

https://patchwork.freedesktop.org/series/116620/

PSR busy bits are also taken care of there.
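
To make scenario 1 concrete, here is a rough userspace model of it. It is
only a sketch of the busy_bits bookkeeping as described in this thread, not
driver code: struct fbc_model, MODEL_PLANE_A and all model_* helpers are
hypothetical names, and the activate/deactivate logic is heavily simplified.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in i915. */
#define MODEL_PLANE_A 0x1u

struct fbc_model {
	unsigned int busy_bits;	/* planes with a pending frontbuffer write */
	bool active;		/* whether FBC is considered active */
};

/* Invalidate: mark the framebuffer's planes busy and deactivate. */
static void model_invalidate(struct fbc_model *fbc, unsigned int fb_bits)
{
	fbc->busy_bits |= fb_bits;
	if (fbc->busy_bits)
		fbc->active = false;
}

/* Flush: clear only the planes the framebuffer covers right now. */
static void model_flush(struct fbc_model *fbc, unsigned int fb_bits)
{
	fbc->busy_bits &= ~fb_bits;
	if (!fbc->busy_bits)
		fbc->active = true;
}

/* Post-flip update; clear_on_flip selects the behaviour suggested in [1]. */
static void model_post_flip(struct fbc_model *fbc, bool clear_on_flip)
{
	if (clear_on_flip)
		fbc->busy_bits = 0;
	fbc->active = (fbc->busy_bits == 0);
}

/* Replays scenario 1 and reports whether FBC ends up active. */
static bool model_scenario1(bool clear_on_flip)
{
	struct fbc_model fbc = { .busy_bits = 0, .active = true };
	unsigned int fb_a_bits = MODEL_PLANE_A;	/* flip(PLANE A) -> FB A */

	model_invalidate(&fbc, fb_a_bits);	/* dirtyfb, fence 1 unsignalled */
	fb_a_bits = 0;				/* flip moves PLANE A to FB B */
	model_post_flip(&fbc, clear_on_flip);
	model_flush(&fbc, fb_a_bits);		/* fence 1 cb: FB A.bits == 0 */

	return fbc.active;
}
```

In this model, model_scenario1(false) leaves the tracking invalidated because
the late flush no longer carries the PLANE A bit, while model_scenario1(true)
ends up active again, which is the effect the change in [1] is after.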

> 
> 
> [1]
> @@ -1299,11 +1299,9 @@ static void __intel_fbc_post_update(struct
> intel_fbc *fbc)
>         lockdep_assert_held(&fbc->lock);
> 
>         fbc->flip_pending = false;
> +       fbc->busy_bits = 0;
> 
> -       if (!fbc->busy_bits)
> -               intel_fbc_activate(fbc);
> -       else
> -               intel_fbc_deactivate(fbc, "frontbuffer write");
> +       intel_fbc_activate(fbc);
>  }
> 
> 
> * scenario 2:
> 
>  flip(PLANE A):
>   -> FB A.bits=PLANE A
>  set fence(FB A):
>   -> FB A.fence = fence 1
>  dirtyfb(FB A):
>   -> fence 1 !signalled -> invalidate FB A.bits==PLANE A
>   -> fence 1 queue cb
>  set fence(FB A):
>   -> FB A.fence = fence 2
>  dirtyfb(FB A):
>   -> fence 2 !signalled -> invalidate FB A.bits==PLANE A
>   -> fence 2 queue cb
>  fence 1 cb -> flush FB A.bits==PLANE A
>   -> frontbuffer tracking flushed before fence 2 has signalled
>  ...
>  fence 2 cb -> flush FB A.bits==PLANE A
> 
>  Perhaps we should keep track of how many fences are actually
> pending,
>  and only do the frontbuffer flush when the count drops to zero?
>  OTOH the final flush should still guarantee some kind of correctness
>  in the end, so not sure this is really a big problem.
> 

I left the fence count out of the new version as well. The final flush
takes care of correctness; the drawback is the extra flush(es) in this
scenario, so I see the fence count as an optimization. Please point out
if you think it should be implemented before merging.
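
For reference, the fence-count idea from scenario 2 could be modelled roughly
as below. This is a userspace sketch only, not what the new version does:
struct fb_track_model and the model_* helpers are hypothetical names, and a
real implementation would presumably need an atomic counter, since fence
callbacks can run concurrently with dirtyfb.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in i915. */
struct fb_track_model {
	int pending_fences;	/* callbacks queued but not yet signalled */
	bool invalidated;	/* frontbuffer tracking currently invalidated */
	int flush_count;	/* number of frontbuffer flushes issued */
};

/* dirtyfb with an unsignalled fence: invalidate and queue one callback. */
static void model_dirtyfb_unsignalled(struct fb_track_model *t)
{
	t->invalidated = true;
	t->pending_fences++;
}

/* Fence callback: flush only when the last pending fence has signalled. */
static void model_fence_signalled(struct fb_track_model *t)
{
	assert(t->pending_fences > 0);

	if (--t->pending_fences == 0) {
		t->invalidated = false;
		t->flush_count++;
	}
}
```

Replaying scenario 2 against this model (two dirtyfb calls, then fence 1 and
fence 2 signalling) gives no flush after fence 1 and a single flush after
fence 2, instead of the two flushes the current approach issues.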

> > +
> > +       return ret;
> >  }
> >  #endif
> >  
> > -- 
> > 2.34.1
> 

BR,

Jouni Högander

