[Intel-gfx] [PATCH 1/5] drm/i915: Always pin the fence for scanout on gen2/3 and primary planes
Ville Syrjälä
ville.syrjala at linux.intel.com
Wed Nov 22 14:42:58 UTC 2017
On Fri, Nov 17, 2017 at 09:29:33AM -0800, Rodrigo Vivi wrote:
> On Fri, Nov 17, 2017 at 01:21:35PM +0000, Ville Syrjälä wrote:
> > On Thu, Nov 16, 2017 at 12:49:23PM -0800, Rodrigo Vivi wrote:
> > > On Thu, Nov 16, 2017 at 07:14:47PM +0000, Ville Syrjala wrote:
> > > > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > >
> > > > The current code is trying to be lazy with fences on scanout buffers.
> > > > That looks broken for several reasons:
> > > > * gen2/3 always need a fence for tiled scanout
> > > > * the unpin doesn't know whether we pinned the fence or not so it
> > > > may unpin something we don't own
> > > > * FBC GTT tracking needs a fence (not sure we have proper fallback
> > > > for when there is no fence)
> > >
> > > Ohh! I wonder if this would also solve few of old PSR cases... PSR re-uses
> > > FBC GTT tracking...
> >
> > That whole concept seems a bit broken. AFAICS we have no "is FBC enabled
> > on the appropriate plane?" checks in the PSR code. I'm not quite sure
> > how it would handle multiple planes either. I guess we should be
> > disabling PSR when multiple planes are enabled?
>
> Ironically this case is good on PSR afai can remember...
OK, so it works by luck then :)
> The old problem with PSR that is back to picture is just primary plane with
> fbdev/fbcon... :/
>
> >
> > >
> > > And "fallback" for both is the frontbuffer_tracking
> >
> > I'm not sure how fronbuffer tracking handles GTT mmaps. I thought it
> > didn't even try. If I'm mistaken then I'm thinking we should perhaps
> > even stop using the GTT tracking entirely just to make the whole thing
> > more consistent. Having two ways to do the same thing doesn't appeal to
> > me.
>
> Well... the frontbuffer tracking would kill the benefits of PSR2.
Which benefit is that? Partial updates or something? I don't think FBC
tracking would help much there because modern platforms just nuke the
entire compressed buffer on any modification. Well, maybe GTT tracking
still has smaller granularity, but that seems like a niche use case so
not much benefit I think.
> So if we could make that HW tracking really working I would prefer.
All it would take is someone resurrecting my old FBC work. I had full
HW tracking implemented there, but at the time no one wanted it.
> Otherwise we will have to have both... at least one for cases where
> hw tracking works and other for the cases hw tracking doesn't work... :/
>
> >
> > >
> > > >
> > > > So to fix this always use a fence for gen2/3, and for primary planes on
> > > > other platforms (for FBC). For sprites and cursor we never need a fence
> > > > so don't even try to get one. Since we now know whether a fence was
> > > > pinned we can safely unpin it too.
> > > >
> > > > Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > > ---
> > > > drivers/gpu/drm/i915/i915_drv.h | 2 +-
> > > > drivers/gpu/drm/i915/i915_gem.c | 4 +--
> > > > drivers/gpu/drm/i915/intel_display.c | 55 ++++++++++++++++++++++++++++--------
> > > > drivers/gpu/drm/i915/intel_drv.h | 7 +++--
> > > > drivers/gpu/drm/i915/intel_fbdev.c | 17 +++++++++--
> > > > drivers/gpu/drm/i915/intel_overlay.c | 2 +-
> > > > 6 files changed, 66 insertions(+), 21 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 2158a758a17d..8c6d0b7ac9a5 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -3783,7 +3783,7 @@ int __must_check
> > > > i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
> > > > struct i915_vma * __must_check
> > > > i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > - u32 alignment,
> > > > + u32 alignment, bool needs_fence,
> > > > const struct i915_ggtt_view *view);
> > > > void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma);
> > > > int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj,
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index 61ba321e9970..af18168e48e3 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -3944,7 +3944,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > > */
> > > > struct i915_vma *
> > > > i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > - u32 alignment,
> > > > + u32 alignment, bool needs_fence,
> > > > const struct i915_ggtt_view *view)
> > > > {
> > > > struct i915_vma *vma;
> > > > @@ -3997,7 +3997,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > * happy to scanout from anywhere within its global aperture.
> > > > */
> > > > flags = 0;
> > > > - if (HAS_GMCH_DISPLAY(i915))
> > > > + if (HAS_GMCH_DISPLAY(i915) || needs_fence)
> > > > flags = PIN_MAPPABLE;
> > > > vma = i915_gem_object_ggtt_pin(obj, view, 0, alignment, flags);
> > > > }
> > > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > > > index e6fcbc5abc75..0657e03e871a 100644
> > > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > > @@ -2154,8 +2154,21 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
> > > > }
> > > > }
> > > >
> > > > +static bool intel_plane_needs_fence(const struct intel_plane_state *plane_state)
> > > > +{
> > > > + struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
> > > > + struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
> > > > + struct drm_i915_gem_object *obj = intel_fb_obj(plane_state->base.fb);
> > > > +
> > > > + if (i915_gem_object_get_tiling(obj) == I915_TILING_NONE)
> > > > + return false;
> > > > +
> > > > + return INTEL_GEN(dev_priv) < 4 || plane->id == PLANE_PRIMARY;
> > > > +}
> > > > +
> > > > struct i915_vma *
> > > > -intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
> > > > +intel_pin_and_fence_fb_obj(const struct drm_framebuffer *fb,
> > > > + unsigned int rotation, bool needs_fence)
> > > > {
> > > > struct drm_device *dev = fb->dev;
> > > > struct drm_i915_private *dev_priv = to_i915(dev);
> > > > @@ -2189,11 +2202,16 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
> > > >
> > > > atomic_inc(&dev_priv->gpu_error.pending_fb_pin);
> > > >
> > > > - vma = i915_gem_object_pin_to_display_plane(obj, alignment, &view);
> > > > + vma = i915_gem_object_pin_to_display_plane(obj, alignment,
> > > > + needs_fence, &view);
> > > > if (IS_ERR(vma))
> > > > goto err;
> > > >
> > > > - if (i915_vma_is_map_and_fenceable(vma)) {
> > > > + if (needs_fence) {
> > > > + int ret;
> > > > +
> > > > + WARN_ON(!i915_vma_is_map_and_fenceable(vma));
> > > > +
> > > > /* Install a fence for tiled scan-out. Pre-i965 always needs a
> > > > * fence, whereas 965+ only requires a fence if using
> > > > * framebuffer compression. For simplicity, we always, when
> > > > @@ -2210,7 +2228,11 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
> > > > * something and try to run the system in a "less than optimal"
> > > > * mode that matches the user configuration.
> > > > */
> > > > - i915_vma_pin_fence(vma);
> > > > + ret = i915_vma_pin_fence(vma);
> > > > + if (ret) {
> > > > + vma = ERR_PTR(ret);
> > > > + goto err;
> > > > + }
> > > > }
> > > >
> > > > i915_vma_get(vma);
> > > > @@ -2221,11 +2243,12 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
> > > > return vma;
> > > > }
> > > >
> > > > -void intel_unpin_fb_vma(struct i915_vma *vma)
> > > > +void intel_unpin_fb_vma(struct i915_vma *vma, bool needs_fence)
> > > > {
> > > > lockdep_assert_held(&vma->vm->i915->drm.struct_mutex);
> > > >
> > > > - i915_vma_unpin_fence(vma);
> > > > + if (needs_fence)
> > > > + i915_vma_unpin_fence(vma);
> > > > i915_gem_object_unpin_from_display_plane(vma);
> > > > i915_vma_put(vma);
> > > > }
> > > > @@ -2816,6 +2839,7 @@ intel_find_initial_plane_obj(struct intel_crtc *intel_crtc,
> > > > struct intel_plane_state *intel_state =
> > > > to_intel_plane_state(plane_state);
> > > > struct drm_framebuffer *fb;
> > > > + bool needs_fence;
> > > >
> > > > if (!plane_config->fb)
> > > > return;
> > > > @@ -2869,8 +2893,10 @@ intel_find_initial_plane_obj(struct intel_crtc *intel_crtc,
> > > >
> > > > valid_fb:
> > > > mutex_lock(&dev->struct_mutex);
> > > > + needs_fence = intel_plane_needs_fence(intel_state);
> > > > intel_state->vma =
> > > > - intel_pin_and_fence_fb_obj(fb, primary->state->rotation);
> > > > + intel_pin_and_fence_fb_obj(fb, primary->state->rotation,
> > > > + needs_fence);
> > > > mutex_unlock(&dev->struct_mutex);
> > > > if (IS_ERR(intel_state->vma)) {
> > > > DRM_ERROR("failed to pin boot fb on pipe %d: %li\n",
> > > > @@ -12753,9 +12779,12 @@ intel_prepare_plane_fb(struct drm_plane *plane,
> > > >
> > > > ret = i915_gem_object_attach_phys(obj, align);
> > > > } else {
> > > > + bool needs_fence =
> > > > + intel_plane_needs_fence(to_intel_plane_state(new_state));
> > > > struct i915_vma *vma;
> > > >
> > > > - vma = intel_pin_and_fence_fb_obj(fb, new_state->rotation);
> > > > + vma = intel_pin_and_fence_fb_obj(fb, new_state->rotation,
> > > > + needs_fence);
> > > > if (!IS_ERR(vma))
> > > > to_intel_plane_state(new_state)->vma = vma;
> > > > else
> > > > @@ -12809,8 +12838,11 @@ intel_cleanup_plane_fb(struct drm_plane *plane,
> > > > /* Should only be called after a successful intel_prepare_plane_fb()! */
> > > > vma = fetch_and_zero(&to_intel_plane_state(old_state)->vma);
> > > > if (vma) {
> > > > + bool needs_fence =
> > > > + intel_plane_needs_fence(to_intel_plane_state(old_state));
> > > > +
> > > > mutex_lock(&plane->dev->struct_mutex);
> > > > - intel_unpin_fb_vma(vma);
> > > > + intel_unpin_fb_vma(vma, needs_fence);
> > > > mutex_unlock(&plane->dev->struct_mutex);
> > > > }
> > > > }
> > > > @@ -13152,7 +13184,8 @@ intel_legacy_cursor_update(struct drm_plane *plane,
> > > > goto out_unlock;
> > > > }
> > > > } else {
> > > > - vma = intel_pin_and_fence_fb_obj(fb, new_plane_state->rotation);
> > > > + vma = intel_pin_and_fence_fb_obj(fb, new_plane_state->rotation,
> > > > + false);
> > > > if (IS_ERR(vma)) {
> > > > DRM_DEBUG_KMS("failed to pin object\n");
> > > >
> > > > @@ -13183,7 +13216,7 @@ intel_legacy_cursor_update(struct drm_plane *plane,
> > > >
> > > > old_vma = fetch_and_zero(&to_intel_plane_state(old_plane_state)->vma);
> > > > if (old_vma)
> > > > - intel_unpin_fb_vma(old_vma);
> > > > + intel_unpin_fb_vma(old_vma, false);
> > > >
> > > > out_unlock:
> > > > mutex_unlock(&dev_priv->drm.struct_mutex);
> > > > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > > > index e9b66e0cb647..c13f15ef342b 100644
> > > > --- a/drivers/gpu/drm/i915/intel_drv.h
> > > > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > > > @@ -905,7 +905,7 @@ struct cxsr_latency {
> > > > #define to_intel_framebuffer(x) container_of(x, struct intel_framebuffer, base)
> > > > #define to_intel_plane(x) container_of(x, struct intel_plane, base)
> > > > #define to_intel_plane_state(x) container_of(x, struct intel_plane_state, base)
> > > > -#define intel_fb_obj(x) (x ? to_intel_framebuffer(x)->obj : NULL)
> > > > +#define intel_fb_obj(x) ((x) ? to_intel_framebuffer(x)->obj : NULL)
> > > >
> > > > struct intel_hdmi {
> > > > i915_reg_t hdmi_reg;
> > > > @@ -1423,8 +1423,9 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
> > > > struct intel_load_detect_pipe *old,
> > > > struct drm_modeset_acquire_ctx *ctx);
> > > > struct i915_vma *
> > > > -intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation);
> > > > -void intel_unpin_fb_vma(struct i915_vma *vma);
> > > > +intel_pin_and_fence_fb_obj(const struct drm_framebuffer *fb,
> > > > + unsigned int rotation, bool needs_fence);
> > > > +void intel_unpin_fb_vma(struct i915_vma *vma, bool needs_fence);
> > > > struct drm_framebuffer *
> > > > intel_framebuffer_create(struct drm_i915_gem_object *obj,
> > > > struct drm_mode_fb_cmd2 *mode_cmd);
> > > > diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> > > > index b8af35187d22..4cbc7fde5e30 100644
> > > > --- a/drivers/gpu/drm/i915/intel_fbdev.c
> > > > +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> > > > @@ -165,6 +165,13 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
> > > > return ret;
> > > > }
> > > >
> > > > +static bool intel_fbdev_needs_fence(struct intel_fbdev *ifbdev)
> > > > +{
> > > > + struct drm_i915_gem_object *obj = ifbdev->fb->obj;
> > > > +
> > > > + return i915_gem_object_get_tiling(obj) != I915_TILING_NONE;
> > > > +}
> > > > +
> > > > static int intelfb_create(struct drm_fb_helper *helper,
> > > > struct drm_fb_helper_surface_size *sizes)
> > > > {
> > > > @@ -180,6 +187,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
> > > > struct i915_vma *vma;
> > > > bool prealloc = false;
> > > > void __iomem *vaddr;
> > > > + bool needs_fence;
> > > > int ret;
> > > >
> > > > if (intel_fb &&
> > > > @@ -212,7 +220,9 @@ static int intelfb_create(struct drm_fb_helper *helper,
> > > > * This also validates that any existing fb inherited from the
> > > > * BIOS is suitable for own access.
> > > > */
> > > > - vma = intel_pin_and_fence_fb_obj(&ifbdev->fb->base, DRM_MODE_ROTATE_0);
> > > > + needs_fence = intel_fbdev_needs_fence(ifbdev);
> > > > + vma = intel_pin_and_fence_fb_obj(&ifbdev->fb->base,
> > > > + DRM_MODE_ROTATE_0, needs_fence);
> > > > if (IS_ERR(vma)) {
> > > > ret = PTR_ERR(vma);
> > > > goto out_unlock;
> > > > @@ -276,7 +286,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
> > > > return 0;
> > > >
> > > > out_unpin:
> > > > - intel_unpin_fb_vma(vma);
> > > > + intel_unpin_fb_vma(vma, needs_fence);
> > > > out_unlock:
> > > > intel_runtime_pm_put(dev_priv);
> > > > mutex_unlock(&dev->struct_mutex);
> > > > @@ -514,7 +524,8 @@ static void intel_fbdev_destroy(struct intel_fbdev *ifbdev)
> > > >
> > > > if (ifbdev->vma) {
> > > > mutex_lock(&ifbdev->helper.dev->struct_mutex);
> > > > - intel_unpin_fb_vma(ifbdev->vma);
> > > > + intel_unpin_fb_vma(ifbdev->vma,
> > > > + intel_fbdev_needs_fence(ifbdev));
> > > > mutex_unlock(&ifbdev->helper.dev->struct_mutex);
> > > > }
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> > > > index 1b397b41cb4f..2b54526fbf3c 100644
> > > > --- a/drivers/gpu/drm/i915/intel_overlay.c
> > > > +++ b/drivers/gpu/drm/i915/intel_overlay.c
> > > > @@ -801,7 +801,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
> > > >
> > > > atomic_inc(&dev_priv->gpu_error.pending_fb_pin);
> > > >
> > > > - vma = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL);
> > > > + vma = i915_gem_object_pin_to_display_plane(new_bo, 0, false, NULL);
> > > > if (IS_ERR(vma)) {
> > > > ret = PTR_ERR(vma);
> > > > goto out_pin_section;
> > > > --
> > > > 2.13.6
> > > >
> > > > _______________________________________________
> > > > Intel-gfx mailing list
> > > > Intel-gfx at lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> > --
> > Ville Syrjälä
> > Intel OTC
--
Ville Syrjälä
Intel OTC
More information about the Intel-gfx
mailing list