[Intel-gfx] [PATCH 53/53] !UPSTREAM: drm/i915: Use MMIO flips

oscar.mateo at intel.com oscar.mateo at intel.com
Fri Jun 13 17:38:11 CEST 2014


From: Sourab Gupta <sourab.gupta at intel.com>

If we want flips to work, either we create an Execlists-aware version
of intel_gen7_queue_flip, or we don't place commands directly in the
ringbuffer.

When upstreamed, this patch should implement the second option:

    drm/i915: Replaced Blitter ring based flips with MMIO flips

    This patch enables the framework for using MMIO based flip calls,
    in contrast with the CS based flip calls which are being used currently.

    MMIO based flip calls can be enabled on architectures where
    Render and Blitter engines reside in different power wells. The
    decision to use MMIO flips can be made based on workloads to give
    100% residency for Media power well.

    v2: The MMIO flips now use the interrupt driven mechanism for issuing the
    flips when target seqno is reached. (Incorporating Ville's idea)

    v3: Rebasing on latest code. Code restructuring after incorporating
    Damien's comments

    v4: Addressing Ville's review comments
        -general cleanup
        -updating only base addr instead of calling update_primary_plane
        -extending patch for gen5+ platforms

    v5: Addressed Ville's review comments
        -Making mmio flip vs cs flip selection based on module parameter
        -Adding check for DRIVER_MODESET feature in notify_ring before calling
         notify mmio flip.
        -Other changes mostly in function arguments

    v6: -Having a separate function to check condition for using mmio flips (Ville)
        -propagating error code from i915_gem_check_olr (Ville)

    v7: -Adding __must_check with i915_gem_check_olr (Chris)
        -Renaming mmio_flip_data to mmio_flip (Chris)
        -Rebasing on latest nightly

    v8: -Rebasing on latest code
        -squash 3rd patch in series (mmio setbase vs page flip race) with this patch
        -Added new tiling mode update in intel_do_mmio_flip (Chris)

    v9: -check for obj->last_write_seqno being 0 instead of obj->ring being NULL in
    intel_postpone_flip, as this is a more restrictive condition (Chris)

    v10: -Applied Chris's suggestions for squashing patches 2,3 into this patch.
    These patches select CS vs MMIO flips at page flip time and make the
    module parameter for using mmio flips a tristate, the states being
    'force CS flips', 'force mmio flips', 'driver discretion'.
    Changed the logic for driver discretion (Chris)

    v11: Minor code cleanup (better readability, fixing whitespace errors, using
    lockdep to check mutex locked status in postpone_flip, removal of __must_check
    in function definition) (Chris)

    Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
    Signed-off-by: Sourab Gupta <sourab.gupta at intel.com>

Signed-off-by: Oscar Mateo <oscar.mateo at intel.com>
---
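Note on the tristate parameter: the selection described in v10 can be
sketched as a standalone function. This is only an illustration that
mirrors use_mmio_flip() from the diff below; struct flip_query and
sketch_use_mmio_flip() are hypothetical names, and plain integers stand
in for the driver's ring and device structures.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state that use_mmio_flip() consults. */
struct flip_query {
	int gen;          /* hardware generation (INTEL_INFO(dev)->gen in the patch) */
	int param;        /* use_mmio_flip module parameter: <0, 0 or >0 */
	int flip_ring_id; /* ring a CS flip would be queued on */
	int obj_ring_id;  /* ring that last used the framebuffer object */
};

/* Returns true when the flip should be done through MMIO. */
static bool sketch_use_mmio_flip(const struct flip_query *q)
{
	/* Platforms before gen5 always take the CS path. */
	if (q->gen < 5)
		return false;

	if (q->param < 0)	/* force CS flips */
		return false;
	if (q->param > 0)	/* force MMIO flips */
		return true;

	/* Driver discretion: use MMIO only when the rings differ. */
	return q->flip_ring_id != q->obj_ring_id;
}
```

With param set to 0, a flip whose object was last used on another ring
goes through MMIO, and one on the same ring stays a CS flip.
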
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   8 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   5 ++
 drivers/gpu/drm/i915/intel_display.c | 148 ++++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 171 insertions(+), 2 deletions(-)
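
Note on the deferred flip: the interrupt-driven mechanism from v2 can be
sketched in userspace as follows. This is a simplified model under stated
assumptions, not the driver code: struct pending_flip, seqno_passed() and
notify_flips() are hypothetical names, locking and irq_get/irq_put
reference counting are left out, and a counter stands in for the register
writes done by intel_do_mmio_flip().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-safe "has seq1 reached seq2?" check, the same idea as
 * i915_seqno_passed(). The cast to int32_t assumes the usual
 * two's-complement conversion.
 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical per-CRTC record; seqno == 0 means no flip is pending. */
struct pending_flip {
	uint32_t seqno;      /* seqno the flip has to wait for */
	uint32_t ring_id;    /* ring that will signal that seqno */
	unsigned flips_done; /* stands in for the MMIO register writes */
};

/*
 * Called on each ring interrupt: issue every pending flip on this ring
 * whose target seqno has been reached. Returns the number issued.
 */
static int notify_flips(struct pending_flip *flips, int n,
			uint32_t ring_id, uint32_t current_seqno)
{
	int issued = 0;

	for (int i = 0; i < n; i++) {
		struct pending_flip *f = &flips[i];

		if (f->seqno == 0 || f->ring_id != ring_id)
			continue;

		if (seqno_passed(current_seqno, f->seqno)) {
			f->flips_done++;	/* the flip itself */
			f->seqno = 0;		/* mark as no longer pending */
			issued++;
		}
	}
	return issued;
}
```

Calling notify_flips() once more right after recording a pending flip, as
intel_queue_mmio_flip() does with intel_notify_mmio_flip(), covers the
case where the interrupt fired before the record was written.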

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 93c0e1a..681d736 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1607,6 +1607,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b62b342..f519b6c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1377,6 +1377,9 @@ struct drm_i915_private {
 	/* protects the irq masks */
 	spinlock_t irq_lock;
 
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
+
 	bool display_irqs_enabled;
 
 	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
@@ -2064,6 +2067,7 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
+	int use_mmio_flip;
 };
 extern struct i915_params i915 __read_mostly;
 
@@ -2274,6 +2278,8 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int __must_check i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno);
+
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
@@ -2649,6 +2655,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct intel_engine_cs *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 86bfb8a..093af37 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1095,7 +1095,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+int
 i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b0fa1ed..824d956 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1218,6 +1218,9 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	if (drm_core_check_feature(dev, DRIVER_MODESET))
+		intel_notify_mmio_flip(ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index b7455f8..6bca4b2 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -49,6 +49,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
+	.use_mmio_flip = 1,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -162,3 +163,7 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
 module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
+
+module_param_named(use_mmio_flip, i915.use_mmio_flip, int, 0600);
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (-1=never, 0=driver "
+	"discretion, 1=always [default])");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index b5cbb28..43fd4e7 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9255,6 +9255,147 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	return 0;
 }
 
+static bool use_mmio_flip(struct intel_engine_cs *ring,
+			  struct drm_i915_gem_object *obj)
+{
+	/*
+	 * This is not being used for older platforms, because
+	 * non-availability of flip done interrupt forces us to use
+	 * CS flips. Older platforms derive flip done using some clever
+	 * tricks involving the flip_pending status bits and vblank irqs.
+	 * So using MMIO flips there would disrupt this mechanism.
+	 */
+
+	if (INTEL_INFO(ring->dev)->gen < 5)
+		return false;
+
+	if (i915.use_mmio_flip < 0)
+		return false;
+	else if (i915.use_mmio_flip > 0)
+		return true;
+	else
+		return ring != obj->ring;
+}
+
+static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
+{
+	struct drm_device *dev = intel_crtc->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_framebuffer *intel_fb =
+		to_intel_framebuffer(intel_crtc->base.primary->fb);
+	struct drm_i915_gem_object *obj = intel_fb->obj;
+	u32 dspcntr;
+	u32 reg;
+
+	intel_mark_page_flip_active(intel_crtc);
+
+	reg = DSPCNTR(intel_crtc->plane);
+	dspcntr = I915_READ(reg);
+
+	if (INTEL_INFO(dev)->gen >= 4) {
+		if (obj->tiling_mode != I915_TILING_NONE)
+			dspcntr |= DISPPLANE_TILED;
+		else
+			dspcntr &= ~DISPPLANE_TILED;
+	}
+	I915_WRITE(reg, dspcntr);
+
+	I915_WRITE(DSPSURF(intel_crtc->plane),
+			intel_crtc->unpin_work->gtt_offset);
+	POSTING_READ(DSPSURF(intel_crtc->plane));
+}
+
+static int intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	struct intel_engine_cs *ring;
+	int ret;
+
+	lockdep_assert_held(&obj->base.dev->struct_mutex);
+
+	if (!obj->last_write_seqno)
+		return 0;
+
+	ring = obj->ring;
+
+	if (i915_seqno_passed(ring->get_seqno(ring, true),
+				obj->last_write_seqno))
+		return 0;
+
+	ret = i915_gem_check_olr(ring, obj->last_write_seqno);
+	if (ret)
+		return ret;
+
+	if (WARN_ON(!ring->irq_get(ring)))
+		return 0;
+
+	return 1;
+}
+
+void intel_notify_mmio_flip(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+	struct intel_crtc *intel_crtc;
+	unsigned long irq_flags;
+	u32 seqno;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_intel_crtc(ring->dev, intel_crtc) {
+		struct intel_mmio_flip *mmio_flip;
+
+		mmio_flip = &intel_crtc->mmio_flip;
+		if (mmio_flip->seqno == 0)
+			continue;
+
+		if (ring->id != mmio_flip->ring_id)
+			continue;
+
+		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
+			intel_do_mmio_flip(intel_crtc);
+			mmio_flip->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+static int intel_queue_mmio_flip(struct drm_device *dev,
+		struct drm_crtc *crtc,
+		struct drm_framebuffer *fb,
+		struct drm_i915_gem_object *obj,
+		struct intel_engine_cs *ring,
+		uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	if (WARN_ON(intel_crtc->mmio_flip.seqno))
+		return -EBUSY;
+
+	ret = intel_postpone_flip(obj);
+	if (ret < 0)
+		return ret;
+	if (ret == 0) {
+		intel_do_mmio_flip(intel_crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/*
+	 * Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(obj->ring);
+	return 0;
+}
+
 static int intel_default_queue_flip(struct drm_device *dev,
 				    struct drm_crtc *crtc,
 				    struct drm_framebuffer *fb,
@@ -9362,7 +9503,12 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	work->gtt_offset =
 		i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
 
-	ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring, page_flip_flags);
+	if (use_mmio_flip(ring, obj))
+		ret = intel_queue_mmio_flip(dev, crtc, fb, obj, ring,
+				page_flip_flags);
+	else
+		ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring,
+				page_flip_flags);
 	if (ret)
 		goto cleanup_unpin;
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 78d4124..b38e88d 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -358,6 +358,11 @@ struct intel_pipe_wm {
 	bool sprites_scaled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -412,6 +417,7 @@ struct intel_crtc {
 	wait_queue_head_t vbl_wait;
 
 	int scanline_offset;
+	struct intel_mmio_flip mmio_flip;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.9.0



