Mesa (master): i965: Throttle rendering to an fbo

Wed Mar 18 09:37:44 UTC 2015

Module: Mesa
Branch: master
Commit: 8b9bd19021c0efef33d66ae24f8871b826d66e8a
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=8b9bd19021c0efef33d66ae24f8871b826d66e8a

Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Thu Feb 26 11:25:18 2015 +0000

i965: Throttle rendering to an fbo

When rendering to an fbo, even though it may be acting as a winsys
frontbuffer or just generally, we never throttle. However, when rendering
to an fbo, there is no natural frame boundary. Conventionally we use
SwapBuffers and glFinish, but potential callers avoid often glFinish for
being too heavy handed (waiting on all outstanding rendering to complete).
The kernel provides a soft-throttling option for this case that waits for
rendering older than 20ms to be complete (that's a little too lax to be
used for swapbuffers, but is here a useful safety net). The remaining
choice is then either never to throttle, throttle after every draw call,
or at after intermediate user defined point such as glFlush and thus all the
implied flushes. This patch opts for the latter as that is the current
method used for flushing to front buffers.

v2: Defer the throttling from inside the flush to the next
intel_prepare_render() and switch non-fbo frontbuffer throttling over to
use the same lax method. The issuing being that
glFlush()/intel_prepare_read() is just as likely to be called inside a
tight loop and not at "frame" boundaries.

v3: Rename from need_front_throttle to need_flush_throttle to avoid any
ambiguity between front buffer rendering and fbo rendering. (Chad)

v4: Whitespace

Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
Cc: Kenneth Graunke <kenneth at whitecape.org>
Cc: Ben Widawsky <ben at bwidawsk.net>
Cc: Kristian Høgsberg <krh at bitplanet.net>
Cc: Chad Versace <chad.versace at linux.intel.com>
Cc: Ian Romanick <idr at freedesktop.org>
Reviewed-by: Chad Versace <chad.versace at linux.intel.com>

---

 src/mesa/drivers/dri/i965/brw_context.c  |   16 ++++++++++++----
 src/mesa/drivers/dri/i965/brw_context.h  |   14 +++++++++++++-
 src/mesa/drivers/dri/i965/intel_screen.c |    8 ++++----
 3 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 0881e48..5120372 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -234,8 +234,8 @@ intel_glFlush(struct gl_context *ctx)
 
    intel_batchbuffer_flush(brw);
    intel_flush_front(ctx);
-   if (brw_is_front_buffer_drawing(ctx->DrawBuffer))
-      brw->need_throttle = true;
+
+   brw->need_flush_throttle = true;
 }
 
 static void
@@ -1245,12 +1245,20 @@ intel_prepare_render(struct brw_context *brw)
     * the swap, and getting our hands on that doesn't seem worth it,
     * so we just us the first batch we emitted after the last swap.
     */
-   if (brw->need_throttle && brw->first_post_swapbuffers_batch) {
+   if (brw->need_swap_throttle && brw->first_post_swapbuffers_batch) {
       if (!brw->disable_throttling)
          drm_intel_bo_wait_rendering(brw->first_post_swapbuffers_batch);
       drm_intel_bo_unreference(brw->first_post_swapbuffers_batch);
       brw->first_post_swapbuffers_batch = NULL;
-      brw->need_throttle = false;
+      brw->need_swap_throttle = false;
+      /* Throttling here is more precise than the throttle ioctl, so skip it */
+      brw->need_flush_throttle = false;
+   }
+
+   if (brw->need_flush_throttle) {
+      __DRIscreen *psp = brw->intelScreen->driScrnPriv;
+      drmCommandNone(psp->fd, DRM_I915_GEM_THROTTLE);
+      brw->need_flush_throttle = false;
    }
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 682fbe9..eebd7ce 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1031,7 +1031,19 @@ struct brw_context
 
    /** Framerate throttling: @{ */
    drm_intel_bo *first_post_swapbuffers_batch;
-   bool need_throttle;
+
+   /* Limit the number of outstanding SwapBuffers by waiting for an earlier
+    * frame of rendering to complete. This gives a very precise cap to the
+    * latency between input and output such that rendering never gets more
+    * than a frame behind the user. (With the caveat that we technically are
+    * not using the SwapBuffers itself as a barrier but the first batch
+    * submitted afterwards, which may be immediately prior to the next
+    * SwapBuffers.)
+    */
+   bool need_swap_throttle;
+
+   /** General throttling, not caught by throttling between SwapBuffers */
+   bool need_flush_throttle;
    /** @} */
 
    GLuint stats_wm;
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index cea7ddf..3640b67 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -174,10 +174,10 @@ intel_dri2_flush_with_flags(__DRIcontext *cPriv,
    if (flags & __DRI2_FLUSH_DRAWABLE)
       intel_resolve_for_dri2_flush(brw, dPriv);
 
-   if (reason == __DRI2_THROTTLE_SWAPBUFFER ||
-       reason == __DRI2_THROTTLE_FLUSHFRONT) {
-      brw->need_throttle = true;
-   }
+   if (reason == __DRI2_THROTTLE_SWAPBUFFER)
+      brw->need_swap_throttle = true;
+   if (reason == __DRI2_THROTTLE_FLUSHFRONT)
+      brw->need_flush_throttle = true;
 
    intel_batchbuffer_flush(brw);