[Mesa-dev] [PATCH 8/8] i965: Avoid flushing the batch for every blorp op.

Thu Aug 29 08:47:06 PDT 2013

On 28 August 2013 15:31, Eric Anholt <eric at anholt.net> wrote:

> Paul Berry <stereotype441 at gmail.com> writes:
>
> > On 27 August 2013 15:21, Eric Anholt <eric at anholt.net> wrote:
> >
> >> This brings over the batch-wrap-prevention and aperture space checking
> >> code from the normal brw_draw.c path, so that we don't need to flush the
> >> batch every time.
> >>
> >> There's a risk here if the intel_emit_post_sync_nonzero_flush() call
> isn't
> >> high enough up in the state emit sequences -- before, we implicitly had
> >> one at the batch flush before any state was emitted, so Mesa's
> workaround
> >> emits didn't really matter.
> >>
> >> Improves cairo-gl performance by 13.7733% +/- 1.74876% (n=30/32)
> >> Improves minecraft apitrace performance by 1.03183% +/- 0.482297%
> (n=90).
> >> Reduces low-resolution GLB 2.7 performance by 1.17553% +/- 0.432263%
> (n=88)
> >> Reduces Lightsmark performance by 3.70246% +/- 0.322432% (n=126)
> >> No statistically significant performance difference on unigine tropics
> >> (n=10)
> >> No statistically significant performance difference on openarena (n=755)
> >>
> >> The two apps that are hurt happen to include stalls on busy buffer
> >> objects, so I think this is an effect of missing out on an opportune
> >> flush.
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_blorp.cpp  | 50
> >> ++++++++++++++++++++++++++++++++
> >>  src/mesa/drivers/dri/i965/brw_blorp.h    |  4 ---
> >>  src/mesa/drivers/dri/i965/gen6_blorp.cpp | 12 --------
> >>  src/mesa/drivers/dri/i965/gen7_blorp.cpp |  1 -
> >>  4 files changed, 50 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> index 1576ff2..c566d1d 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> @@ -21,6 +21,7 @@
> >>   * IN THE SOFTWARE.
> >>   */
> >>
> >> +#include <errno.h>
> >>  #include "intel_batchbuffer.h"
> >>  #include "intel_fbo.h"
> >>
> >> @@ -191,6 +192,26 @@ intel_hiz_exec(struct brw_context *brw, struct
> >> intel_mipmap_tree *mt,
> >>  void
> >>  brw_blorp_exec(struct brw_context *brw, const brw_blorp_params *params)
> >>  {
> >> +   struct gl_context *ctx = &brw->ctx;
> >> +   uint32_t estimated_max_batch_usage = 1500;
> >> +   bool check_aperture_failed_once = false;
> >> +
> >> +   /* Flush the sampler and render caches.  We definitely need to flush
> >> the
> >> +    * sampler cache so that we get updated contents from the render
> cache
> >> for
> >> +    * the glBlitFramebuffer() source.  Also, we are sometimes warned in
> >> the
> >> +    * docs to flush the cache between reinterpretations of the same
> >> surface
> >> +    * data with different formats, which blorp does for stencil and
> depth
> >> +    * data.
> >> +    */
> >> +   intel_batchbuffer_emit_mi_flush(brw);
> >> +
> >> +retry:
> >> +   intel_batchbuffer_require_space(brw, estimated_max_batch_usage,
> false);
> >> +   intel_batchbuffer_save_state(brw);
> >> +   drm_intel_bo *saved_bo = brw->batch.bo;
> >> +   uint32_t saved_used = brw->batch.used;
> >> +   uint32_t saved_state_batch_offset = brw->batch.state_batch_offset;
> >> +
> >>     switch (brw->gen) {
> >>     case 6:
> >>        gen6_blorp_exec(brw, params);
> >> @@ -204,6 +225,35 @@ brw_blorp_exec(struct brw_context *brw, const
> >> brw_blorp_params *params)
> >>        break;
> >>     }
> >>
> >>
> > Would it be feasible to add an assertion here to verify that the amount
> of
> > batch space actually used by this blorp call is less than or equal to
> > estimated_max_batch_usage?  That would give me a lot of increased
> > confidence that the magic number 1500 is correct.
> >
> > With the added assertion, the series is:
> >
> > Reviewed-by: Paul Berry <stereotype441 at gmail.com>
>
> That's this code:
>
> +   /* Make sure we didn't wrap the batch unintentionally, and make sure we
> +    * reserved enough space that a wrap will never happen.
> +    */
> +   assert(brw->batch.bo == saved_bo);
> +   assert((brw->batch.used - saved_used) * 4 +
> +          (saved_state_batch_offset - brw->batch.state_batch_offset) <
> +          estimated_max_batch_usage);
>

You're right.  I misunderstood and thought these assertions were merely
checking that we didn't run out of room.  Patch is:

Reviewed-by: Paul Berry <stereotype441 at gmail.com>

as is.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20130829/aa1f2227/attachment.html>