[Mesa-dev] [PATCH 8/8] i965: Avoid flushing the batch for every blorp op.
Paul Berry
stereotype441 at gmail.com
Thu Aug 29 08:47:06 PDT 2013
On 28 August 2013 15:31, Eric Anholt <eric at anholt.net> wrote:
> Paul Berry <stereotype441 at gmail.com> writes:
>
> > On 27 August 2013 15:21, Eric Anholt <eric at anholt.net> wrote:
> >
> >> This brings over the batch-wrap-prevention and aperture space checking
> >> code from the normal brw_draw.c path, so that we don't need to flush the
> >> batch every time.
> >>
> >> There's a risk here if the intel_emit_post_sync_nonzero_flush() call
> isn't
> >> high enough up in the state emit sequences -- before, we implicitly had
> >> one at the batch flush before any state was emitted, so Mesa's
> workaround
> >> emits didn't really matter.
> >>
> >> Improves cairo-gl performance by 13.7733% +/- 1.74876% (n=30/32)
> >> Improves minecraft apitrace performance by 1.03183% +/- 0.482297%
> (n=90).
> >> Reduces low-resolution GLB 2.7 performance by 1.17553% +/- 0.432263%
> (n=88)
> >> Reduces Lightsmark performance by 3.70246% +/- 0.322432% (n=126)
> >> No statistically significant performance difference on unigine tropics
> >> (n=10)
> >> No statistically significant performance difference on openarena (n=755)
> >>
> >> The two apps that are hurt happen to include stalls on busy buffer
> >> objects, so I think this is an effect of missing out on an opportune
> >> flush.
> >> ---
> >> src/mesa/drivers/dri/i965/brw_blorp.cpp | 50
> >> ++++++++++++++++++++++++++++++++
> >> src/mesa/drivers/dri/i965/brw_blorp.h | 4 ---
> >> src/mesa/drivers/dri/i965/gen6_blorp.cpp | 12 --------
> >> src/mesa/drivers/dri/i965/gen7_blorp.cpp | 1 -
> >> 4 files changed, 50 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> index 1576ff2..c566d1d 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> @@ -21,6 +21,7 @@
> >> * IN THE SOFTWARE.
> >> */
> >>
> >> +#include <errno.h>
> >> #include "intel_batchbuffer.h"
> >> #include "intel_fbo.h"
> >>
> >> @@ -191,6 +192,26 @@ intel_hiz_exec(struct brw_context *brw, struct
> >> intel_mipmap_tree *mt,
> >> void
> >> brw_blorp_exec(struct brw_context *brw, const brw_blorp_params *params)
> >> {
> >> + struct gl_context *ctx = &brw->ctx;
> >> + uint32_t estimated_max_batch_usage = 1500;
> >> + bool check_aperture_failed_once = false;
> >> +
> >> + /* Flush the sampler and render caches. We definitely need to flush
> >> the
> >> + * sampler cache so that we get updated contents from the render
> cache
> >> for
> >> + * the glBlitFramebuffer() source. Also, we are sometimes warned in
> >> the
> >> + * docs to flush the cache between reinterpretations of the same
> >> surface
> >> + * data with different formats, which blorp does for stencil and
> depth
> >> + * data.
> >> + */
> >> + intel_batchbuffer_emit_mi_flush(brw);
> >> +
> >> +retry:
> >> + intel_batchbuffer_require_space(brw, estimated_max_batch_usage,
> false);
> >> + intel_batchbuffer_save_state(brw);
> >> + drm_intel_bo *saved_bo = brw->batch.bo;
> >> + uint32_t saved_used = brw->batch.used;
> >> + uint32_t saved_state_batch_offset = brw->batch.state_batch_offset;
> >> +
> >> switch (brw->gen) {
> >> case 6:
> >> gen6_blorp_exec(brw, params);
> >> @@ -204,6 +225,35 @@ brw_blorp_exec(struct brw_context *brw, const
> >> brw_blorp_params *params)
> >> break;
> >> }
> >>
> >>
> > Would it be feasible to add an assertion here to verify that the amount
> of
> > batch space actually used by this blorp call is less than or equal to
> > estimated_max_batch_usage? That would give me a lot of increased
> > confidence that the magic number 1500 is correct.
> >
> > With the added assertion, the series is:
> >
> > Reviewed-by: Paul Berry <stereotype441 at gmail.com>
>
> That's this code:
>
> + /* Make sure we didn't wrap the batch unintentionally, and make sure we
> + * reserved enough space that a wrap will never happen.
> + */
> + assert(brw->batch.bo == saved_bo);
> + assert((brw->batch.used - saved_used) * 4 +
> + (saved_state_batch_offset - brw->batch.state_batch_offset) <
> + estimated_max_batch_usage);
>
You're right. I misunderstood and thought these assertions were merely
checking that we didn't run out of room. Patch is:
Reviewed-by: Paul Berry <stereotype441 at gmail.com>
as is.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20130829/aa1f2227/attachment.html>
More information about the mesa-dev
mailing list