[Mesa-dev] [PATCH 3/4] i965: Prevent infinite finish_batch recursion
Ben Widawsky
ben at bwidawsk.net
Fri Feb 27 10:26:17 PST 2015
On Fri, Feb 27, 2015 at 10:22:10AM -0800, Ben Widawsky wrote:
> From the comments in the code:
>
> Called from intel_batchbuffer_flush before emitting MI_BATCHBUFFER_END and
> sending it off.
>
> This fixes a possible, unlikely infinite recursion in our batch flush path. More
> importantly it allows me to add some code here.
>
> The relevant part of the call chain for flush
> intel_batchbuffer_flush()->brw_finish_batch(). The problem arises if anything in
> the time from intel_batchbuffer_flush, until brw_finish_batch ends up calling
> itself. This can happen as a result of a call to
> intel_batchbuffer_begin()->intel_batchbuffer_require_space()->intel_batchbuffer_flush().
>
> There are two possible cases today which can spawn this recursion.
> 1. There is a ring switch occurring (impossible, see below)
> 2. The ring is out of space (fairly unlikely)
>
> brw_finish_batch() currently only has two cases where we end up calling
> intel_batchbuffer_begin().
> 1. Query objects pre hardware contexts:
> Post hardware contexts, this is a NOP. Pre hardware contexts we did not have
> a blitter ring. As mesa does not currently use the BSD ring (and probably
> never will pre-GEN6), the only way to have a problem is a full ring (#2
> above).
>
> 2. brw_perf_monitor_finish_batch:
> Here too it's only used pre GEN6 which means no ring switches are possible,
> and so only a full ring can cause an issue.
>
> Therefore only when we run out of ring space at batchbuffer flush do we have an
> issue.
>
> I've implemented the cheap hack to fix this. It is totally not thread safe. The
> proper way to fix this is with a lock as callers do not expect an early return
> before the flush has occurred. I'm not really certain if we can ever actually
> hit this path with mulththreaded situations, so cheap hack wins.
>
> The same bug and reasoning (in fact it's more likely) may exist in the i915
> driver as well.
>
> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> ---
> src/mesa/drivers/dri/i965/intel_batchbuffer.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> index db2f345..b0ebec7 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> @@ -209,6 +209,19 @@ brw_new_batch(struct brw_context *brw)
> static void
> brw_finish_batch(struct brw_context *brw)
> {
> + /* FIXME: not threadsafe */
> + static int nest_depth = 0;
> + assert (nest_depth >= 0);
> +
> + if (nest_depth++) {
> + /* It's actually possible to get here without multithreading (no ringspace
> + * + ring switch). Oh well. */
> + if (nest_depth >= 2)
> + _mesa_warning(NULL, "Multhreaded batch flushing is out of order\n");
> +
> + return;
> + }
> +
> /* Capture the closing pipeline statistics register values necessary to
> * support query objects (in the non-hardware context world).
> */
> @@ -223,6 +236,8 @@ brw_finish_batch(struct brw_context *brw)
> * next batch.
> */
> brw->cache.bo_used_by_gpu = true;
> +
> + nest_depth--;
>
Whoops, forgot to commit.. this is nest_depth = 0
More information about the mesa-dev
mailing list