[Mesa-dev] [PATCH] r600g: Use a fake reloc to sleep for fences

Michel Dänzer michel at daenzer.net
Wed Feb 1 08:40:53 PST 2012


[ Dropping dri-devel list ]

On Mit, 2012-02-01 at 15:01 +0000, Simon Farnsworth wrote: 
> r300g is able to sleep until a fence completes rather than busywait because
> it creates a special buffer object and relocation that stays busy until the
> CS containing the fence is finished.
> 
> Copy the idea into r600g, and use it to sleep if the user asked for an
> infinite wait, falling back to busywaiting if the user provided a timeout.
> 
> Signed-off-by: Simon Farnsworth <simon.farnsworth at onelan.co.uk>
> ---
> This is a userspace only fix for my problem, as suggested by Mark Olšák. It
> doesn't fix the case with a timeout (which doesn't matter to me), but works
> on existing kernels.
> 
> Given that my previous suggested fix opened a can of worms (mostly because I
> don't fully understand modern Radeon hardware), this is probably enough of a
> fix to apply for now, leaving a proper fix for someone with more
> understanding of the way multiple rings interact.
[...] 
> diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c
> index 8eb8e6d..cc81c48 100644
> --- a/src/gallium/drivers/r600/r600_hw_context.c
> +++ b/src/gallium/drivers/r600/r600_hw_context.c
> @@ -1623,6 +1623,20 @@ void r600_context_emit_fence(struct r600_context *ctx, struct r600_resource *fen
>  	ctx->pm4[ctx->pm4_cdwords++] = 0;                       /* DATA_HI */
>  	ctx->pm4[ctx->pm4_cdwords++] = PKT3(PKT3_NOP, 0, 0);
>  	ctx->pm4[ctx->pm4_cdwords++] = r600_context_bo_reloc(ctx, fence_bo, RADEON_USAGE_WRITE);
> +
> +	if (sleep_bo) {
> +		unsigned reloc_index;
> +		/* Create a dummy BO so that fence_finish without a timeout can sleep waiting for completion */
> +		*sleep_bo = ctx->ws->buffer_create(ctx->ws, 1, 1,
> +						   PIPE_BIND_CUSTOM,
> +						   RADEON_DOMAIN_GTT);
> +		/* Add the fence as a dummy relocation. */
> +		reloc_index = ctx->ws->cs_add_reloc(ctx->cs,
> +						    ctx->ws->buffer_get_cs_handle(*sleep_bo),
> +						    RADEON_USAGE_READWRITE, RADEON_DOMAIN_GTT);
> +		if (reloc_index >= ctx->creloc)
> +			ctx->creloc = reloc_index+1;
> +	}

Is there a point in making sleep_bo optional?


> diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c
> index c38fbc5..71e31b1 100644
> --- a/src/gallium/drivers/r600/r600_pipe.c
> +++ b/src/gallium/drivers/r600/r600_pipe.c
> @@ -605,6 +605,14 @@ static boolean r600_fence_finish(struct pipe_screen *pscreen,
>  	}
>  
>  	while (rscreen->fences.data[rfence->index] == 0) {
> +		/* Special-case infinite timeout */
> +		if (timeout == PIPE_TIMEOUT_INFINITE &&
> +		    rfence->sleep_bo) {
> +			rscreen->ws->buffer_wait(rfence->sleep_bo, RADEON_USAGE_READWRITE);
> +			pb_reference(&rfence->sleep_bo, NULL);
> +			continue;
> +		}

I think rfence->sleep_bo should only be unreferenced in
r600_fence_reference() when the fence is recycled, otherwise it'll be
leaked if r600_fence_finish() is never called for some reason.

If r600_fence_finish() only ever called os_time_sleep(), never
sched_yield() (like r300_fence_finish()), would that avoid your problem
even with a finite timeout?


-- 
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer


More information about the dri-devel mailing list