[Mesa-dev] [PATCH] r600g: Use a fake reloc to sleep for fences
Michel Dänzer
michel at daenzer.net
Wed Feb 1 08:40:53 PST 2012
[ Dropping dri-devel list ]
On Mit, 2012-02-01 at 15:01 +0000, Simon Farnsworth wrote:
> r300g is able to sleep until a fence completes rather than busywait because
> it creates a special buffer object and relocation that stays busy until the
> CS containing the fence is finished.
>
> Copy the idea into r600g, and use it to sleep if the user asked for an
> infinite wait, falling back to busywaiting if the user provided a timeout.
>
> Signed-off-by: Simon Farnsworth <simon.farnsworth at onelan.co.uk>
> ---
> This is a userspace only fix for my problem, as suggested by Mark Olšák. It
> doesn't fix the case with a timeout (which doesn't matter to me), but works
> on existing kernels.
>
> Given that my previous suggested fix opened a can of worms (mostly because I
> don't fully understand modern Radeon hardware), this is probably enough of a
> fix to apply for now, leaving a proper fix for someone with more
> understanding of the way multiple rings interact.
[...]
> diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c
> index 8eb8e6d..cc81c48 100644
> --- a/src/gallium/drivers/r600/r600_hw_context.c
> +++ b/src/gallium/drivers/r600/r600_hw_context.c
> @@ -1623,6 +1623,20 @@ void r600_context_emit_fence(struct r600_context *ctx, struct r600_resource *fen
> ctx->pm4[ctx->pm4_cdwords++] = 0; /* DATA_HI */
> ctx->pm4[ctx->pm4_cdwords++] = PKT3(PKT3_NOP, 0, 0);
> ctx->pm4[ctx->pm4_cdwords++] = r600_context_bo_reloc(ctx, fence_bo, RADEON_USAGE_WRITE);
> +
> + if (sleep_bo) {
> + unsigned reloc_index;
> + /* Create a dummy BO so that fence_finish without a timeout can sleep waiting for completion */
> + *sleep_bo = ctx->ws->buffer_create(ctx->ws, 1, 1,
> + PIPE_BIND_CUSTOM,
> + RADEON_DOMAIN_GTT);
> + /* Add the fence as a dummy relocation. */
> + reloc_index = ctx->ws->cs_add_reloc(ctx->cs,
> + ctx->ws->buffer_get_cs_handle(*sleep_bo),
> + RADEON_USAGE_READWRITE, RADEON_DOMAIN_GTT);
> + if (reloc_index >= ctx->creloc)
> + ctx->creloc = reloc_index+1;
> + }
Is there a point in making sleep_bo optional?
> diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c
> index c38fbc5..71e31b1 100644
> --- a/src/gallium/drivers/r600/r600_pipe.c
> +++ b/src/gallium/drivers/r600/r600_pipe.c
> @@ -605,6 +605,14 @@ static boolean r600_fence_finish(struct pipe_screen *pscreen,
> }
>
> while (rscreen->fences.data[rfence->index] == 0) {
> + /* Special-case infinite timeout */
> + if (timeout == PIPE_TIMEOUT_INFINITE &&
> + rfence->sleep_bo) {
> + rscreen->ws->buffer_wait(rfence->sleep_bo, RADEON_USAGE_READWRITE);
> + pb_reference(&rfence->sleep_bo, NULL);
> + continue;
> + }
I think rfence->sleep_bo should only be unreferenced in
r600_fence_reference() when the fence is recycled, otherwise it'll be
leaked if r600_fence_finish() is never called for some reason.
If r600_fence_finish() only ever called os_time_sleep(), never
sched_yield() (like r300_fence_finish()), would that avoid your problem
even with a finite timeout?
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
More information about the dri-devel
mailing list