[Mesa-dev] [PATCH] r600: fix relocs for PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE query

Wed Jan 17 12:54:16 UTC 2018

On 11.01.2018 23:54, sroland at vmware.com wrote:
> From: Roland Scheidegger <sroland at vmware.com>
> 
> The command parser is very sad if we don't emit the relocs per hw query...
> 
> However, don't enable it. It mostly works, but piglit
> arb_transform_feedback_overflow_query-basic shows 2 failures (it's really the
> same case for the hw), conditional_render_any and conditional_render_single.
> By some experimentation, it looks like the firmware combines the values wrongly
> for the non-inverted (i.e. hw-inverted) case - it will only not draw if all
> 4 streams overflow, rather than just at least one.
> Interestingly, radeonsi has a workaround for some VI firmware which looks like
> it was the exact same firmware bug. Hence, looks like it would need new
> firmware to properly fix this.
> (Tested on Juniper, not sure if firmware for all chips is broken.)

Yeah, that firmware thing was a very sad story. Apparently, people got 
confused about what the correct behavior is, and so the firmware went 
back and forth a couple of times.

I don't know about pre-GCN chips, but IIRC all SI + CI firmware is 
correct. VI and gfx9 firmware was broken in earlier versions; I don't 
know if the corrected VI firmware will ever be published though. 
Hopefully they won't break it again in the future :)

I think radeonsi is still our only driver that gets all cases completely 
right at full hardware speed as long as the firmware is correct, because 
people just can't be bothered.

Cheers,
Nicolai

> ---
>   src/gallium/drivers/r600/r600_query.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/drivers/r600/r600_query.c b/src/gallium/drivers/r600/r600_query.c
> index b4519830cc..5ff0570308 100644
> --- a/src/gallium/drivers/r600/r600_query.c
> +++ b/src/gallium/drivers/r600/r600_query.c
> @@ -742,9 +742,12 @@ static void r600_query_hw_do_emit_start(struct r600_common_context *ctx,
>   		emit_sample_streamout(cs, va, query->stream);
>   		break;
>   	case PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE:
> -		for (unsigned stream = 0; stream < R600_MAX_STREAMS; ++stream)
> +		for (unsigned stream = 0; stream < R600_MAX_STREAMS; ++stream) {
>   			emit_sample_streamout(cs, va + 32 * stream, stream);
> -		break;
> +			r600_emit_reloc(ctx, &ctx->gfx, query->buffer.buf,
> +					RADEON_USAGE_WRITE, RADEON_PRIO_QUERY);
> +		}
> +		return;
>   	case PIPE_QUERY_TIME_ELAPSED:
>   		/* Write the timestamp after the last draw is done.
>   		 * (bottom-of-pipe)
> @@ -827,9 +830,12 @@ static void r600_query_hw_do_emit_stop(struct r600_common_context *ctx,
>   		break;
>   	case PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE:
>   		va += 16;
> -		for (unsigned stream = 0; stream < R600_MAX_STREAMS; ++stream)
> +		for (unsigned stream = 0; stream < R600_MAX_STREAMS; ++stream) {
>   			emit_sample_streamout(cs, va + 32 * stream, stream);
> -		break;
> +			r600_emit_reloc(ctx, &ctx->gfx, query->buffer.buf,
> +					RADEON_USAGE_WRITE, RADEON_PRIO_QUERY);
> +		}
> +		return;
>   	case PIPE_QUERY_TIME_ELAPSED:
>   		va += 8;
>   		/* fall through */
> 

-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.