[Mesa-dev] [PATCH] nv50/ir: fix texture barriers insertion with combined load/store

Ilia Mirkin imirkin at alum.mit.edu
Sat Apr 30 17:37:35 UTC 2016


On Sat, Apr 30, 2016 at 12:35 PM, Samuel Pitoiset
<samuel.pitoiset at gmail.com> wrote:
> After the MemoryOpt pass, load and store instructions might have been
> combined and we have to take this in account when detecting if the
> destination register of a texture instruction is used by another
> instruction as srcs/defs.
>
> This fixes 11 dEQP-GLES31 subtests:
> dEQP-GLES3.functional.shaders.opaque_type_indexing.sampler.*.vertex.*
>
> There are still some samplers shadow fails with compute shaders but
> this is most likely an other issue because the barriers seem to be
> correctly added with this fix.
>
> This might also fix some rendering issues in real games.
>
> Fixes: 7752bbc ("gk104/ir: simplify and fool-proof texbar algorithm")
> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
> Cc: "11.1 11.2" <mesa-stable at lists.freedesktop.org>
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> index 3bce962..15864c8 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> @@ -226,8 +226,11 @@ NVC0LegalizePostRA::findFirstUsesBB(
>           continue;
>
>        for (int d = 0; insn->defExists(d); ++d) {
> +         int numGPR = insn->def(d).rep()->reg.size / 4;
> +
>           if (insn->def(d).getFile() != FILE_GPR ||
> -             insn->def(d).rep()->reg.data.id < minGPR ||
> +             (insn->def(d).rep()->reg.data.id < minGPR &&
> +              insn->def(d).rep()->reg.data.id + numGPR - 1 < minGPR) ||

Let x = insn->def(d).rep()->reg.data.id. x is guaranteed to be
non-negative, as is numGPR. How could the first condition matter
there?

>               insn->def(d).rep()->reg.data.id > maxGPR)
>              continue;
>           addTexUse(uses, insn, texi);
> @@ -235,8 +238,12 @@ NVC0LegalizePostRA::findFirstUsesBB(
>        }
>
>        for (int s = 0; insn->srcExists(s); ++s) {
> +         int numGPR = insn->src(s).rep()->reg.size / 4;
> +
>           if (insn->src(s).getFile() != FILE_GPR ||
> -             insn->src(s).rep()->reg.data.id < minGPR ||
> +             (insn->src(s).rep()->reg.data.id < minGPR &&
> +              insn->src(s).rep()->reg.data.id +
> +              insn->src(s).rep()->reg.data.id + numGPR - 1 < minGPR) ||

That seems quite clearly wrong... you're doubling the reg id?

>               insn->src(s).rep()->reg.data.id > maxGPR)
>              continue;
>           addTexUse(uses, insn, texi);

What you're trying to do makes sense. However it appears that you got
the details a little wrong. Basically my original code was just
looking at the "base" register. But you're correctly adding a second
set of conditions for the "top" register. So instead of

if (x >= min || x <= max) { do stuff }

you need to do

if (x + n >= min || x <= max) { do stuff }

Note that the actual code is inverted - it has a condition to skip.
But you get the idea.

Nice find!

  -ilia


More information about the mesa-dev mailing list