[Mesa-stable] [Mesa-dev] [PATCH] nv50/ir: fix texture barriers insertion with combined load/store
Ilia Mirkin
imirkin at alum.mit.edu
Sat Apr 30 17:37:35 UTC 2016
On Sat, Apr 30, 2016 at 12:35 PM, Samuel Pitoiset
<samuel.pitoiset at gmail.com> wrote:
> After the MemoryOpt pass, load and store instructions might have been
> combined and we have to take this in account when detecting if the
> destination register of a texture instruction is used by another
> instruction as srcs/defs.
>
> This fixes 11 dEQP-GLES31 subtests:
> dEQP-GLES3.functional.shaders.opaque_type_indexing.sampler.*.vertex.*
>
> There are still some samplers shadow fails with compute shaders but
> this is most likely an other issue because the barriers seem to be
> correctly added with this fix.
>
> This might also fix some rendering issues in real games.
>
> Fixes: 7752bbc ("gk104/ir: simplify and fool-proof texbar algorithm")
> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
> Cc: "11.1 11.2" <mesa-stable at lists.freedesktop.org>
> ---
> src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> index 3bce962..15864c8 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> @@ -226,8 +226,11 @@ NVC0LegalizePostRA::findFirstUsesBB(
> continue;
>
> for (int d = 0; insn->defExists(d); ++d) {
> + int numGPR = insn->def(d).rep()->reg.size / 4;
> +
> if (insn->def(d).getFile() != FILE_GPR ||
> - insn->def(d).rep()->reg.data.id < minGPR ||
> + (insn->def(d).rep()->reg.data.id < minGPR &&
> + insn->def(d).rep()->reg.data.id + numGPR - 1 < minGPR) ||
Let x = insn->def(d).rep()->reg.data.id. x is guaranteed to be
non-negative, as is numGPR. How could the first condition matter
there?
> insn->def(d).rep()->reg.data.id > maxGPR)
> continue;
> addTexUse(uses, insn, texi);
> @@ -235,8 +238,12 @@ NVC0LegalizePostRA::findFirstUsesBB(
> }
>
> for (int s = 0; insn->srcExists(s); ++s) {
> + int numGPR = insn->src(s).rep()->reg.size / 4;
> +
> if (insn->src(s).getFile() != FILE_GPR ||
> - insn->src(s).rep()->reg.data.id < minGPR ||
> + (insn->src(s).rep()->reg.data.id < minGPR &&
> + insn->src(s).rep()->reg.data.id +
> + insn->src(s).rep()->reg.data.id + numGPR - 1 < minGPR) ||
That seems quite clearly wrong... you're doubling the reg id?
> insn->src(s).rep()->reg.data.id > maxGPR)
> continue;
> addTexUse(uses, insn, texi);
What you're trying to do makes sense. However it appears that you got
the details a little wrong. Basically my original code was just
looking at the "base" register. But you're correctly adding a second
set of conditions for the "top" register. So instead of
if (x >= min || x <= max) { do stuff }
you need to do
if (x + n >= min || x <= max) { do stuff }
Note that the actual code is inverted - it has a condition to skip.
But you get the idea.
Nice find!
-ilia
More information about the mesa-stable
mailing list