[Mesa-dev] [PATCH 2/4] gallium: redefine ATOMINC_WRAP to be more hardware-friendly
Marek Olšák
maraeo at gmail.com
Wed Aug 7 03:18:57 UTC 2019
For the first two:
Reviewed-by: Marek Olšák <marek.olsak at amd.com>
Marek
On Tue, Aug 6, 2019 at 11:06 PM Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> Both AMD and NVIDIA hardware define it this way. Instead of replicating
> the logic everywhere, just fix it up in one place.
>
> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
> ---
>
> I'm open to also not doing this. Just seems like it'll make our
> collective lives a little easier, since this is what the hw wants (and
> presumably other APIs ... PTX definitely defines it this way).
>
> If there's push-back, I can also duplicate the logic in codegen.
>
> src/gallium/docs/source/tgsi.rst | 2 +-
> src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c | 12 ------------
> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 11 ++++++++++-
> 3 files changed, 11 insertions(+), 14 deletions(-)
>
> diff --git a/src/gallium/docs/source/tgsi.rst
> b/src/gallium/docs/source/tgsi.rst
> index 17ad097e85e..e72b047dbd5 100644
> --- a/src/gallium/docs/source/tgsi.rst
> +++ b/src/gallium/docs/source/tgsi.rst
> @@ -2846,7 +2846,7 @@ These atomic operations may only be used with 32-bit
> integer image formats.
>
> dst_x = resource[offset] + 1
>
> - resource[offset] = dst_x < src_x ? dst_x : 0
> + resource[offset] = dst_x <= src_x ? dst_x : 0
>
>
> .. opcode:: ATOMDEC_WRAP - Atomic decrement + wrap around
> diff --git a/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
> b/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
> index 4a4ba43780a..f79ed2c57e1 100644
> --- a/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
> +++ b/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
> @@ -828,18 +828,6 @@ static void atomic_emit(
> args.data[num_data++] =
> ac_to_integer(&ctx->ac, lp_build_emit_fetch(bld_base,
> inst, 2, 0));
>
> - if (inst->Instruction.Opcode == TGSI_OPCODE_ATOMINC_WRAP) {
> - /* ATOMIC_INC instruction does:
> - * value = (value + 1) % (data + 1)
> - * but we want:
> - * value = (value + 1) % data
> - * So replace 'data' by 'data - 1'.
> - */
> - args.data[0] = LLVMBuildSub(ctx->ac.builder,
> - args.data[0],
> - ctx->ac.i32_1, "");
> - }
> -
> args.cache_policy = get_cache_policy(ctx, inst, true, false,
> false);
>
> if (inst->Src[0].Register.File == TGSI_FILE_BUFFER) {
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index ff2ec0726e8..9b982569490 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -3938,9 +3938,18 @@ glsl_to_tgsi_visitor::visit_image_intrinsic(ir_call
> *ir)
> case ir_intrinsic_image_atomic_comp_swap:
> opcode = TGSI_OPCODE_ATOMCAS;
> break;
> - case ir_intrinsic_image_atomic_inc_wrap:
> + case ir_intrinsic_image_atomic_inc_wrap: {
> + /* There's a bit of disagreement between GLSL and the hardware.
> The
> + * hardware wants to wrap after the given wrap value, while GLSL
> + * wants to wrap at the value. Subtract 1 to make up the
> difference.
> + */
> + st_src_reg wrap = get_temp(glsl_type::uint_type);
> + emit_asm(ir, TGSI_OPCODE_ADD, st_dst_reg(wrap),
> + arg1, st_src_reg_for_int(-1));
> + arg1 = wrap;
> opcode = TGSI_OPCODE_ATOMINC_WRAP;
> break;
> + }
> case ir_intrinsic_image_atomic_dec_wrap:
> opcode = TGSI_OPCODE_ATOMDEC_WRAP;
> break;
> --
> 2.21.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190806/f39ace87/attachment-0001.html>
More information about the mesa-dev
mailing list