[Mesa-dev] [PATCH] swr: [rasterizer jit] use signed integer representation for logic op

Wed Nov 30 01:21:59 UTC 2016

On Nov 27, 2016, at 11:13 PM, Ilia Mirkin <imirkin at alum.mit.edu<mailto:imirkin at alum.mit.edu>> wrote:

On Thu, Nov 24, 2016 at 6:11 PM, Ilia Mirkin <imirkin at alum.mit.edu<mailto:imirkin at alum.mit.edu>> wrote:
Instead of (incorrectly) biasing the snorm value to make it look like a
unorm, just use signed integer math.

This fixes arb_color_buffer_float-render GL_RGBA8_SNORM

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu<mailto:imirkin at alum.mit.edu>>
---
src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
index ad809c4..339ca52 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
@@ -692,9 +692,13 @@ struct BlendJit : public Builder
                    dst[i] = BITCAST(dst[i], mSimdInt32Ty);
                    break;
                case SWR_TYPE_SNORM:
-                    src[i] = FADD(src[i], VIMMED1(0.5f));
-                    dst[i] = FADD(dst[i], VIMMED1(0.5f));
-                    /* fallthrough */
+                    src[i] = FP_TO_SI(
+                        FMUL(src[i], VIMMED1(scale[i])),
+                        mSimdInt32Ty);
+                    dst[i] = FP_TO_SI(
+                        FMUL(dst[i], VIMMED1(scale[i])),
+                        mSimdInt32Ty);
+                    break;
                case SWR_TYPE_UNORM:
                    src[i] = FP_TO_UI(
                        FMUL(src[i], VIMMED1(scale[i])),
@@ -728,11 +732,14 @@ struct BlendJit : public Builder
                    result[i] = BITCAST(result[i], mSimdFP32Ty);
                    break;
                case SWR_TYPE_SNORM:
+                    result[i] = SHL(result[i], 32 - info.bpc[i]);
+                    result[i] = ASHR(result[i], 32 - info.bpc[i]);

These two immediate arguments should probably have a C() around them.
I've fixed that up in my tree. Hopefully these will emit as VPSLLD and
VPSRAD. Not sure how to check that.

With the version of the patch from your tree, I’m seeing this IR:

  %24 = ashr exact <8 x i32> %23, i32 24
  %25 = sitofp <8 x i32> %24 to <8 x float>
  %26 = fmul <8 x float> %25, <float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000>
  store <8 x float> %26, <8 x float>* %result, align 32

Turn into this x86 code:

  9a:   vpslld ymm1,ymm3,0x18
  9f:   vpsrad ymm1,ymm1,0x18
  a4:   vcvtdq2ps ymm1,ymm1
  a8:   vmulps ymm1,ymm1,ymm2
  ac:   vmovaps YMMWORD PTR [rax+0x20],ymm1

So llvm does what you expected.

Version of this patch from your tree Reviewed-by: Tim Rowley <timothy.o.rowley at intel.com<mailto:timothy.o.rowley at intel.com>>



+                    result[i] = FMUL(SI_TO_FP(result[i], mSimdFP32Ty),
+                                     VIMMED1(1.0f / scale[i]));
+                    break;
                case SWR_TYPE_UNORM:
                    result[i] = FMUL(UI_TO_FP(result[i], mSimdFP32Ty),
                                     VIMMED1(1.0f / scale[i]));
-                    if (info.type[i] == SWR_TYPE_SNORM)
-                        result[i] = FADD(result[i], VIMMED1(-0.5f));
                    break;
                }

--
2.7.3

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20161130/80d8bf69/attachment-0001.html>