[Mesa-dev] [PATCH] swr: [rasterizer jit] use signed integer representation for logic op
Rowley, Timothy O
timothy.o.rowley at intel.com
Wed Nov 30 01:21:59 UTC 2016
On Nov 27, 2016, at 11:13 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
On Thu, Nov 24, 2016 at 6:11 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
Instead of (incorrectly) biasing the snorm value to make it look like a
unorm, just use signed integer math.
This fixes arb_color_buffer_float-render GL_RGBA8_SNORM
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
index ad809c4..339ca52 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
@@ -692,9 +692,13 @@ struct BlendJit : public Builder
dst[i] = BITCAST(dst[i], mSimdInt32Ty);
break;
case SWR_TYPE_SNORM:
- src[i] = FADD(src[i], VIMMED1(0.5f));
- dst[i] = FADD(dst[i], VIMMED1(0.5f));
- /* fallthrough */
+ src[i] = FP_TO_SI(
+ FMUL(src[i], VIMMED1(scale[i])),
+ mSimdInt32Ty);
+ dst[i] = FP_TO_SI(
+ FMUL(dst[i], VIMMED1(scale[i])),
+ mSimdInt32Ty);
+ break;
case SWR_TYPE_UNORM:
src[i] = FP_TO_UI(
FMUL(src[i], VIMMED1(scale[i])),
@@ -728,11 +732,14 @@ struct BlendJit : public Builder
result[i] = BITCAST(result[i], mSimdFP32Ty);
break;
case SWR_TYPE_SNORM:
+ result[i] = SHL(result[i], 32 - info.bpc[i]);
+ result[i] = ASHR(result[i], 32 - info.bpc[i]);
These two immediate arguments should probably have a C() around them.
I've fixed that up in my tree. Hopefully these will emit as VPSLLD and
VPSRAD. Not sure how to check that.
With the version of the patch from your tree, I’m seeing this IR:
%24 = ashr exact <8 x i32> %23, i32 24
%25 = sitofp <8 x i32> %24 to <8 x float>
%26 = fmul <8 x float> %25, <float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000>
store <8 x float> %26, <8 x float>* %result, align 32
which turns into this x86 code:
9a: vpslld ymm1,ymm3,0x18
9f: vpsrad ymm1,ymm1,0x18
a4: vcvtdq2ps ymm1,ymm1
a8: vmulps ymm1,ymm1,ymm2
ac: vmovaps YMMWORD PTR [rax+0x20],ymm1
So LLVM does what you expected.
Version of this patch from your tree Reviewed-by: Tim Rowley <timothy.o.rowley at intel.com>
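For readers following along, the SHL/ASHR pair discussed above amounts to a per-lane sign extension of the low bpc bits. A minimal scalar sketch of that idea follows; the helper name sign_extend_low_bits is hypothetical and not part of SWR, and the code assumes the compiler implements right shift of a negative int32_t as an arithmetic shift (guaranteed from C++20, and the usual behaviour before that).

```cpp
#include <cstdint>

// Hypothetical helper (not from the SWR source): sign-extend the low
// `bits` bits of `v`, mirroring SHL followed by ASHR by (32 - bits).
// Valid for bits in the range 1..32.
inline int32_t sign_extend_low_bits(uint32_t v, unsigned bits)
{
    const unsigned shift = 32u - bits;
    // Shift left on the unsigned value so nothing overflows a signed type,
    // then shift right on the signed value so the sign bit is replicated.
    return static_cast<int32_t>(v << shift) >> shift;
}
```

With bits = 8 the shift count is 24, which matches the 0x18 immediates in the vpslld/vpsrad pair above.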
+ result[i] = FMUL(SI_TO_FP(result[i], mSimdFP32Ty),
+ VIMMED1(1.0f / scale[i]));
+ break;
case SWR_TYPE_UNORM:
result[i] = FMUL(UI_TO_FP(result[i], mSimdFP32Ty),
VIMMED1(1.0f / scale[i]));
- if (info.type[i] == SWR_TYPE_SNORM)
- result[i] = FADD(result[i], VIMMED1(-0.5f));
break;
}
--
2.7.3