[Mesa-dev] [PATCH RFC] radeonsi: Disable IEEE_MODE.
Jan Vesely
jan.vesely at rutgers.edu
Sat Mar 3 23:35:02 UTC 2018
Neither GL nor CL nor SPIRV needs the IEEE handling of sNaNs.
Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
---
This is the 3rd way to handle broken fmin/fmax in clover CL.
It can be worked around in libclc (to not use
llvm.fminnum/llvm.fmaxnum)[0]
LLVM can be patched to not use IEEE_MODE[1]
Or it can be fixed in the config reg in mesa.
The third option is arguably the worst (changes semantics of compiled
code), but since the first two efforts have been stalled for months.
Here's the third one.
Jan
[0] http://lists.llvm.org/pipermail/libclc-dev/2017-November/thread.html#2692
[1] https://reviews.llvm.org/D40514
src/gallium/drivers/radeonsi/si_compute.c | 8 ++++++++
src/gallium/drivers/radeonsi/si_shader.c | 6 +++++-
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index 41927988cf..d0fcba485b 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -70,6 +70,10 @@ static void code_object_to_config(const amd_kernel_code_t *code_object,
struct si_shader_config *out_config) {
uint32_t rsrc1 = code_object->compute_pgm_resource_registers;
+ /* Disable IEEE mode. IEEE_MODE determines sNaN handling by fmin/fmax
+ * instructions. Nothing wants the IEEE rules here.
+ * Ideally LLVM would not emit this configuration. */
+ rsrc1 &= C_00B848_IEEE_MODE;
uint32_t rsrc2 = code_object->compute_pgm_resource_registers >> 32;
out_config->num_sgprs = code_object->wavefront_sgpr_count;
out_config->num_vgprs = code_object->workitem_vgpr_count;
@@ -137,6 +141,10 @@ static void si_create_compute_state_async(void *job, int thread_index)
S_00B848_SGPRS((shader->config.num_sgprs - 1) / 8) |
S_00B848_DX10_CLAMP(1) |
S_00B848_FLOAT_MODE(shader->config.float_mode);
+ /* Disable IEEE mode. IEEE_MODE determines sNaN handling by
+ * fmin/fmax instructions. Nothing wants the IEEE rules.
+ * Ideally LLVM would not emit this configuration. */
+ shader->config.rsrc1 &= C_00B848_IEEE_MODE;
shader->config.rsrc2 =
S_00B84C_USER_SGPR(user_sgprs) |
diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 2a50b266f6..01bad71a4e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5178,7 +5178,11 @@ void si_shader_binary_read_config(struct ac_shader_binary *binary,
conf->num_sgprs = MAX2(conf->num_sgprs, (G_00B028_SGPRS(value) + 1) * 8);
conf->num_vgprs = MAX2(conf->num_vgprs, (G_00B028_VGPRS(value) + 1) * 4);
conf->float_mode = G_00B028_FLOAT_MODE(value);
- conf->rsrc1 = value;
+ /* Disable IEEE mode. IEEE_MODE determines sNaN
+ * handling by fmin/fmax instructions.
+ * Nothing wants the IEEE rules here.
+ * Ideally LLVM would not emit this configuration. */
+ conf->rsrc1 = value & C_00B848_IEEE_MODE;
break;
case R_00B02C_SPI_SHADER_PGM_RSRC2_PS:
conf->lds_size = MAX2(conf->lds_size, G_00B02C_EXTRA_LDS_SIZE(value));
--
2.14.3
More information about the mesa-dev
mailing list