[Mesa-dev] [PATCH 15/14] i965/fs: Don't emit 16-wide BFI1 instructions.

Matt Turner mattst88 at gmail.com
Thu May 2 13:46:30 PDT 2013


The Haswell Bspec says "A SIMD16 instruction is not allowed." (although
16-wide BFI1 has worked for me so far). Since GLSL's bitfieldInsert()
function takes int parameters, BFI1 produces the same result in all
channels, so there's never any reason to emit a 16-wide BFI1.
---
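Note (not part of the commit message): the reasoning above can be
illustrated with a small toy model. This is only a sketch under stated
assumptions, not driver code. Reg16, STALE, emit_bfi1_simd8() and
emit_bfi2_simd16() are hypothetical names invented for the illustration,
and the mask and merge arithmetic is an assumed simplification of what
BFI1 and BFI2 compute. It models a 16-channel register whose first eight
channels are written by an uncompressed BFI1, and shows why the second
half of the BFI2 should read the first half of the mask rather than
sechalf() of it.

```cpp
#include <array>
#include <cstdint>

// Toy model of a 16-channel register; hypothetical, not a Mesa type.
using Reg16 = std::array<uint32_t, 16>;

// Marker for channels that an 8-wide instruction never wrote.
constexpr uint32_t STALE = 0xdeadbeefu;

// Assumed BFI1 semantics: a mask of `width` bits starting at `offset`.
uint32_t bfi1_mask(uint32_t width, uint32_t offset)
{
   width &= 31u;
   offset &= 31u;
   return ((1u << width) - 1u) << offset;
}

// Uncompressed (8-wide) BFI1: only channels 0..7 receive the mask.
Reg16 emit_bfi1_simd8(uint32_t width, uint32_t offset)
{
   Reg16 mask;
   mask.fill(STALE);
   for (int ch = 0; ch < 8; ch++)
      mask[ch] = bfi1_mask(width, offset);
   return mask;
}

// 16-wide BFI2 split into two halves. When sechalf_on_mask is false the
// second half rereads mask channels 0..7, as the patch does for src[0].
// The merge (insert & m) | (base & ~m) is an assumed simplification.
Reg16 emit_bfi2_simd16(const Reg16 &mask, const Reg16 &insert,
                       const Reg16 &base, bool sechalf_on_mask)
{
   Reg16 dst{};
   for (int ch = 0; ch < 16; ch++) {
      int mask_ch = (ch >= 8 && !sechalf_on_mask) ? ch - 8 : ch;
      uint32_t m = mask[mask_ch];
      dst[ch] = (insert[ch] & m) | (base[ch] & ~m);
   }
   return dst;
}
```

In this model, calling emit_bfi2_simd16() with sechalf_on_mask = false
gives the same result in all 16 channels, while sechalf_on_mask = true
makes channels 8..15 consume the stale upper half of the mask.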
 src/mesa/drivers/dri/i965/brw_fs_emit.cpp    | 5 ++++-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
index b7c85ef..d35438c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
@@ -1250,7 +1250,10 @@ fs_generator::generate_code(exec_list *instructions)
             brw_set_compression_control(p, BRW_COMPRESSION_NONE);
             brw_BFI2(p, dst, src[0], src[1], src[2]);
             brw_set_compression_control(p, BRW_COMPRESSION_2NDHALF);
-            brw_BFI2(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
+            /* We don't emit 16-wide BFI1 instructions, so don't use sechalf()
+             * on src[0] (which comes from BFI1).
+             */
+            brw_BFI2(p, sechalf(dst), src[0], sechalf(src[1]), sechalf(src[2]));
             brw_set_compression_control(p, BRW_COMPRESSION_COMPRESSED);
          } else {
             brw_BFI2(p, dst, src[0], src[1], src[2]);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 417e8a8..b62f996 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -617,7 +617,12 @@ fs_visitor::visit(ir_expression *ir)
       emit(BFE(this->result, op[2], op[1], op[0]));
       break;
    case ir_binop_bfm:
-      emit(BFI1(this->result, op[0], op[1]));
+      inst = emit(BFI1(this->result, op[0], op[1]));
+      /* Haswell doesn't allow 16-wide for this instruction, and since it only
+       * takes int parameters from GLSL it produces the same result in all
+       * channels, so there's no reason to ever do 16-wide.
+       */
+      inst->force_uncompressed = true;
       break;
    case ir_triop_bfi:
       emit(BFI2(this->result, op[0], op[1], op[2]));
-- 
1.8.1.5


