[Mesa-dev] [PATCH] i965/gen7: always lower textureGrad() on gen7

Chia-I Wu olvaffe at gmail.com
Thu Sep 5 01:35:56 PDT 2013


sample_d is slower than the lowered version on gen7.  For gen7, this improves
Xonotic benchmark with Ultimate effects by as much as 25%:

 before the change:                      40.06 fps
 after the change:                       51.10 fps
 after the change with INTEL_DEBUG=no16: 44.46 fps

As sample_d is not allowed in SIMD16 mode, I firstly thought the difference
was from SIMD8 versus SIMD16.  If that was the case, we would want to apply
brw_lower_texture_gradients() only on fragment shaders in SIMD16 mode.

But, as the numbers show, there is still 10% improvement when SIMD16 is forced
off after the change.  Thus textureGrad() is lowered unconditionally for now.
Due to this and that I haven't tried it on Haswell, this is still RFC.

No piglit regressions.

Signed-off-by: Chia-I Wu <olvaffe at gmail.com>
---
 .../dri/i965/brw_lower_texture_gradients.cpp       | 54 ++++++++++++++--------
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp b/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp
index 1589a20..f3fcb56 100644
--- a/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp
+++ b/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp
@@ -34,8 +34,8 @@ using namespace ir_builder;
 
 class lower_texture_grad_visitor : public ir_hierarchical_visitor {
 public:
-   lower_texture_grad_visitor(bool has_sample_d_c)
-      : has_sample_d_c(has_sample_d_c)
+   lower_texture_grad_visitor(bool has_sample_d, bool has_sample_d_c)
+      : has_sample_d(has_sample_d), has_sample_d_c(has_sample_d_c)
    {
       progress = false;
    }
@@ -44,6 +44,7 @@ public:
 
 
    bool progress;
+   bool has_sample_d;
    bool has_sample_d_c;
 
 private:
@@ -90,22 +91,33 @@ txs_type(const glsl_type *type)
 ir_visitor_status
 lower_texture_grad_visitor::visit_leave(ir_texture *ir)
 {
-   /* Only lower textureGrad with shadow samplers */
-   if (ir->op != ir_txd || !ir->shadow_comparitor)
+   if (ir->op != ir_txd)
       return visit_continue;
 
-   /* Lower textureGrad() with samplerCubeShadow even if we have the sample_d_c
-    * message.  GLSL provides gradients for the 'r' coordinate.  Unfortunately:
-    *
-    * From the Ivybridge PRM, Volume 4, Part 1, sample_d message description:
-    * "The r coordinate contains the faceid, and the r gradients are ignored
-    *  by hardware."
-    *
-    * We likely need to do a similar treatment for samplerCube and
-    * samplerCubeArray, but we have insufficient testing for that at the moment.
-    */
-   bool need_lowering = !has_sample_d_c ||
-      ir->sampler->type->sampler_dimensionality == GLSL_SAMPLER_DIM_CUBE;
+   bool need_lowering = false;
+
+   if (ir->shadow_comparitor) {
+      /* Lower textureGrad() with samplerCubeShadow even if we have the
+       * sample_d_c message.  GLSL provides gradients for the 'r' coordinate.
+       * Unfortunately:
+       *
+       * From the Ivybridge PRM, Volume 4, Part 1, sample_d message
+       * description: "The r coordinate contains the faceid, and the r
+       * gradients are ignored by hardware."
+       */
+      if (ir->sampler->type->sampler_dimensionality == GLSL_SAMPLER_DIM_CUBE)
+         need_lowering = true;
+      else if (!has_sample_d_c)
+         need_lowering = true;
+   }
+   else {
+      /* We likely need to do a similar treatment for samplerCube and
+       * samplerCubeArray, but we have insufficient testing for that at the
+       * moment.
+       */
+      if (!has_sample_d)
+         need_lowering = true;
+   }
 
    if (!need_lowering)
       return visit_continue;
@@ -154,7 +166,9 @@ lower_texture_grad_visitor::visit_leave(ir_texture *ir)
 			       expr(ir_unop_sqrt, dot(dPdy, dPdy)));
    }
 
-   /* lambda_base = log2(rho).  We're ignoring GL state biases for now. */
+   /* lambda_base = log2(rho).  It will be biased and clamped by values
+    * defined in SAMPLER_STATE to get the final lambda.
+    */
    ir->op = ir_txl;
    ir->lod_info.lod = expr(ir_unop_log2, rho);
 
@@ -168,8 +182,12 @@ bool
 brw_lower_texture_gradients(struct brw_context *brw,
                             struct exec_list *instructions)
 {
+   /* sample_d is slower than the lowered version on gen7, and is not allowed
+    * in SIMD16 mode.  Treating it as unsupported improves the performance.
+    */
+   bool has_sample_d = brw->gen != 7;
    bool has_sample_d_c = brw->gen >= 8 || brw->is_haswell;
-   lower_texture_grad_visitor v(has_sample_d_c);
+   lower_texture_grad_visitor v(has_sample_d, has_sample_d_c);
 
    visit_list_elements(&v, instructions);
 
-- 
1.8.3.1



More information about the mesa-dev mailing list