[Mesa-dev] [PATCH 1/2] i965: Update comments about Z16 being slow.
Kenneth Graunke
kenneth at whitecape.org
Sun Apr 13 22:04:51 PDT 2014
We've learned a few things since we originally disabled Z16; this attempts
to summarize the issue. I am no expert on this subject, though, so the
comment may not be totally accurate.
I did some benchmarking on GM45 and Ironlake, and discovered that for
GLBenchmark 2.7 EgyptHD, using Z16 was 3% slower on GM45 (n=15), and
4.5% slower on Ironlake (n=95). So, we can drop the "on Ivybridge"
aspect of the comment - it's always slower.
Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
---
src/mesa/drivers/dri/i965/brw_surface_formats.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index cef4020..196f139 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -620,13 +620,16 @@ brw_init_surface_formats(struct brw_context *brw)
    ctx->TextureFormatSupported[MESA_FORMAT_Z_FLOAT32] = true;
    ctx->TextureFormatSupported[MESA_FORMAT_Z32_FLOAT_S8X24_UINT] = true;

-   /* It appears that Z16 is slower than Z24 (on Intel Ivybridge and newer
-    * hardware at least), so there's no real reason to prefer it unless you're
-    * under memory (not memory bandwidth) pressure.  Our speculation is that
-    * this is due to either increased fragment shader execution from
-    * GL_LEQUAL/GL_EQUAL depth tests at the reduced precision, or due to
-    * increased depth stalls from a cacheline-based heuristic for detecting
-    * depth stalls.
+   /* Benchmarking shows that Z16 is slower than Z24, so there's no reason to
+    * use it unless you're under memory (not memory bandwidth) pressure.
+    *
+    * Apparently, the GPU's depth scoreboarding works on a 32-bit granularity,
+    * which corresponds to one pixel in the depth buffer for Z24 or Z32 formats.
+    * However, it corresponds to two pixels with Z16, which means both need to
+    * hit the early depth case in order for it to happen.
+    *
+    * Other speculation is that we may be hitting increased fragment shader
+    * execution from GL_LEQUAL/GL_EQUAL depth tests at reduced precision.
     *
     * However, desktop GL 3.0+ require that you get exactly 16 bits when
     * asking for DEPTH_COMPONENT16, so we have to respect that.
--
1.9.2
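
For illustration only (not part of the patch): the granularity argument in the
new comment comes down to simple arithmetic. The sketch below assumes the
32-bit scoreboard granularity that the comment itself describes as "apparently"
the case; the constant and function names are hypothetical and do not exist in
Mesa.

```c
/* Assumed scoreboard granularity, taken from the patch comment's own
 * (hedged) description, not from hardware documentation. */
#define SCOREBOARD_GRANULARITY_BITS 32u

/* Hypothetical helper: how many depth pixels share one scoreboard unit,
 * given the storage size of one depth pixel in bits.
 * Returns 0 if the size is 0 or larger than the granularity. */
static unsigned
pixels_per_scoreboard_unit(unsigned storage_bits_per_pixel)
{
   if (storage_bits_per_pixel == 0)
      return 0;

   return SCOREBOARD_GRANULARITY_BITS / storage_bits_per_pixel;
}
```

With 32 bits of storage per pixel (Z24 padded to 32 bits, or Z32) this gives
one pixel per unit. With Z16 it gives two, which is why both pixels would have
to hit the early depth case together.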