[Bug 99221] >2% perf drop in GfxBench T-Rex with "i965: disable loop unrolling in GLSL IR"

bugzilla-daemon at freedesktop.org
Thu Dec 29 14:57:41 UTC 2016


            Bug ID: 99221
           Summary: >2% perf drop in GfxBench T-Rex with "i965: disable
                    loop unrolling in GLSL IR"
           Product: Mesa
           Version: git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/DRI/i965
          Assignee: intel-3d-bugs at lists.freedesktop.org
          Reporter: eero.t.tamminen at intel.com
        QA Contact: intel-3d-bugs at lists.freedesktop.org

The patch series culminating in this commit changed performance in several tests:
commit 40e9f2f13847ddd94e1216088aa00456d7b02d2b
Author:     Timothy Arceri <timothy.arceri at collabora.com>
AuthorDate: Tue Dec 13 11:37:25 2016 +1100
Commit:     Timothy Arceri <timothy.arceri at collabora.com>
CommitDate: Fri Dec 23 10:15:36 2016 +1100

    i965: disable loop unrolling in GLSL IR

    There is a single regression in loop unrolling which is:

    loops HURT:   shaders/orbital_explorer.shader_test GS SIMD8:    0 -> 1

    However the loop is huge so it seems reasonable not to unroll it. It's
    surprising that GLSL IR does unroll it.

On SKL i5-6600K (GT2), the changes were as follows (at FullHD resolution):

Performance dropped due to "disable loop unrolling in GLSL IR":
- 2.7% SynMark PSPom
- 2.3% SynMark PSPhong
- 2.2% GfxBench T-Rex (GL version)
- 0.5% SynMark PSBump8

Performance increased due to "use nir loop unrolling pass":
+ 12.5% SynMark ShMapPcf
+ 3.9% SynMark CSDof (+8.4% from "use nir_lower_indirect_derefs() for GLSL")
+ 2.2% SynMark DevRes (composite test including other affected tests)
+ 0.5% Unigine Valley
+ 0.5% SynMark PSBump2
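
For reference, percentages like the ones above can be derived from before/after
benchmark scores. The following is only a minimal sketch: the function names,
the 0.5% noise threshold and the frame rates are hypothetical illustrations,
not values or tooling from this bug.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change of `after` versus `before`, in percent."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before * 100.0


def classify(change_pct: float, noise_pct: float = 0.5) -> str:
    """Label a change; the noise_pct threshold is an assumed cut-off."""
    if change_pct <= -noise_pct:
        return "regression"
    if change_pct >= noise_pct:
        return "improvement"
    return "noise"


# Hypothetical frame rates, NOT measurements from this bug.
baseline_fps = 200.0
patched_fps = 195.0

change = percent_change(baseline_fps, patched_fps)
print(f"{change:+.1f}% -> {classify(change)}")
```

With a threshold like this, changes smaller than the assumed run-to-run
variance are reported as noise instead of as regressions or improvements.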

Results are similar on other platforms, except for CSDof, where the results
depend heavily on the HW.  GfxBench is v4.0 and SynMark is v7.0.

CSDof performance increased a lot on SKL GT2 and KBL GT2, and a bit on BYT, but
it dropped a lot on SKL GT3e & GT4e, BDW GT3 & GT2, and BSW, and a bit on
HSW GT3e and BXT.

After these 2 changes, the SynMark shader compilation speed test is ~25% faster
on all platforms, which is pretty good.
