[Mesa-dev] [PATCH 0/7] More i965 scheduling improvements
Connor Abbott
cwabbott0 at gmail.com
Fri Oct 30 18:02:51 PDT 2015
This series implements some more aggressive scheduler changes based on
the original series that I sent out and that has now been merged to master.
In particular, it rewrites the scheduler to be bottom-up instead of top-down,
and gives it a fancy new strategy involving a combination of list
scheduling and Sethi-Ullman numbering in order to tackle
register-pressure-limited scenarios while still providing good
parallelism otherwise. It also fixes some serious shortcomings in the
low register pressure case, making us actually use the critical path
information we were already computing before and using a better strategy
to minimize stalls. Finally, it changes the heuristic for whether we
should drop SIMD16 to something that, while probably not perfect, is
probably still better than what we had previously.
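For readers who haven't run into Sethi-Ullman numbering before, here is a
minimal sketch of the textbook version on a plain expression tree. It is not
the code from this series, which works on the instruction dependency graph
and has to deal with a register pressure threshold. ExprNode, make_leaf,
make_op and sethi_ullman_number are hypothetical names made up for the
illustration, and the sketch assumes every value occupies exactly one
register.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical expression-tree node; not a type from the i965 backend.
struct ExprNode {
    std::vector<std::unique_ptr<ExprNode>> children;
};

// Build a leaf (a value that must be loaded into one register).
std::unique_ptr<ExprNode> make_leaf()
{
    return std::make_unique<ExprNode>();
}

// Build a binary operation over two subtrees.
std::unique_ptr<ExprNode> make_op(std::unique_ptr<ExprNode> lhs,
                                  std::unique_ptr<ExprNode> rhs)
{
    auto node = std::make_unique<ExprNode>();
    node->children.push_back(std::move(lhs));
    node->children.push_back(std::move(rhs));
    return node;
}

// Textbook Sethi-Ullman label: the minimum number of registers needed to
// evaluate the subtree without spilling, assuming one register per value.
int sethi_ullman_number(const ExprNode &node)
{
    if (node.children.empty())
        return 1;

    std::vector<int> labels;
    for (const auto &child : node.children) {
        assert(child != nullptr);
        labels.push_back(sethi_ullman_number(*child));
    }

    // Evaluate the most demanding child first. While the i-th child (in
    // that order) is being evaluated, i earlier results are still live.
    std::sort(labels.begin(), labels.end(), std::greater<int>());
    int need = 0;
    for (std::size_t i = 0; i < labels.size(); i++)
        need = std::max(need, labels[i] + static_cast<int>(i));
    return need;
}
```

With these definitions, a balanced tree over four leaves gets the number 3,
while a left-leaning chain over the same four leaves only gets 2, which is
the kind of ordering information a register-pressure-aware scheduler can
exploit.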
While each patch has shader-db numbers, it's a little hard to see the
forest for the trees while looking at each change individually, so here
are the shader-db numbers for the entire series on bdw, created using my
shader-db patch [1]:
total instructions in shared programs: 7392779 -> 7386851 (-0.08%)
instructions in affected programs: 24443 -> 18515 (-24.25%)
helped: 15
HURT: 0
total cycles in shared programs: 56128804 -> 48572820 (-13.46%)
cycles in affected programs: 54357022 -> 46801038 (-13.90%)
helped: 60142
HURT: 801
LOST: 392
GAINED: 59
But note that most of the SIMD16 shaders we lost were bad SIMD16 shaders
that probably wouldn't have helped us; the remaining gained shaders went
from spilling and being thrown out to being actually useful. And, of course, the
intervening patches ensure that many more SIMD16 programs are considered
"useful."
Notably, after this series, no SIMD8 programs in my shader-db spill
anymore!
Note that this series is a little different from the one that some
people have been looking at before. In particular, I dropped an attempt
to replace the LIFO heuristic that turned out to not be useful at all by
the end (instead, I just nuked it), and I fixed a slight issue with how
the amount above the register pressure threshold was being computed in
the Sethi-Ullman patch. As this is the first version I'm actually
sending out for review, I didn't bother to mark those changes in the
commit messages. It's probably worth re-doing the benchmarks, since now
my shader-db shows no regressions or lost SIMD16 shaders in any SynMark
benchmark for whatever reason.
This series is also available at
git://people.freedesktop.org/~cwabbott0/mesa i965-sched-v3
[1] http://lists.freedesktop.org/archives/mesa-dev/2015-October/097431.html
Connor Abbott (7):
  i965: use real latencies in the pre-RA scheduler
  i965/sched: use a critical path heuristic
  i965/sched: get rid of the LIFO heuristic
  i965/sched: switch to register pressure scheduling dynamically
  i965/sched: switch to bottom-up scheduling
  i965/sched: use Sethi-Ullman numbering
  i965/fs: use a better heuristic for SIMD16
src/mesa/drivers/dri/i965/brw_fs.cpp | 59 +--
src/mesa/drivers/dri/i965/brw_fs.h | 8 +-
.../drivers/dri/i965/brw_schedule_instructions.cpp | 443 +++++++++++++--------
src/mesa/drivers/dri/i965/brw_shader.h | 1 -
4 files changed, 313 insertions(+), 198 deletions(-)
--
2.4.3