[Mesa-dev] [PATCH 00/11] i965: UBO pushing for fun and profit?

Fri Jul 7 00:22:09 UTC 2017

Hello,

This series begins pushing UBOs (rather than resorting to pull loads)
for scalar shaders on Gen7.5+, for the OpenGL driver.  Future work is
to hook it up for Vulkan (haven't started), for the vec4 shader stages
(I have about 75% of the code written), and for Gen7 (I have a plan).

Note that compute shaders unfortunately still resort to pull messages,
because I haven't found a way to make the constant commands take absolute
addresses instead of offsets relative to dynamic state base address.

This has long been a gap in our UBO support - we pushed regular
uniform data, but always resorted to pulls for UBOs, making them
slower than regular uniforms.

I started this project a year and a half ago, and it initially looked
very promising - up to 30% faster in Tomb Raider, for example.  However,
Curro has improved the performance of pull messages significantly since
then, and pushing no longer seems to have as large an impact.  Jason thinks
it would help close the GL/Vulkan gap in Talos Principle, once we finally
hook it up in Vulkan.  One place where it does help is GLBenchmark 3.1
Manhattan, which improves 3-4% on most platforms, and 6-7% on SKL GT4.
That gain is primarily because pushing avoids doing a pull load in a loop,
though, which could also be solved by using the global code motion pass...

I figured I'd at least send it out for an initial review, and we can
continue collecting benchmark data...

--Ken