[Mesa-dev] [PATCH 0/8] i965: gl_TessLevel rescrambling in NIR
Eero Tamminen
eero.t.tamminen at intel.com
Wed Jan 4 13:16:41 UTC 2017
Hi,
On 04.01.2017 13:07, Kenneth Graunke wrote:
> This series reworks i965's handling of gl_TessLevelInner/Outer[] arrays.
> Instead of using lower_tess_levels to turn them into vec4/vec2s, we pass
> them through to NIR and make them compact arrays (where array indexing
> translates to enhanced layouts components).
>
> This has some nice benefits. In the last patch, we're able to drop
> reswizzling and writemask-munging for load_output and store_output
> in both the scalar TCS and vec4 TCS backends, as well as code to do
> the same for TES system values. That's 5 copies of backend code
> replaced by a small amount of extra code in remap_patch_urb_offsets.
>
> It also means we can drop TES handling entirely - the ordinary input
> handling code will handle it just fine.
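For readers unfamiliar with compact arrays, the index-to-component translation mentioned above can be sketched roughly as follows. This is only a toy model in Python, not Mesa code: the function name and the `start_component` parameter are made up for illustration, and it ignores the hardware-specific reordering that remap_patch_urb_offsets is described as handling.

```python
def compact_index_to_slot_component(index, start_component=0):
    """Map an element of a compact float array to (vec4 slot, component).

    Hypothetical helper: assumes scalars are packed four per vec4 slot,
    starting at `start_component` within the first slot.
    """
    flat = start_component + index
    return flat // 4, flat % 4


# A four-element array such as gl_TessLevelOuter[4] fits in one slot.
outer = [compact_index_to_slot_component(i) for i in range(4)]

# A two-element array such as gl_TessLevelInner[2] uses two components.
inner = [compact_index_to_slot_component(i) for i in range(2)]
```

Under this assumption an array access with a constant index becomes a plain component select, which is why no separate swizzle or writemask fix-up is needed afterwards.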
Have you checked whether this makes any performance difference in
GpuTest v0.7 TessMark, GfxBench v4 tessellation, or the SynMark2 v7.0
terrain tessellation tests?
> This is the first step toward tessellation support in Vulkan (anv).
> (lower_tess_levels is written in GLSL IR, so we need a replacement.)
Are there any other use cases for Vulkan tessellation yet, besides
Sascha Willems' three tests here:
https://github.com/SaschaWillems/Vulkan
?
- Eero
> This has an impact on shader-db's TCS shaders (but not a single TES):
>
> With scalar TCS/TES:
>
> total instructions in shared programs: 13388151 -> 13387794 (-0.00%)
> instructions in affected programs: 31920 -> 31563 (-1.12%)
> helped: 75
> HURT: 0
>
> total cycles in shared programs: 257010676 -> 257008504 (-0.00%)
> cycles in affected programs: 165632 -> 163460 (-1.31%)
> helped: 75
> HURT: 0
>
> With vec4 TCS/TES:
>
> total instructions in shared programs: 13345621 -> 13345681 (0.00%)
> instructions in affected programs: 18593 -> 18653 (0.32%)
> helped: 36
> HURT: 25
>
> total cycles in shared programs: 256761898 -> 256759952 (-0.00%)
> cycles in affected programs: 266644 -> 264698 (-0.73%)
> helped: 172
> HURT: 44
>
> The vec4 stats are not great, but I don't expect it to make much of a
> performance difference - TCS isn't usually the bottleneck (TES is).
> They could be improved by writing a peephole pass to detect load/stores
> to the same base+offset with consecutive scalar components and turn them
> into vec2/vec4 load/stores.
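
The peephole idea in the last quoted paragraph could look roughly like the sketch below. It is a toy model in Python rather than a NIR pass: the `Store` record and `merge_consecutive_stores` are hypothetical names, and it only merges stores that are directly adjacent in the instruction list, so it never reorders anything. It can also produce three-component stores, which a real pass restricted to vec2/vec4 would have to split or skip.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Store:
    """Hypothetical stand-in for a store_output instruction."""
    base: int                 # base slot
    offset: int               # offset from the base slot
    component: int            # first component written (0..3)
    values: Tuple[str, ...]   # one value name per component written


def merge_consecutive_stores(stores: List[Store]) -> List[Store]:
    """Merge adjacent stores to the same base+offset whose components
    are consecutive, as long as the result still fits in one vec4."""
    merged: List[Store] = []
    for st in stores:
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev.base == st.base
                and prev.offset == st.offset
                and prev.component + len(prev.values) == st.component
                and st.component + len(st.values) <= 4):
            # Extend the previous store instead of emitting a new one.
            merged[-1] = Store(prev.base, prev.offset, prev.component,
                               prev.values + st.values)
        else:
            merged.append(st)
    return merged


example = [
    Store(0, 1, 0, ("a",)), Store(0, 1, 1, ("b",)),
    Store(0, 1, 2, ("c",)), Store(0, 1, 3, ("d",)),
    Store(0, 0, 2, ("x",)), Store(0, 0, 3, ("y",)),
    Store(1, 0, 0, ("z",)),
]
result = merge_consecutive_stores(example)
```

Here the four scalar stores to base 0, offset 1 collapse into one four-component store, the two stores to offset 0 become one two-component store, and the last store is left alone.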