[Intel-gfx] [PATCH v4 4/4] Try inlining sg_next()

Fri May 20 07:35:33 UTC 2016

On Fri, May 20, 2016 at 01:31:30AM +0100, Dave Gordon wrote:
> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>

Much better. The effect of the inline

         gem:exec:fault:1MiB:    +4.90%
  gem:exec:fault:1MiB:forked:    +7.99%
        gem:exec:fault:16MiB:   +22.94%
 gem:exec:fault:16MiB:forked:   +19.96%
       gem:exec:fault:256MiB:   +27.45%
gem:exec:fault:256MiB:forked:   +36.89%

And it brings this series into parity with mine.

----

Avoiding the out-of-line call to sg_next() reduces the kernel execution
overhead by 10% in some workloads (for example the Unreal Engine 4 demo
Atlantis on 2GiB GTTs) which are dominated by the cost of inserting PTEs
due to texture thrashing. We can demonstrate this in a microbenchmark
that forces us to rebind the object on every execbuf, where we can
measure a 25% improvement, in the time required to execute an execbuf
requiring a texture to be rebound, for inlining the sg_next() for large
texture sizes.

Benchmark: igt/benchmarks/gem_exec_fault
Benchmark: igt/benchmarks/gem_exec_trace/Atlantis
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre