[Intel-gfx] [PATCH v4] drm/i915: Optimistically spin for the request completion
Chris Wilson
chris at chris-wilson.co.uk
Fri Mar 20 15:59:50 PDT 2015
On Fri, Mar 20, 2015 at 04:19:02PM +0000, Chris Wilson wrote:
> I guess one test would be to see how many 1x1 [xN overdraw, say 1x1
> Window, but rendering internally at 1080p] clients we can run in
> parallel whilst hitting 60fps. And then whether allowing multiple
> spinners helps or hinders.
I was thinking of a nice easy test that could demonstrate any advantage
for spinning over waiting, and realised we already had such an igt. The
trick is that it has to generate sufficient GPU load to actually require
a wait, but not so high a GPU load that we can no longer see the impact
from slow completion.
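For anyone following along, the spin-then-wait idea under discussion can be sketched in userspace roughly as below. This is only an illustration, not the code from the patch: spin_for_completion(), seqno_passed() and the spin budget are hypothetical names, and an atomic counter stands in for the seqno the GPU would write.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Wrap-safe check: has seqno reached (or passed) target? */
static bool seqno_passed(uint32_t seqno, uint32_t target)
{
	return (int32_t)(seqno - target) >= 0;
}

/*
 * Hypothetical helper: busy-poll *seqno for at most spin_ns nanoseconds.
 * Returns true if the request completed while we were spinning, false if
 * the budget ran out and the caller should fall back to a sleeping wait.
 */
static bool spin_for_completion(_Atomic uint32_t *seqno,
				uint32_t target, uint64_t spin_ns)
{
	uint64_t deadline;

	assert(seqno != NULL);
	deadline = now_ns() + spin_ns;

	do {
		uint32_t cur = atomic_load_explicit(seqno, memory_order_acquire);

		if (seqno_passed(cur, target))
			return true;
	} while (now_ns() < deadline);

	return false;
}
```

A caller would try spin_for_completion() first and only set up the (comparatively expensive) sleeping wait when it returns false. How long the spin budget should be is exactly the kind of thing a benchmark has to decide.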
I present igt/gem_exec_blt (modified to repeat the measurement and
average over several runs):
Time to blt 16384 bytes x    1:  21.000µs -> 5.800µs
Time to blt 16384 bytes x    2:  11.500µs -> 4.500µs
Time to blt 16384 bytes x    4:   6.750µs -> 3.750µs
Time to blt 16384 bytes x    8:   4.950µs -> 3.375µs
Time to blt 16384 bytes x   16:   3.825µs -> 3.175µs
Time to blt 16384 bytes x   32:   3.356µs -> 3.000µs
Time to blt 16384 bytes x   64:   3.259µs -> 2.909µs
Time to blt 16384 bytes x  128:   3.083µs -> 3.095µs
Time to blt 16384 bytes x  256:   3.104µs -> 2.979µs
Time to blt 16384 bytes x  512:   3.080µs -> 3.089µs
Time to blt 16384 bytes x 1024:   3.077µs -> 3.040µs
Time to blt 16384 bytes x 2048:   3.127µs -> 3.304µs
Time to blt 16384 bytes x 4096:   3.279µs -> 3.265µs
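The repeat-and-average measurement mentioned above could look roughly like the following. This is a sketch under assumptions rather than the actual igt change: mean_us_per_op() and count_call() are hypothetical, and op() merely stands in for whatever submits one blt and waits for it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Elapsed time between two timestamps, in microseconds. */
static double elapsed_us(const struct timespec *start,
			 const struct timespec *end)
{
	return (end->tv_sec - start->tv_sec) * 1e6 +
	       (end->tv_nsec - start->tv_nsec) * 1e-3;
}

/*
 * Hypothetical harness: call op(data) `count` times per run, repeat for
 * `runs` runs, and return the mean time per call in microseconds.
 * Both count and runs must be non-zero.
 */
static double mean_us_per_op(void (*op)(void *), void *data,
			     unsigned count, unsigned runs)
{
	double total = 0.0;

	assert(count > 0 && runs > 0);

	for (unsigned r = 0; r < runs; r++) {
		struct timespec start, end;

		clock_gettime(CLOCK_MONOTONIC, &start);
		for (unsigned i = 0; i < count; i++)
			op(data);
		clock_gettime(CLOCK_MONOTONIC, &end);

		/* Per-call time for this run. */
		total += elapsed_us(&start, &end) / count;
	}

	return total / runs;
}

/* Stand-in workload (assumption): just counts how often it was called. */
static void count_call(void *data)
{
	++*(unsigned *)data;
}
```

Averaging the per-run figure, rather than timing one long run, keeps a single slow outlier from dominating the small batch sizes where the wait overhead is most visible.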
-Chris
--
Chris Wilson, Intel Open Source Technology Centre