[Intel-gfx] [PATCH v4] drm/i915: Optimistically spin for the request completion
Daniel Vetter
daniel at ffwll.ch
Mon Mar 23 01:31:38 PDT 2015
On Fri, Mar 20, 2015 at 10:59:50PM +0000, Chris Wilson wrote:
> On Fri, Mar 20, 2015 at 04:19:02PM +0000, Chris Wilson wrote:
> > I guess one test would be to see how many 1x1 [xN overdraw, say 1x1
> > Window, but rendering internally at 1080p] clients we can run in
> > parallel whilst hitting 60fps. And then whether allowing multiple
> > spinners helps or hinders.
>
> I was thinking of a nice easy test that could demonstrate any advantage
> for spinning over waiting, and realised we already had such an igt. The
> trick is that it has to generate sufficient GPU load to actually require
> a wait, but not too high a GPU load such that we can see the impact from
> slow completion.
>
> I present igt/gem_exec_blt (modified to repeat the measurement and do an
> average over several runs):
>
> Time to blt 16384 bytes x 1: 21.000µs -> 5.800µs
> Time to blt 16384 bytes x 2: 11.500µs -> 4.500µs
> Time to blt 16384 bytes x 4: 6.750µs -> 3.750µs
> Time to blt 16384 bytes x 8: 4.950µs -> 3.375µs
> Time to blt 16384 bytes x 16: 3.825µs -> 3.175µs
> Time to blt 16384 bytes x 32: 3.356µs -> 3.000µs
> Time to blt 16384 bytes x 64: 3.259µs -> 2.909µs
> Time to blt 16384 bytes x 128: 3.083µs -> 3.095µs
> Time to blt 16384 bytes x 256: 3.104µs -> 2.979µs
> Time to blt 16384 bytes x 512: 3.080µs -> 3.089µs
> Time to blt 16384 bytes x 1024: 3.077µs -> 3.040µs
> Time to blt 16384 bytes x 2048: 3.127µs -> 3.304µs
> Time to blt 16384 bytes x 4096: 3.279µs -> 3.265µs
We will probably need to revisit this when the scheduler lands: the
scheduler will want to keep a short queue, and will generally block
waiting for some request to complete.
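
For anyone following along, here is a minimal userspace sketch of the
spin-then-sleep idea being measured above. It is not the i915 code: the
names struct request, wait_request(), fake_irq_wait() and the spin budget
are all hypothetical, chosen only for illustration. The waiter polls the
completion flag for a bounded time and only falls back to the (expensive)
sleeping path if the request has not finished by then.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a GPU request; not the i915 structure. */
struct request {
	atomic_bool completed;
};

enum wait_result { WAIT_SPUN, WAIT_SLEPT, WAIT_FAILED };

/* Slow path supplied by the caller: sleeps until completion,
 * returns false if the wait failed or timed out. */
typedef bool (*sleep_fn)(struct request *rq);

static inline uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static inline bool request_completed(struct request *rq)
{
	return atomic_load_explicit(&rq->completed, memory_order_acquire);
}

/* Poll for up to spin_budget_ns, then fall back to the sleeping wait. */
enum wait_result wait_request(struct request *rq, uint64_t spin_budget_ns,
			      sleep_fn slow_wait)
{
	uint64_t deadline;

	assert(rq != NULL && slow_wait != NULL);
	deadline = now_ns() + spin_budget_ns;

	/* Optimistic phase: the flag is checked at least once. */
	do {
		if (request_completed(rq))
			return WAIT_SPUN;
	} while (now_ns() < deadline);

	/* Budget exhausted: pay for the sleep/wakeup round trip. */
	return slow_wait(rq) ? WAIT_SLEPT : WAIT_FAILED;
}

/* Toy slow path for demonstration: pretends the completion
 * interrupt arrived while we were asleep. */
bool fake_irq_wait(struct request *rq)
{
	atomic_store_explicit(&rq->completed, true, memory_order_release);
	return true;
}
```

The spin budget is the knob that matters: too short and the light-load
case never benefits, too long and the CPU is burned whenever the queue is
deep enough that the request cannot finish quickly anyway.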
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch