[Intel-gfx] [PATCH] RFC drm/i915: Slaughter the thundering i915_wait_request herd

Chris Wilson chris at chris-wilson.co.uk
Wed Nov 4 06:48:11 PST 2015


On Wed, Nov 04, 2015 at 01:20:33PM +0000, Gong, Zhipeng wrote:
> 
> 
> > -----Original Message-----
> > From: Chris Wilson [mailto:chris at chris-wilson.co.uk]
> > Sent: Wednesday, November 04, 2015 5:54 PM
> > On Wed, Nov 04, 2015 at 06:19:33AM +0000, Gong, Zhipeng wrote:
> > > > From: Chris Wilson [mailto:chris at chris-wilson.co.uk] On Tue, Nov 03,
> > > > 2015 at 01:31:22PM +0000, Gong, Zhipeng wrote:
> > > > >
> > > > > > From: Chris Wilson [mailto:chris at chris-wilson.co.uk]
> > > > > >
> > > > > > Do you also have relative perf statistics, like op/s, that we can
> > > > > > compare to make sure we aren't just stalling the whole system?
> > > > > >
> > > > > Could you please provide the commands about how to check it?
> > > >
> > > > I was presuming your workload has some measure of efficiency/throughput?
> > > > It is one thing to say we are using 10% less CPU (per second), but
> > > > the task is running 2x as long!
> > > We use execution time as the measurement; the patch affects the execution
> > > time for our cases only slightly.
> > >
> > > Exec time(s)    |   w/o patch   |   w/patch
> > > -----------------------------------------------
> > > BDW async 1     |    65.00      |    65.25
> > > BDW async 5     |    68.50      |    66.42
> > 
> > That's reassuring.
> > 
> > > >
> > > > > > How much cpu time is left in the i915_wait_request branch? i.e.
> > > > > > how close to the limit are we with chasing this path?
> > > > > > Could you please provide the commands here as well? :)
> > > >
> > > > Check the perf callgraph.
> > >
> > > Now most of the time is in io_schedule_timeout under __i915_wait_request:
> > > |--64.04%-- io_schedule_timeout
> > > |--22.04%-- intel_engine_add_wakeup
> > > |--3.13%-- prepare_to_wait
> > > |--2.99%-- gen6_rps_boost
> > > |-...
> > 
> > No more busywaits, and most of the time is spent kicking the next process or
> > doing the insertion sort into the waiting rbtree.
> > 
> > What's the ratio now of __i915_wait_request to the next hot function?
> > And who are the chief callers of __i915_wait_request?
> > -Chris
> Please check the attachments for the details, I post a piece of it here:
> |--17.89%-- i915_gem_object_sync
>          |--73.19%-- __i915_wait_request
>          |--12.60%-- i915_gem_object_retire_request

Interesting. Most of the time is spent shuffling requests around in the
execbuffer rather than doing useful work. I've been working on moving
that work around, but even then we are likely to be spending our time
instantiating all those new objects. As for trimming CPU time from
__i915_wait_request(), that looks about as far as we can go.
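
For anyone following along, here is a rough userspace sketch of the
ordered-waiter idea discussed above (the insertion cost that shows up as
intel_engine_add_wakeup in the callgraph). It is only an illustration
under assumptions: the patch uses an rbtree, whereas this uses a plain
sorted linked list to stay short, and struct waiter, add_waiter() and
wake_completed() are hypothetical names, not the driver's. The
wrap-safe comparison follows the same idea as i915_seqno_passed().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wrap-safe "a is at or after b" test, same idea as i915_seqno_passed(). */
static int seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Hypothetical waiter record; the real patch keeps waiters in an rbtree. */
struct waiter {
	uint32_t seqno;
	struct waiter *next;
};

/*
 * Insert w so the list stays sorted with the oldest seqno first; waiters
 * with equal seqno stay in arrival order. Returns 1 if w became the head,
 * i.e. the one waiter that would need to watch for the next interrupt.
 */
static int add_waiter(struct waiter **head, struct waiter *w)
{
	struct waiter **p = head;

	assert(w != NULL);
	while (*p && seqno_passed(w->seqno, (*p)->seqno))
		p = &(*p)->next;
	w->next = *p;
	*p = w;
	return p == head;
}

/*
 * Unlink every waiter whose seqno the hardware has reached and return how
 * many were removed. Because the list is ordered, the walk stops at the
 * first incomplete waiter instead of waking everyone to recheck.
 */
static int wake_completed(struct waiter **head, uint32_t hw_seqno)
{
	int woken = 0;

	while (*head && seqno_passed(hw_seqno, (*head)->seqno)) {
		*head = (*head)->next;
		woken++;
	}
	return woken;
}
```

The point of the ordering is that only waiters whose requests have
actually completed are touched on an interrupt; the price is the sorted
insert on the wait path, which is the cost visible in the profile.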

If you have some free cycles on those machines, I would very much
appreciate seeing the same callgraphs from a
http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=nightly&id=134211e33719ef698f9bd51b72ad2fc434cb51f9
kernel.

Thanks,
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

