[Intel-gfx] [PATCH 2/3] drm/i915: put ring frequency and turbo setup into a work queue v2

Wed Oct 31 21:44:27 CET 2012

On Tue, 30 Oct 2012 18:17:58 +0100
Daniel Vetter <daniel at ffwll.ch> wrote:

> On Fri, Oct 26, 2012 at 7:08 PM, Jesse Barnes <jbarnes at virtuousgeek.org> wrote:
> > Communicating via the mailbox registers with the PCU can take quite
> > awhile.  And updating the ring frequency or enabling turbo is not
> > something that needs to happen synchronously, so take it out of our init
> > and resume paths to speed things up (~200ms on my T420).
> >
> > v2: add comment about why we use a work queue (Daniel)
> >     make sure work queue is idle on suspend (Daniel)
> >     use a delayed work queue since there's no hurry (Daniel)
> >
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=54089
> >
> > Signed-of-by: Jesse Barnes <jbarnes at virtuougseek.org>
> 
> Unfortunately I have to tear this patch apart ;-)
> 
> - The cleanup sequence is inconsistent afaict: Check the ordering of
> intel_gt_cleanup vs. intel_disable_gt_powersave

Ok will fix.

> - cancel_*_sync is on a level with *_lock in it's ability to hang
> machines in a deadlock. Hiding it in an innocent sounding functions
> like gt_cleanup (which _only_ cancels the work item) is not a good
> idea. Also, I'd really prefer more symmetric in our setup/teardown
> code, so if that cancel_work can't be in in disable_gt_powersave, we
> need a good reason for it. I know, lock inversion, but my next point
> will solve that ;-)

Note we have timeouts and such in the cancellation path, so I don't
think there's any way it could hang?

> - you still grab the dev->struct_mutex lock, which means you've only
> moved that 200ms delay to someplace you don't measure it. Most likely
> it'll be right when userspace wants to do a nice 15 frames fade-in for
> the unlock screen, which we'll completely drop on the floor due to
> hogging the execbuf lock for an eternity.

Oh it'll probably happen around VT switch time while we do the silly
dance (that's another thing I'd like to remove).

> Iow we need to add a new dev_priv->rps.lock mutex in a first patch,
> then move things around like you do here. That should also take care
> of any deadlock problems with the work item itself, so you can move
> the cancel_work into disable_gt_powersave right before we shut down
> rps/rc6.

Yeah that would be nicer.  There's really no need to block anyone on
the ring freq and RPS stuff; they can happen totally asynchronously (at
least I'm pretty sure of this, we do have bugs that indicate RPS vs RC6
may be an issue we need to handle, but that's neither here nor there).

-- 
Jesse Barnes, Intel Open Source Technology Center