[Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

Daniel Vetter daniel.vetter at ffwll.ch
Tue Jan 3 22:49:52 CET 2012


On Tue, Jan 3, 2012 at 22:13, Keith Packard <keithp at keithp.com> wrote:
> On Tue, 3 Jan 2012 20:12:35 +0100, Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
>
>> I'm a bit confused by this. With the current code forcewake is
>> protected by dev->struct_mutex. Which doesn't work out because we need
>> to access registers that require the forcewake dance from non-process
>> context.
>
> Right, I like adding a spinlock around this to allow it to be called
> without needing to be able to lock the struct_mutex. (I remember
> suggesting that a spinlock would be necessary when the force wake code
> first showed up...)
>
> However, the commit message talks about the error capture and
> hang check code, but doesn't appear to change them at all.
>
> I think all this patch does is replace the locking for forcewake_count
> From struct_mutex to a new irq-safe spinlock, the commit message makes
> it sound like it's actually fixing stuff, which it isn't; it just
> enables fixing stuff in future patches, right?

Nope, current hangcheck blows up, and we have an i-g-t testcase for it
(which the commit msg clearly states). There are also numerous bug
reports where a dying gpu results in tons of
WARN_ON(!mutex_locked(dev->struct_mutex)) noise in dmesg (which drowns
out the gpu hang warning). The locking change fixes this.

> Reading through this a bit more, I think your patch opens up a hole in
> i915_reset. i915_reset takes struct_mutex, then resets the chip and
> restores the forcewake status. If we aren't relying on struct_mutex to
> protect the forcewake bits, then there's nothing preventing a thread
> From accessing the registers with the chip sleeping between the reset
> and the force wake reset.

The patch adds the required locking to i915_reset.

>> Afaik the atomic ops stuff is just ducttape for paranoia reasons.
>
> The atomic ops stuff would allow reading of the value without holding
> struct_mutex, if that were actually useful.

... but is currently unused and inherently racy. Which is why the
patch drops it.
-- 
Daniel Vetter
daniel.vetter at ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch



More information about the Intel-gfx mailing list