[Intel-gfx] [PATCH 06/43] drm/i915: protect force_wake_(get|put) with the gt_lock

Keith Packard keithp at keithp.com
Thu Jan 5 03:22:41 CET 2012


On Wed, 4 Jan 2012 19:12:57 +0100, Daniel Vetter <daniel at ffwll.ch> wrote:

> The "Correct?" was just to check my understanding of your concern, I still
> think its invalid. You've cut away the second part of my mail where I
> explain why and I don't see you responding to that here. Also
> micro-optimizing the gpu reset code sounds a bit strange.

Sorry, I didn't explain things very well.

Right now, our register access looks like:

        get(struct_mutex);
        if (++forcewake_count == 1)
                force_wake_get()

        value = read32(reg)     or      write32(reg, val)

        if (--forcewake_count == 0)
                force_wake_put();

        /* more register accesses may follow ... */
        put(struct_mutex);

All very sensible, the whole register sequence is covered by
struct_mutex, which ensures that the forcewake is set across the
register access.

The patch does:

        get(spin_lock)
        if (++forcewake_count == 1)
                force_wake_get()
        put(spin_lock)
        value = read32(reg)     or     write32(reg, val)
        get(spin_lock)
        if (--forcewake_count == 0)
                force_wake_put()
        put(spin_lock)

I realize that all current users hold the struct_mutex across this whole
sequence, aside from the reset path, but we're removing that requirement
explicitly (the patch removes the WARN_ON calls).

Without a lock held across the whole sequence, it's easy to see how a
race could occur. We're also dropping and re-acquiring a spinlock with a
single instruction between, which seems wasteful. Instead, we should be
doing:

        get(spin_lock)
        force_wake_get();
        value = read32(reg)     or      write32(reg,val)
        force_wake_put();
        put(spin_lock);
        
No need here to deal with the wake lock at reset time; the whole
operation is atomic wrt interrupts. It's more efficient *and* correct,
without depending on the old struct_mutex locking.

If you want to continue to expose the user-mode wake lock stuff, you
could add:

        get(spin_lock);
        if (!forcewake_count)
                force_wake_get();
        value = read32(reg)     or      write32(reg,val)
        if (!forcewake_count)
                force_wake_put();
        put(spin_lock);

This would require that the reset code also deal with the
forcewake_count to restore the user-mode force wake.

A further optimization would hold the force_wake active for 'a while' to
avoid the extra bus cycles required, but we'd want to see a performance
problem from this before doing that.


-- 
keith.packard at intel.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 827 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/intel-gfx/attachments/20120104/626ccee7/attachment.sig>


More information about the Intel-gfx mailing list