[Intel-gfx] [PATCH 06/19] drm/vmwgfx: Drop the cursor locking hack

Daniel Vetter daniel at ffwll.ch
Thu Mar 23 12:56:20 UTC 2017


On Thu, Mar 23, 2017 at 11:32:49AM +0100, Thomas Hellstrom wrote:
> On 03/23/2017 11:10 AM, Daniel Vetter wrote:
> > On Thu, Mar 23, 2017 at 09:35:25AM +0100, Thomas Hellstrom wrote:
> >> Hi, Daniel,
> >>
> >> On 03/23/2017 08:31 AM, Daniel Vetter wrote:
> >>> On Thu, Mar 23, 2017 at 08:28:32AM +0100, Daniel Vetter wrote:
> >>>> On Thu, Mar 23, 2017 at 07:22:31AM +0100, Thomas Hellstrom wrote:
> >>>>> On 03/22/2017 10:50 PM, Daniel Vetter wrote:
> >>>>>> It's been around forever, no one bothered to address the FIXME, so I
> >>>>>> presume it's all fine.
> >>>>>>
> >>>>>> Cc: Sinclair Yeh <syeh at vmware.com>
> >>>>>> Cc: Thomas Hellstrom <thellstrom at vmware.com>
> >>>>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> >>>>> NAK. We need to properly address this. Probably as part of the atomic
> >>>>> update.
> >>>> So could someone with vmwgfx understanding explain this? Note that the
> >>>> FIXME was originally added by me years ago, because I wasn't sure (only
> >>>> about 90%) that this is safe, and was essentially pleading for a vmwgfx
> >>>> expert to review this?
> >>>>
> >>>> Since it didn't happen I presume it's not that terrible and probably safe
> >>>> ...
> >>>>
> >>>> I'm still 90% sure that this is correct, but I'd love for a vmwgfx expert to
> >>>> audit it. Replying with a NAK is kinda not the response I was hoping for
> >>>> (and yes I guess I should have explained what's going on here better, but
> >>>> it's just a git blame of the FIXME comment away).
> >> So the code has been left in place because it works. Altering it now
> >> will create unnecessary merge conflicts with the atomic code, and the
> >> change isn't tested and audited which means we need to drop focus from
> >> what we're doing and audit and test code that isn't going to be used
> >> anyway for no apparent reason? But otoh put in the below context there
> >> indeed is a reason.
> >>
> >> From a quick audit of the existing code it seems like at least
> >> vmw_cursor_update_position is touching global device state so I think at
> >> a minimum we need to take a spinlock in that function. Otherwise it
> >> seems to be safe.
> > Note that you're holding the crtc lock already, which gives you exclusion
> > against concurrent page_flips, mode_sets and property changes. Note also
> > that page_flips themselves also only hold the crtc lock, so you can run
> > multiple page_flips in parallel on different crtc (iirc vmwgfx has
> > multiple crtc, if not this discussion is entirely moot).
> >
> > tbh I'd be surprised if my patch really breaks something that hasn't been
> > a pre-existing issue for a long time. The original commit which added this
> > FIXME comment is from 2012. Note also that because it's a hack, you
> > already have a pretty real race with the core drm state keeping, and no
> > one seems to have hit that either.
> >
> > I mean I can dig through vmwgfx code and do the audit, but it'll take a
> > few hours and vmwgfx is its own world, so much harder to understand (for
> > me).
> >
> 
> I'm thinking of the situation when someone would call a cursor_set ioctl
> in parallel for two crtcs at the same time and race writing the position
> registers? Note that the device has only a single global cursor.
> Admittedly the effects of a race would probably be small, but I'd rather
> see it being properly protected.

Hm, didn't realize you only have 1 cursor for everything together. In that
case you indeed have a problem. Not sure why that didn't come up 4 years
ago with the original patch, would be pretty easy to add a quick mutex in
v2 ... Since read-only global state is perfectly fine, having the crtc
lock gives you a read-only global state lock (for legacy drivers at least,
not for atomic).
>
> >> But I prefer if we can do that as part of the atomic update?
> > When does that vmwgfx atomic happen?
> 
> We're targeting 4.12, which means the code that is currently under
> testing will need to be sent out for review pretty soon.
> It's already in our standalone testing repo at
> 
> git://git.freedesktop.org/git/mesa/vmwgfx

Deadline is in 2 weeks for 4.12 feature work, per the discussion we've had
after the 4.11 merge window fallout with Linus. You pretty much have to
submit the patches now to have a reasonable chance of them landing in
time. Since vmwgfx has traditionally been the odd kms driver out I'd
really like to give the new atomic code at least a quick read-through, to
make sure it's aligned as much as possible with the other 20+ atomic
drivers.

> but the cursor code hasn't been fixed in that repo yet.

Well if you switched to universal planes it's pretty easy to fix with the
acquire ctx and grabbing mode_config.connection_mutex. Without that you
can just add a global cursor mutex (equally few lines) to patch it up.
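
As a rough illustration of that second option, here is a minimal userspace
sketch, not vmwgfx code. It uses a pthread mutex where the kernel would
use a struct mutex with mutex_lock()/mutex_unlock(), and every name in it
(cursor_regs, cursor_mutex, cursor_update_position and so on) is
hypothetical. The only point is that one global lock makes the two
position writes a single unit, no matter which crtc the ioctl came in on.

```c
#include <pthread.h>

/* Hypothetical stand-in for the device's single, global cursor position. */
struct cursor_regs {
	int x;
	int y;
};

static struct cursor_regs global_cursor;
/* One lock for the one cursor, shared by all crtcs (assumed design). */
static pthread_mutex_t cursor_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Write both position values under the global cursor lock. */
static void cursor_update_position(int x, int y)
{
	pthread_mutex_lock(&cursor_mutex);
	global_cursor.x = x;
	global_cursor.y = y;
	pthread_mutex_unlock(&cursor_mutex);
}

/* Take a consistent snapshot of the position under the same lock. */
static struct cursor_regs cursor_read_position(void)
{
	struct cursor_regs snap;

	pthread_mutex_lock(&cursor_mutex);
	snap = global_cursor;
	pthread_mutex_unlock(&cursor_mutex);
	return snap;
}

/* Stand-in for cursor ioctls arriving on one crtc: always writes x == y. */
static void *crtc_cursor_worker(void *arg)
{
	int id = *(int *)arg;

	for (int i = 0; i < 100000; i++)
		cursor_update_position(id, id);
	return NULL;
}

/*
 * Run two "crtcs" in parallel while sampling the cursor.
 * Returns 1 if every snapshot was consistent (x == y), 0 otherwise
 * (including when a thread could not be created).
 */
static int run_two_crtc_demo(void)
{
	pthread_t t[2];
	int ids[2] = { 1, 2 };
	int ok = 1;

	if (pthread_create(&t[0], NULL, crtc_cursor_worker, &ids[0]) != 0)
		return 0;
	if (pthread_create(&t[1], NULL, crtc_cursor_worker, &ids[1]) != 0) {
		pthread_join(t[0], NULL);
		return 0;
	}

	for (int i = 0; i < 1000; i++) {
		struct cursor_regs s = cursor_read_position();

		if (s.x != s.y)
			ok = 0;
	}

	pthread_join(t[0], NULL);
	pthread_join(t[1], NULL);
	return ok;
}
```

Without cursor_mutex the two workers could interleave their x and y
writes, which is the same kind of race as two cursor ioctls on different
crtcs writing the shared position registers.
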
> 
> BTW is this blocking some other core drm work you're doing?

Just removing lock_crtc and preventing abuse from spreading. Somehow both
tegra and tilcdc started using it in places it was definitely not meant
for. vmwgfx (with this FIXME here) was the only legit user of this
function. So not high priority really, but something that'd be really nice
to remove from the exported set of functions to prevent future misuse by
new drivers.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch