[PATCH 6/6] drm/tinydrm: Support device unplug

Mon Sep 4 16:39:07 UTC 2017

On Mon, Sep 4, 2017 at 5:54 PM, Noralf Trønnes <noralf at tronnes.org> wrote:
>
> Den 04.09.2017 17.20, skrev Daniel Vetter:
>>
>> On Mon, Sep 04, 2017 at 02:30:05PM +0200, Noralf Trønnes wrote:
>>>
>>> Den 04.09.2017 11.04, skrev Daniel Vetter:
>>>>
>>>> On Mon, Sep 04, 2017 at 11:41:05AM +0300, Laurent Pinchart wrote:
>>>>>
>>>>> Hi Daniel,
>>>>>
>>>>> On Monday, 4 September 2017 10:26:15 EEST Daniel Vetter wrote:
>>>>>>
>>>>>> On Thu, Aug 31, 2017 at 09:22:03PM +0200, Noralf Trønnes wrote:
>>>>>>>
>>>>>>> Den 31.08.2017 14.59, skrev Laurent Pinchart:
>>>>>>>>
>>>>>>>> On Wednesday, 30 August 2017 20:18:49 EEST Daniel Vetter wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Aug 30, 2017 at 6:31 PM, Noralf Trønnes wrote:
>>>>>>>>>>
>>>>>>>>>> Den 28.08.2017 23.56, skrev Daniel Vetter:
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Aug 28, 2017 at 07:17:48PM +0200, Noralf Trønnes wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> +
>>>>>>>>>>>> +       drm_fbdev_cma_dev_unplug(tdev->fbdev_cma);
>>>>>>>>>>>> +       drm_dev_unplug(&tdev->drm);
>>>>>>>>>>>> +
>>>>>>>>>>>> +       /* Make sure framebuffer flushing is done */
>>>>>>>>>>>> +       mutex_lock(&tdev->dirty_lock);
>>>>>>>>>>>> +       mutex_unlock(&tdev->dirty_lock);
>>>>>>>>>>>
>>>>>>>>>>> Is this really needed? Or, doesn't it just paper over a driver
>>>>>>>>>>> bug
>>>>>>>>>>> you have already anyway, since native kms userspace can directly
>>>>>>>>>>> call fb->funcs->dirty too, and you already protect against that.
>>>>>>>>>>>
>>>>>>>>>>> This definitely looks like the fbdev helper is leaking
>>>>>>>>>>> implementation details to callers where it shouldn't do that.
>>>>>>>>>>
>>>>>>>>>> Flushing can happen while drm_dev_unplug() is called, and when we
>>>>>>>>>> leave this function the device facing resources controlled by
>>>>>>>>>> devres
>>>>>>>>>> will be removed. Thus I have to make sure any such flushing is
>>>>>>>>>> done
>>>>>>>>>> before leaving so the next flush is stopped by the
>>>>>>>>>> drm_dev_is_unplugged() check. I don't see any other way of
>>>>>>>>>> ensuring
>>>>>>>>>> that.
>>>>>>>>>>
>>>>>>>>>> I see now that I should move the call to
>>>>>>>>>> drm_atomic_helper_shutdown() after drm_dev_unplug() to properly
>>>>>>>>>> protect the pipe .enable/.disable callbacks.
>>>>>>>>>
>>>>>>>>> Hm, calling _shutdown when the hw is gone already won't end well.
>>>>>>>>> Fundamentally this race exists for all use-cases, and I'm somewhat
>>>>>>>>> leaning towards plugging it in the core.
>>>>>>>>>
>>>>>>>>> The general solution probably involves something that smells a lot
>>>>>>>>> like srcu, i.e. at every possible entry point into a drm driver
>>>>>>>>> (ioctl, fbdev, dma-buf sharing, everything really) we take that
>>>>>>>>> super-cheap read-side look, and drop it when we leave.
>>>>>>>>
>>>>>>>> That's similar to what we plan to do in V4L2. The idea is to set a
>>>>>>>> device removed flag at the beginning of the .remove() handler and
>>>>>>>> wait
>>>>>>>> for all pending operations to complete. The core will reject any new
>>>>>>>> operation when the flag is set. To wait for completion, every entry
>>>>>>>> point would increase a use count, and decrease it on exit. When the
>>>>>>>> use
>>>>>>>> count is decreased to 0 waiters will be woken up. This should solve
>>>>>>>> the
>>>>>>>> unplug/user race.
>>>>>>>
>>>>>>> Ah, such a simple solution, easy to understand and difficult to get
>>>>>>> wrong!
>>>>>>> And it's even nestable, no danger of deadlocking.
>>>>>>>
>>>>>>> Maybe I can use it with tinydrm:
>>>>>>>
>>>>>>> * @dev_use: Tracks use of functions acessing the parent device.
>>>>>>> *           If it is zero, the device is gone. See ...
>>>>>>> struct tinydrm_device {
>>>>>>>       atomic_t dev_use;
>>>>>>> };
>>>>>>>
>>>>>>> /**
>>>>>>>    * tinydrm_dev_enter - Enter device accessing function
>>>>>>>    * @tdev: tinydrm device
>>>>>>>    *
>>>>>>>    * This function protects against using a device and it's resources
>>>>>>> after
>>>>>>>    * it's removed. Should be called at the beginning of the function.
>>>>>>>    *
>>>>>>>    * Returns:
>>>>>>>    * False if the device is still present, true if it is gone.
>>>>>>>    */
>>>>>>> static inline bool tinydrm_dev_enter(struct tinydrm_device *tdev)
>>>>>>> {
>>>>>>>       return !atomic_inc_not_zero(&tdev->dev_use);
>>>>>>> }
>>>>>>>
>>>>>>> static inline void tinydrm_dev_exit(struct tinydrm_device *tdev)
>>>>>>> {
>>>>>>>       atomic_dec(&tdev->dev_use);
>>>>>>> }
>>>>>>>
>>>>>>> static inline bool tinydrm_is_unplugged(struct tinydrm_device *tdev)
>>>>>>> {
>>>>>>>       bool ret = !atomic_read(&tdev->dev_use);
>>>>>>>       smp_rmb();
>>>>>>>       return ret;
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> static int tinydrm_init(...)
>>>>>>> {
>>>>>>>       /* initialize */
>>>>>>>
>>>>>>>       /* Set device is present */
>>>>>>>       atomic_set(&tdev->dev_use, 1);
>>>>>>> }
>>>>>>>
>>>>>>> static void tinydrm_unregister(...)
>>>>>>> {
>>>>>>>       /* Set device gone */
>>>>>>>       atomic_dec(&tdev->dev_use);
>>>>>>>
>>>>>>>       /* Wait for all device facing functions to finish */
>>>>>>>       while (!tinydrm_is_unplugged(tdev)) {
>>>>>>>           cond_resched();
>>>>>>>       }
>>>>>>>
>>>>>>>       /* proceed with unregistering */
>>>>>>> }
>>>>>>>
>>>>>>> static int mipi_dbi_fb_dirty(...)
>>>>>>> {
>>>>>>>       if (tinydrm_dev_enter(tdev))
>>>>>>>           return -ENODEV;
>>>>>>>
>>>>>>>       /* flush framebuffer */
>>>>>>>
>>>>>>>       tinydrm_dev_exit(tdev);
>>>>>>> }
>>>>>>
>>>>>> Yup, expect imo this should be done in drm core (as much as possible,
>>>>>> i.e.
>>>>>> for ioctl at least, standard sysfs), plus then enter/exit exported to
>>>>>> drivers for their stuff.
>>>>>>
>>>>>> And it really needs to be srcu. For kms drivers the atomic_inc/dec
>>>>>> wont
>>>>>> matter, for render drivers it will show up (some of our ioctls are
>>>>>> 100%
>>>>>> lock less in the fastpath, not a single atomic op in there).
>>>>>
>>>>> Don't forget that we need to check the disconnect flag when entering
>>>>> ioctls to
>>>>> return -ENODEV. I'm open to clever solutions, but I'd first aim for
>>>>> correctness with a lock and then replace the internal implementation.
>>>>
>>>> As Noralf pointed out, we already check for drm_dev_is_unplugged().
>>>> Maybe
>>>> not in all places, but that can be fixed.
>>>>
>>>> And you can't first make all ioctl slower and then fix it up, at least
>>>> not
>>>> spread over multiple patch series. I guess for developing, doing the
>>>> simpler atomic counter first is ok. But the same patch series needs to
>>>> move over to srcu at the end.
>>>
>>> I've never used srcu/rcu before, care to explain a little bit more?
>>> Is it drm_device we are protecting here?
>>>
>>> How can srcu be better for the fast path you're talking about if it's
>>> half of srcu (assuming that makes srcu slower)?
>>
>> We're just protecting drm_device->unplugged with srcu, but the great thing
>> is that srcu_read_lock/unlock are extremely cheap, and still guarantee
>> that we can fully synchronize with writers.
>
>
> Ok, so what makes this cheap is this_cpu_inc(), which on modern cpu's is
> a single fast instruction?

Yup. Atomics, locks and stuff like that usually require that the cpu
flushes it entire pipeline, which is really expensive.

> static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
> {
>     int retval;
>
>     retval = __srcu_read_lock(sp);
>     rcu_lock_acquire(&(sp)->dep_map);
>     return retval;
> }
>
> int __srcu_read_lock(struct srcu_struct *sp)
> {
>     int idx;
>
>     idx = READ_ONCE(sp->srcu_idx) & 0x1;
>     this_cpu_inc(sp->sda->srcu_lock_count[idx]);
>     smp_mb(); /* B */  /* Avoid leaking the critical section. */
>     return idx;
> }
>
>> Here's the rough sketch:
>>
>> /* srcu is real cheap on the read side, we can have one for all of drm
>> DEFINE_STATIC_SRCU(drm_unplug_srcu);
>>
>> drm_dev_enter()
>> {
>>         srcu_read_lock(&drm_srcu);
>> }
>
>
> When entering I want to check at the same time. Is this OK?
>
> bool drm_dev_enter(struct drm_device *dev)
> {
>     srcu_read_lock(&drm_srcu);
>     if (drm_dev_is_unplugged(dev)) {
>         srcu_read_unlock(&drm_srcu);
>         return false;
>     }
>
>     return true;
> }

Yeah that makes sense. Still please keep the locking check in
_is_unplugged() (but maybe we want it to be lockdep_assert_held, so
it's compiled away to nothing for production builds).
-Daniel

>
> Noralf.
>
>
>> drm_dev_exit()
>> {
>>         srcu_read_unlock(&drm_srcu);
>> }
>>
>> drm_dev_is_unplugged()
>> {
>>         /* checking unplugged state outside of holding the srcu read side
>>          * lock is racy, catch this. */
>>         WARN_ON(!srcu_read_lock_head(&drm_unplug_srcu));
>>
>>         /* with rcu we can demote this to a simple read, no need for any
>>          * atomic or memory barries
>>         return dev->unplugged;
>> }
>>
>> drm_dev_unplug()
>> {
>>         dev->unplugged = true;
>>
>>         /* This is were the real magic is. After it finished any critical
>>          * read section is guaranteed to see the new value of ->unplugged,
>>          * and any critical section which might still have seen the old
>>          * value of ->unplugged is guaranteed to have finished.
>>          * It also takes like forever to complete, but that's kinda the
>>          * point. */
>>         synchronize_srcu(&drm_unplug_srcu);
>> }
>>
>> Excercise for the reader is poking holes into the simpler atomic_t based
>> approach and highlighting all the races. You probably need to reach the
>> complexity of srcu until it's really race free (and fast).
>>
>> Oh, one more: please make sure you enable CONFIG_PROVE_LOCKING and
>> especilly all the rcu options in there when working on this, to make sure
>> we catch all the deadlocks and bugs. This is some really tricky locking
>> stuff.
>>
>> Cheers, Daniel
>
>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch