[PATCH 6/6] drm/tinydrm: Support device unplug

Mon Sep 4 15:54:46 UTC 2017

Den 04.09.2017 17.20, skrev Daniel Vetter:
> On Mon, Sep 04, 2017 at 02:30:05PM +0200, Noralf Trønnes wrote:
>> Den 04.09.2017 11.04, skrev Daniel Vetter:
>>> On Mon, Sep 04, 2017 at 11:41:05AM +0300, Laurent Pinchart wrote:
>>>> Hi Daniel,
>>>>
>>>> On Monday, 4 September 2017 10:26:15 EEST Daniel Vetter wrote:
>>>>> On Thu, Aug 31, 2017 at 09:22:03PM +0200, Noralf Trønnes wrote:
>>>>>> Den 31.08.2017 14.59, skrev Laurent Pinchart:
>>>>>>> On Wednesday, 30 August 2017 20:18:49 EEST Daniel Vetter wrote:
>>>>>>>> On Wed, Aug 30, 2017 at 6:31 PM, Noralf Trønnes wrote:
>>>>>>>>> Den 28.08.2017 23.56, skrev Daniel Vetter:
>>>>>>>>>> On Mon, Aug 28, 2017 at 07:17:48PM +0200, Noralf Trønnes wrote:
>>>>>>>>>>> +
>>>>>>>>>>> +       drm_fbdev_cma_dev_unplug(tdev->fbdev_cma);
>>>>>>>>>>> +       drm_dev_unplug(&tdev->drm);
>>>>>>>>>>> +
>>>>>>>>>>> +       /* Make sure framebuffer flushing is done */
>>>>>>>>>>> +       mutex_lock(&tdev->dirty_lock);
>>>>>>>>>>> +       mutex_unlock(&tdev->dirty_lock);
>>>>>>>>>> Is this really needed? Or, doesn't it just paper over a driver bug
>>>>>>>>>> you have already anyway, since native kms userspace can directly
>>>>>>>>>> call fb->funcs->dirty too, and you already protect against that.
>>>>>>>>>>
>>>>>>>>>> This definitely looks like the fbdev helper is leaking
>>>>>>>>>> implementation details to callers where it shouldn't do that.
>>>>>>>>> Flushing can happen while drm_dev_unplug() is called, and when we
>>>>>>>>> leave this function the device facing resources controlled by devres
>>>>>>>>> will be removed. Thus I have to make sure any such flushing is done
>>>>>>>>> before leaving so the next flush is stopped by the
>>>>>>>>> drm_dev_is_unplugged() check. I don't see any other way of ensuring
>>>>>>>>> that.
>>>>>>>>>
>>>>>>>>> I see now that I should move the call to
>>>>>>>>> drm_atomic_helper_shutdown() after drm_dev_unplug() to properly
>>>>>>>>> protect the pipe .enable/.disable callbacks.
>>>>>>>> Hm, calling _shutdown when the hw is gone already won't end well.
>>>>>>>> Fundamentally this race exists for all use-cases, and I'm somewhat
>>>>>>>> leaning towards plugging it in the core.
>>>>>>>>
>>>>>>>> The general solution probably involves something that smells a lot
>>>>>>>> like srcu, i.e. at every possible entry point into a drm driver
>>>>>>>> (ioctl, fbdev, dma-buf sharing, everything really) we take that
>>>>>>>> super-cheap read-side look, and drop it when we leave.
>>>>>>> That's similar to what we plan to do in V4L2. The idea is to set a
>>>>>>> device removed flag at the beginning of the .remove() handler and wait
>>>>>>> for all pending operations to complete. The core will reject any new
>>>>>>> operation when the flag is set. To wait for completion, every entry
>>>>>>> point would increase a use count, and decrease it on exit. When the use
>>>>>>> count is decreased to 0 waiters will be woken up. This should solve the
>>>>>>> unplug/user race.
>>>>>> Ah, such a simple solution, easy to understand and difficult to get wrong!
>>>>>> And it's even nestable, no danger of deadlocking.
>>>>>>
>>>>>> Maybe I can use it with tinydrm:
>>>>>>
>>>>>> * @dev_use: Tracks use of functions acessing the parent device.
>>>>>> *           If it is zero, the device is gone. See ...
>>>>>> struct tinydrm_device {
>>>>>>       atomic_t dev_use;
>>>>>> };
>>>>>>
>>>>>> /**
>>>>>>    * tinydrm_dev_enter - Enter device accessing function
>>>>>>    * @tdev: tinydrm device
>>>>>>    *
>>>>>>    * This function protects against using a device and it's resources after
>>>>>>    * it's removed. Should be called at the beginning of the function.
>>>>>>    *
>>>>>>    * Returns:
>>>>>>    * False if the device is still present, true if it is gone.
>>>>>>    */
>>>>>> static inline bool tinydrm_dev_enter(struct tinydrm_device *tdev)
>>>>>> {
>>>>>>       return !atomic_inc_not_zero(&tdev->dev_use);
>>>>>> }
>>>>>>
>>>>>> static inline void tinydrm_dev_exit(struct tinydrm_device *tdev)
>>>>>> {
>>>>>>       atomic_dec(&tdev->dev_use);
>>>>>> }
>>>>>>
>>>>>> static inline bool tinydrm_is_unplugged(struct tinydrm_device *tdev)
>>>>>> {
>>>>>>       bool ret = !atomic_read(&tdev->dev_use);
>>>>>>       smp_rmb();
>>>>>>       return ret;
>>>>>> }
>>>>>>
>>>>>>
>>>>>> static int tinydrm_init(...)
>>>>>> {
>>>>>>       /* initialize */
>>>>>>
>>>>>>       /* Set device is present */
>>>>>>       atomic_set(&tdev->dev_use, 1);
>>>>>> }
>>>>>>
>>>>>> static void tinydrm_unregister(...)
>>>>>> {
>>>>>>       /* Set device gone */
>>>>>>       atomic_dec(&tdev->dev_use);
>>>>>>
>>>>>>       /* Wait for all device facing functions to finish */
>>>>>>       while (!tinydrm_is_unplugged(tdev)) {
>>>>>>           cond_resched();
>>>>>>       }
>>>>>>
>>>>>>       /* proceed with unregistering */
>>>>>> }
>>>>>>
>>>>>> static int mipi_dbi_fb_dirty(...)
>>>>>> {
>>>>>>       if (tinydrm_dev_enter(tdev))
>>>>>>           return -ENODEV;
>>>>>>
>>>>>>       /* flush framebuffer */
>>>>>>
>>>>>>       tinydrm_dev_exit(tdev);
>>>>>> }
>>>>> Yup, expect imo this should be done in drm core (as much as possible, i.e.
>>>>> for ioctl at least, standard sysfs), plus then enter/exit exported to
>>>>> drivers for their stuff.
>>>>>
>>>>> And it really needs to be srcu. For kms drivers the atomic_inc/dec wont
>>>>> matter, for render drivers it will show up (some of our ioctls are 100%
>>>>> lock less in the fastpath, not a single atomic op in there).
>>>> Don't forget that we need to check the disconnect flag when entering ioctls to
>>>> return -ENODEV. I'm open to clever solutions, but I'd first aim for
>>>> correctness with a lock and then replace the internal implementation.
>>> As Noralf pointed out, we already check for drm_dev_is_unplugged(). Maybe
>>> not in all places, but that can be fixed.
>>>
>>> And you can't first make all ioctl slower and then fix it up, at least not
>>> spread over multiple patch series. I guess for developing, doing the
>>> simpler atomic counter first is ok. But the same patch series needs to
>>> move over to srcu at the end.
>> I've never used srcu/rcu before, care to explain a little bit more?
>> Is it drm_device we are protecting here?
>>
>> How can srcu be better for the fast path you're talking about if it's
>> half of srcu (assuming that makes srcu slower)?
> We're just protecting drm_device->unplugged with srcu, but the great thing
> is that srcu_read_lock/unlock are extremely cheap, and still guarantee
> that we can fully synchronize with writers.

Ok, so what makes this cheap is this_cpu_inc(), which on modern cpu's is
a single fast instruction?

static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
{
     int retval;

     retval = __srcu_read_lock(sp);
     rcu_lock_acquire(&(sp)->dep_map);
     return retval;
}

int __srcu_read_lock(struct srcu_struct *sp)
{
     int idx;

     idx = READ_ONCE(sp->srcu_idx) & 0x1;
     this_cpu_inc(sp->sda->srcu_lock_count[idx]);
     smp_mb(); /* B */  /* Avoid leaking the critical section. */
     return idx;
}

> Here's the rough sketch:
>
> /* srcu is real cheap on the read side, we can have one for all of drm
> DEFINE_STATIC_SRCU(drm_unplug_srcu);
>
> drm_dev_enter()
> {
> 	srcu_read_lock(&drm_srcu);
> }

When entering I want to check at the same time. Is this OK?

bool drm_dev_enter(struct drm_device *dev)
{
     srcu_read_lock(&drm_srcu);
     if (drm_dev_is_unplugged(dev)) {
         srcu_read_unlock(&drm_srcu);
         return false;
     }

     return true;
}

Noralf.

> drm_dev_exit()
> {
> 	srcu_read_unlock(&drm_srcu);
> }
>
> drm_dev_is_unplugged()
> {
> 	/* checking unplugged state outside of holding the srcu read side
> 	 * lock is racy, catch this. */
> 	WARN_ON(!srcu_read_lock_head(&drm_unplug_srcu));
>
> 	/* with rcu we can demote this to a simple read, no need for any
> 	 * atomic or memory barries
> 	return dev->unplugged;
> }
>
> drm_dev_unplug()
> {
> 	dev->unplugged = true;
>
> 	/* This is were the real magic is. After it finished any critical
> 	 * read section is guaranteed to see the new value of ->unplugged,
> 	 * and any critical section which might still have seen the old
> 	 * value of ->unplugged is guaranteed to have finished.
> 	 * It also takes like forever to complete, but that's kinda the
> 	 * point. */
> 	synchronize_srcu(&drm_unplug_srcu);
> }
>
> Excercise for the reader is poking holes into the simpler atomic_t based
> approach and highlighting all the races. You probably need to reach the
> complexity of srcu until it's really race free (and fast).
>
> Oh, one more: please make sure you enable CONFIG_PROVE_LOCKING and
> especilly all the rcu options in there when working on this, to make sure
> we catch all the deadlocks and bugs. This is some really tricky locking
> stuff.
>
> Cheers, Daniel