[Intel-gfx] [PATCH 05/13] drm: locking&new iterators for connector_list

Wed Dec 14 08:35:21 UTC 2016

On Wed, Dec 14, 2016 at 12:08:06AM +0100, Daniel Vetter wrote:
> The requirements for connector_list locking are a bit tricky:
> - We need to be able to jump over zombie conectors (i.e. with refcount
>   == 0, but not yet removed from the list). If instead we require that
>   there's no zombies on the list then the final kref_put must happen
>   under the list protection lock, which means that locking context
>   leaks all over the place. Not pretty - better to deal with zombies
>   and wrap the locking just around the list_del in the destructor.
> 
> - When we walk the list we must _not_ hold the connector list lock. We
>   walk the connector list at an absolutely massive amounts of places,
>   if all those places can't ever call drm_connector_unreference the
>   code would get unecessarily complicated.
> 
> - connector_list needs it own lock, again too many places that walk it
>   that we could reuse e.g. mode_config.mutex without resulting in
>   inversions.
> 
> - Lots of code uses these loops to look-up a connector, i.e. they want
>   to be able to call drm_connector_reference. But on the other hand we
>   want connectors to stay on that list until they're dead (i.e.
>   connector_list can't hold a full reference), which means despite the
>   "can't hold lock for the loop body" rule we need to make sure a
>   connector doesn't suddenly become a zombie.
> 
> At first Dave&I discussed various horror-show approaches using srcu,
> but turns out it's fairly easy:
> 
> - For the loop body we always hold an additional reference to the
>   current connector. That means it can't zombify, and it also means
>   it'll stay on the list, which means we can use it as our iterator to
>   find the next connector.
> 
> - When we try to find the next connector we only have to jump over
>   zombies. To make sure we don't chase bad pointers that entire loop
>   is protected with the new connect_list_lock spinlock. And because we
>   know that we're starting out with a non-zombie (need to drop our
>   reference for the old connector only after we have our new one),
>   we're guranteed to still be on the connector_list and either find
>   the next non-zombie or complete the iteration.
> 
> - Only downside is that we need to make sure that the temporary
>   reference for the loop body doesn't leak. iter_get/put() functions +
>   lockdep make sure that's the case.
> 
> - To avoid a flag day the new iterator macro has an _iter postfix. We
>   can rename it back once all the users of the unsafe version are gone
>   (there's about 100 list walkers for the connector_list).
> 
> For now this patch only converts all the list walking in the core,
> leaving helpers and drivers for later patches. The nice thing is that
> we can now finally remove 2 FIXME comments from the
> register/unregister functions.
> 
> v2:
> - use irqsafe spinlocks, so that we can use this in drm_state_dump
>   too.
> - nuke drm_modeset_lock_all from drm_connector_init, now entirely
>   cargo-culted nonsense.
> 
> v3:
> - do {} while (!kref_get_unless_zero), makes for a tidier loop (Dave).
> - pretty kerneldoc
> - add EXPORT_SYMBOL, helpers&drivers are supposed to use this.
> 
> v4: Change lockdep annotations to only check whether we release the
> iter fake lock again (i.e. make sure that iter_put is called), but
> not check any locking dependecies itself. That seams to require a
> recursive read lock in trylock mode.
> 
> Cc: Dave Airlie <airlied at gmail.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> ---
>  drivers/gpu/drm/drm_atomic.c      |  14 ++++-
>  drivers/gpu/drm/drm_connector.c   | 116 ++++++++++++++++++++++++++++++++------
>  drivers/gpu/drm/drm_encoder.c     |   6 +-
>  drivers/gpu/drm/drm_mode_config.c |  34 +++++------
>  include/drm/drm_connector.h       |  38 +++++++++++++
>  include/drm/drm_mode_config.h     |  12 +++-
>  6 files changed, 177 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> index 60697482b94c..b23b4abd67be 100644
>  int drm_connector_register_all(struct drm_device *dev)
>  {
>  	struct drm_connector *connector;
> -	int ret;
> +	struct drm_connector_list_iter conn_iter;
> +	int ret = 0;
>  
> -	/* FIXME: taking the mode config mutex ends up in a clash with
> -	 * fbcon/backlight registration */
> -	list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
> +	drm_connector_list_iter_get(dev, &conn_iter);
> +	drm_for_each_connector_iter(connector, &conn_iter) {
>  		ret = drm_connector_register(connector);
>  		if (ret)
> -			goto err;
> +			break;
>  	}
> +	drm_connector_list_iter_put(&conn_iter);
>  
> -	return 0;
> -
> -err:
> -	mutex_unlock(&dev->mode_config.mutex);
> -	drm_connector_unregister_all(dev);
> +	if (ret)
> +		drm_connector_unregister_all(dev);

Just poked my head up here, because the thought of a hotplug connector
register vs an error here made my head hurt. I think the only way to
solve that (and general initialisation vs hpd) is to ensure/document no
hpd until after the registration phase.

The locking looks solid, the list is guarded by the spinlock, iterators
hold a reference, and the kref_unless_zero vs the list is guarded by the
spinlock inside connector release.

Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>

> --- a/drivers/gpu/drm/drm_atomic.c
> +++ b/drivers/gpu/drm/drm_atomic.c
> @@ -1417,6 +1417,7 @@ drm_atomic_add_affected_connectors(struct drm_atomic_state *state,
>  	struct drm_mode_config *config = &state->dev->mode_config;
>  	struct drm_connector *connector;
>  	struct drm_connector_state *conn_state;
> +	struct drm_connector_list_iter conn_iter;
>  	int ret;
>  
>  	ret = drm_modeset_lock(&config->connection_mutex, state->acquire_ctx);
> @@ -1430,14 +1431,18 @@ drm_atomic_add_affected_connectors(struct drm_atomic_state *state,
>  	 * Changed connectors are already in @state, so only need to look at the
>  	 * current configuration.
>  	 */
> -	drm_for_each_connector(connector, state->dev) {
> +	drm_connector_list_iter_get(state->dev, &conn_iter);
> +	drm_for_each_connector_iter(connector, &conn_iter) {
>  		if (connector->state->crtc != crtc)
>  			continue;
>  
>  		conn_state = drm_atomic_get_connector_state(state, connector);
> -		if (IS_ERR(conn_state))
> +		if (IS_ERR(conn_state)) {
> +			drm_connector_list_iter_put(&conn_iter);
>  			return PTR_ERR(conn_state);

I wuold have gone with ret = PTR_ERR(conn_state); break; just because I
don't like have more paths through the critical section than necessary
(i.e. more than one unlock).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre