[Intel-gfx] [PATCH v4.5 1/8] cgroup: Allow registration and lookup of cgroup private data (v3)

Chris Wilson chris at chris-wilson.co.uk
Thu Mar 22 18:04:44 UTC 2018


Quoting Matt Roper (2018-03-21 23:23:37)
> There are cases where other parts of the kernel may wish to store data
> associated with individual cgroups without building a full cgroup
> controller.  Let's add interfaces to allow them to register and lookup
> this private data for individual cgroups.
> 
> A kernel system (e.g., a driver) that wishes to register private data
> for a cgroup should start by obtaining a unique private data key by
> calling cgroup_priv_getkey().  It may then associate private data
> with a cgroup by calling cgroup_priv_install(cgrp, key, ref) where 'ref'
> is a pointer to a kref field inside the private data structure.  The
> private data may later be looked up by calling cgroup_priv_get(cgrp,
> key) to obtain a new reference to the private data.  Private data may be
> unregistered via cgroup_priv_release(cgrp, key).
> 
> If a cgroup is removed, the reference count for all private data objects
> will be decremented.
> 
> v2:  Significant overhaul suggested by Tejun, Alexei, and Roman
>  - Rework interface to make consumers obtain an ida-based key rather
>    than supplying their own arbitrary void*
>  - Internal implementation now uses per-cgroup radixtrees which should
>    allow much faster lookup than the previous hashtable approach
>  - Private data is registered via kref, allowing a single private data
>    structure to potentially be assigned to multiple cgroups.
> 
> v3:
>  - Spare warning fixes (kbuild test robot)
> 
> Cc: Tejun Heo <tj at kernel.org>
> Cc: Alexei Starovoitov <ast at fb.com>
> Cc: Roman Gushchin <guro at fb.com>
> Cc: cgroups at vger.kernel.org
> Signed-off-by: Matt Roper <matthew.d.roper at intel.com>
> 
> fixup! cgroup: Allow registration and lookup of cgroup private data (v2)
> ---
>  include/linux/cgroup-defs.h |  10 +++
>  include/linux/cgroup.h      |   7 ++
>  kernel/cgroup/cgroup.c      | 183 +++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 198 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
> index 9f242b876fde..9086d963cc0a 100644
> --- a/include/linux/cgroup-defs.h
> +++ b/include/linux/cgroup-defs.h
> @@ -427,6 +427,16 @@ struct cgroup {
>         /* used to store eBPF programs */
>         struct cgroup_bpf bpf;
>  
> +       /*
> +        * cgroup private data registered by other non-controller parts of the
> +        * kernel.  Insertions are protected by privdata_lock, lookups by
> +        * rcu_read_lock().
> +        */
> +       struct radix_tree_root privdata;
> +
> +       /* Protect against concurrent insertions/deletions to privdata */
> +       spinlock_t privdata_lock;
> +
>         /* ids of the ancestors at each level including self */
>         int ancestor_ids[];
>  };
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 473e0c0abb86..63d22dfa00bd 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -833,4 +833,11 @@ static inline void put_cgroup_ns(struct cgroup_namespace *ns)
>                 free_cgroup_ns(ns);
>  }
>  
> +/* cgroup private data handling */
> +int cgroup_priv_getkey(void (*free)(struct kref *));
> +void cgroup_priv_destroykey(int key);
> +int cgroup_priv_install(struct cgroup *cgrp, int key, struct kref *ref);
> +struct kref *cgroup_priv_get(struct cgroup *cgrp, int key);
> +void cgroup_priv_release(struct cgroup *cgrp, int key);
> +
>  #endif /* _LINUX_CGROUP_H */
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 8cda3bc3ae22..3268a21e8158 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -81,8 +81,15 @@ EXPORT_SYMBOL_GPL(css_set_lock);
>  #endif
>  
>  /*
> - * Protects cgroup_idr and css_idr so that IDs can be released without
> - * grabbing cgroup_mutex.
> + * ID allocator for cgroup private data keys; the ID's allocated here will
> + * be used to index all per-cgroup radix trees.  The radix tree built into
> + * the IDR itself will store a key-specific function to be passed to kref_put.
> + */
> +static DEFINE_IDR(cgroup_priv_idr);

Fun two radixtrees, one to hold the (*free)() and the other the void*.

Would not just a
struct cgroup_privdata {
	struct rcu_head rcu;
	void (*free)(struct kref *);
}
suffice for subclassing? (Also this is a prime candidate for switching
to XArray.)

> +struct kref *
> +cgroup_priv_get(struct cgroup *cgrp, int key)
> +{
> +       struct kref *ref;
> +
> +       WARN_ON(cgrp->root != &cgrp_dfl_root);
> +       WARN_ON(key == 0);
> +
> +       rcu_read_lock();
> +
> +       ref = radix_tree_lookup(&cgrp->privdata, key);
> +       if (ref)
> +               kref_get(ref);

This is not safe, you need

if (ref && !kref_get_unless_zero(ref))
	ref = NULL;

otherwise the cgroup_priv_release() may drop the last reference prior to
you obtaining yours. Also requires the privdata to be RCU protected.
-Chris


More information about the Intel-gfx mailing list