[Intel-gfx] [PATCH v3 1/6] cgroup: Allow registration and lookup of cgroup private data
Alexei Starovoitov
ast at fb.com
Tue Mar 13 21:47:45 UTC 2018
On 3/13/18 2:37 PM, Roman Gushchin wrote:
> On Tue, Mar 13, 2018 at 02:27:58PM -0700, Alexei Starovoitov wrote:
>> On 3/13/18 1:50 PM, Tejun Heo wrote:
>>> Hello, Matt.
>>>
>>> cc'ing Roman and Alexei.
>>>
>>> On Tue, Mar 06, 2018 at 03:46:55PM -0800, Matt Roper wrote:
>>>> There are cases where other parts of the kernel may wish to store data
>>>> associated with individual cgroups without building a full cgroup
>>>> controller. Let's add interfaces to allow them to register and lookup
>>>> this private data for individual cgroups.
>>>>
>>>> A kernel system (e.g., a driver) that wishes to register private data
>>>> for a cgroup will do so by subclassing the 'struct cgroup_priv'
>>>> structure to describe the necessary data to store. Before registering a
>>>> private data structure to a cgroup, the caller should fill in the 'key'
>>>> and 'free' fields of the base cgroup_priv structure.
>>>>
>>>> * 'key' should be a unique void* that will act as a key for future
>>>> privdata lookups/removals. Note that this allows drivers to store
>>>> per-device private data for a cgroup by using a device pointer as a key.
>>>>
>>>> * 'free' should be a function pointer to a function that may be used
>>>> to destroy the private data. This function will be called
>>>> automatically if the underlying cgroup is destroyed.
>>>
>>> This feature turned out to have more users than I originally
>>> anticipated and bpf also wants something like this to track network
>>> states. The requirements are pretty similar but not quite the same.
>>> The extra requirements are...
>>>
>>> * Lookup must be really cheap. Probably using pointer hash or walking
>>> list isn't great, so maybe idr based lookup + RCU protected index
>>> table per cgroup?
>>>
>>> * It should support both regular memory and percpu pointers. Given
>>> that what cgroup does is pretty much cgroup:key -> pointer lookup,
>>> it's mostly about getting the interface right so that it's not too
>>> error-prone.
>>
>> from bpf side there should be _zero_ lookups.
>> If bpf do a lookup it can equally use its own map to do that.
>>
>> From bpf program point of view it will look like:
>> struct my_data {
>> u64 a;
>> u32 b;
>> } *ptr;
>> ptr = bpf_get_cgroup_buf(skb, sizeof(struct my_data));
>>
>> bpf_get_cgroup_buf() is lookup-free. Worst case it will do pointer
>> dereferences
>> skb->sk->sk_cgrp_data->val to get to cgroup and from cgroup to get pointer
>> to the buffer.
>
> Having strictly one buffer per-cgroup sounds very limiting.
> What if two independent bpf programs start using it?
>
>> In good case it may be optimized (inlined) by the verifier into absolute
>> address of that cgroup scratch buffer at attach time.
>
> Maybe we can have an idr-based index table (as Tejun suggested above),
> but avoid performance penalty by optimizing the lookup out at the attach time?
it has to be zero lookups. If idr lookup is involved, it's cleaner
to add idr as new bpf map type and use cgroup ino as an id.
More information about the Intel-gfx
mailing list