[PATCH 24/29] mm: vmscan: make global slab shrink lockless

Fri Jun 23 13:10:57 UTC 2023

On 2023/6/23 14:29, Dave Chinner wrote:
> On Thu, Jun 22, 2023 at 05:12:02PM +0200, Vlastimil Babka wrote:
>> On 6/22/23 10:53, Qi Zheng wrote:
>>> @@ -1067,33 +1068,27 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>>>   	if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg))
>>>   		return shrink_slab_memcg(gfp_mask, nid, memcg, priority);
>>>   
>>> -	if (!down_read_trylock(&shrinker_rwsem))
>>> -		goto out;
>>> -
>>> -	list_for_each_entry(shrinker, &shrinker_list, list) {
>>> +	rcu_read_lock();
>>> +	list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
>>>   		struct shrink_control sc = {
>>>   			.gfp_mask = gfp_mask,
>>>   			.nid = nid,
>>>   			.memcg = memcg,
>>>   		};
>>>   
>>> +		if (!shrinker_try_get(shrinker))
>>> +			continue;
>>> +		rcu_read_unlock();
>>
>> I don't think you can do this unlock?
>>
>>> +
>>>   		ret = do_shrink_slab(&sc, shrinker, priority);
>>>   		if (ret == SHRINK_EMPTY)
>>>   			ret = 0;
>>>   		freed += ret;
>>> -		/*
>>> -		 * Bail out if someone want to register a new shrinker to
>>> -		 * prevent the registration from being stalled for long periods
>>> -		 * by parallel ongoing shrinking.
>>> -		 */
>>> -		if (rwsem_is_contended(&shrinker_rwsem)) {
>>> -			freed = freed ? : 1;
>>> -			break;
>>> -		}
>>> -	}
>>>   
>>> -	up_read(&shrinker_rwsem);
>>> -out:
>>> +		rcu_read_lock();
>>
>> That new rcu_read_lock() won't help AFAIK, the whole
>> list_for_each_entry_rcu() needs to be under the single rcu_read_lock() to be
>> safe.
> 
> Yeah, that's the pattern we've been taught and the one we can look
> at and immediately say "this is safe".
> 
> This is a different pattern, as has been explained bi Qi, and I
> think it *might* be safe.
> 
> *However.*
> 
> Right now I don't have time to go through a novel RCU list iteration
> pattern it one step at to determine the correctness of the
> algorithm. I'm mostly worried about list manipulations that can
> occur outside rcu_read_lock() section bleeding into the RCU
> critical section because rcu_read_lock() by itself is not a memory
> barrier.
> 
> Maybe Paul has seen this pattern often enough he could simply tell
> us what conditions it is safe in. But for me to work that out from
> first principles? I just don't have the time to do that right now.

Hi Paul, can you help to confirm this?

> 
>> IIUC this is why Dave in [4] suggests unifying shrink_slab() with
>> shrink_slab_memcg(), as the latter doesn't iterate the list but uses IDR.
> 
> Yes, I suggested the IDR route because radix tree lookups under RCU
> with reference counted objects are a known safe pattern that we can
> easily confirm is correct or not.  Hence I suggested the unification
> + IDR route because it makes the life of reviewers so, so much
> easier...

In fact, I originally planned to try the unification + IDR method you
suggested at the beginning. But in the case of CONFIG_MEMCG disabled,
the struct mem_cgroup is not even defined, and root_mem_cgroup and
shrinker_info will not be allocated. This required more code changes, so
I ended up keeping the shrinker_list and implementing the above pattern.

If the above pattern is not safe, I will go back to the unification +
IDR method.

Thanks,
Qi

> 
> Cheers,
> 
> Dave.