[Intel-gfx] [PATCH v2] drm/i915: Shrink the GEM kmem_caches upon idling

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Tue Jan 16 15:16:28 UTC 2018


On 16/01/2018 15:12, Tvrtko Ursulin wrote:
> 
> On 16/01/2018 13:05, Chris Wilson wrote:
>> When we finally decide the gpu is idle, that is a good time to shrink
>> our kmem_caches.
>>
>> v2: Comment upon the random sprinkling of rcu_barrier() inside the idle
>> worker.
>>
>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem.c | 30 ++++++++++++++++++++++++++++++
>>   1 file changed, 30 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 335731c93b4a..61b13fdfaa71 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -4716,6 +4716,21 @@ i915_gem_retire_work_handler(struct work_struct *work)
>>       }
>>   }
>> +static void shrink_caches(struct drm_i915_private *i915)
>> +{
>> +    /*
>> +     * kmem_cache_shrink() discards empty slabs and reorders partially
>> +     * filled slabs to prioritise allocating from the mostly full slabs,
>> +     * with the aim of reducing fragmentation.
>> +     */
>> +    kmem_cache_shrink(i915->priorities);
>> +    kmem_cache_shrink(i915->dependencies);
>> +    kmem_cache_shrink(i915->requests);
>> +    kmem_cache_shrink(i915->luts);
>> +    kmem_cache_shrink(i915->vmas);
>> +    kmem_cache_shrink(i915->objects);
>> +}
>> +
>>   static inline bool
>>   new_requests_since_last_retire(const struct drm_i915_private *i915)
>>   {
>> @@ -4803,6 +4818,21 @@ i915_gem_idle_work_handler(struct work_struct *work)
>>           GEM_BUG_ON(!dev_priv->gt.awake);
>>           i915_queue_hangcheck(dev_priv);
>>       }
>> +
>> +    /*
>> +     * We use magical TYPESAFE_BY_RCU kmem_caches whose pages are not
>> +     * returned to the system immediately but only after an RCU grace
>> +     * period. We want to encourage such pages to be returned and so
>> +     * incorporate an RCU barrier here to provide some rate limiting
>> +     * of the driver and flush the old pages before we free a new batch
>> +     * from the next round of shrinking.
>> +     */
>> +    rcu_barrier();
> 
> Should this go into the conditional below? I don't think it makes a 
> difference effectively, but it may be more logical.
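> 
> For illustration, something along these lines (untested sketch, just 
> re-using the names from the patch):
> 
>     if (!new_requests_since_last_retire(dev_priv)) {
>         /* Flush the RCU-deferred slab pages before we shrink. */
>         rcu_barrier();
> 
>         __i915_gem_free_work(&dev_priv->mm.free_work);
>         shrink_caches(dev_priv);
>     }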
> 
>> +
>> +    if (!new_requests_since_last_retire(dev_priv)) {
>> +        __i915_gem_free_work(&dev_priv->mm.free_work);
> 
> I thought for a bit about whether re-using the worker from here is 
> completely fine, and I think it is. We expect only one pass when called 
> from here, so need_resched will be correctly neutralized/not relevant 
> on this path. Hm, unless we consider mmap_gtt users... then we could 
> still have new objects appearing on the free_list after the 1st pass, 
> and need_resched might kick us out. What do you think?
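> 
> To spell the worry out: if the worker drains the free_list with a loop 
> roughly like this (hypothetical sketch, assuming an llist-style 
> free_list; free_objects() is just a stand-in for the real free path, 
> not the actual body of __i915_gem_free_work):
> 
>     struct llist_node *freed;
> 
>     while ((freed = llist_del_all(&i915->mm.free_list))) {
>         free_objects(i915, freed);
>         if (need_resched())
>             return; /* bail, leaving later additions queued */
>     }
> 
> then objects freed concurrently (e.g. by mmap_gtt users) after the 
> first pass would stay on the list until the next worker run.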

This also ties back to what I wrote in the earlier reply - do we want to 
shrink the obj and vma caches from here? It may collide with mmap_gtt 
operations. But it sounds appealing to tidy them, and I can't think of 
any other convenient point. Given how we are de-prioritising mmap_gtt, 
it's probably fine.
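
For reference, my mental model of the mechanism the patch relies on - an 
illustrative sketch only, with a made-up "foo" cache; kmem_cache_create(), 
SLAB_TYPESAFE_BY_RCU, rcu_barrier() and kmem_cache_shrink() are the 
generic slab/RCU APIs:

    #include <linux/slab.h>
    #include <linux/rcupdate.h>

    struct foo { int x; };

    static void shrink_foo_cache_example(void)
    {
        struct kmem_cache *foo_cache;

        /*
         * Pages backing a TYPESAFE_BY_RCU cache are only returned to
         * the system after an RCU grace period, so lockless readers
         * may still dereference freed objects until then.
         */
        foo_cache = kmem_cache_create("foo", sizeof(struct foo), 0,
                                      SLAB_TYPESAFE_BY_RCU, NULL);

        /* ... objects allocated and kmem_cache_free()'d over time ... */

        rcu_barrier();                 /* let deferred page frees complete */
        kmem_cache_shrink(foo_cache);  /* now empty slabs can be discarded */
    }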

> 
> Regards,
> 
> Tvrtko
> 
>> +        shrink_caches(dev_priv);
>> +    }
>>   }
>>   int i915_gem_suspend(struct drm_i915_private *dev_priv)
>>

