[Intel-gfx] [PATCH] drm/i915: Do not leak VMAs (and PPGTT VMs) of imported flinked objects
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Mon Apr 20 05:53:03 PDT 2015
On 04/20/2015 01:36 PM, Chris Wilson wrote:
> On Mon, Apr 20, 2015 at 01:14:34PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> If a client instantiates a VMA against an imported object (flink) this VMA
>> will not get unbound when the object is closed.
>>
>> This happens because the exporter holds a reference count on the object and
>> will also keep a reference to the PPGTT VM.
>>
>> In real life this happens with xorg-driver-intel and fbcon takeover. Latter
>> is copied from via the flink name and when Xorg process exists one VMA
>> remains dangling with a now unreachable VM.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Testcase: igt/gem_ppgtt/flink-vs-ctx-vm-leak
>> ---
>> drivers/gpu/drm/i915/i915_drv.c | 1 +
>> drivers/gpu/drm/i915/i915_drv.h | 2 ++
>> drivers/gpu/drm/i915/i915_gem.c | 63 ++++++++++++++++++++++++++++++++---------
>> 3 files changed, 53 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index f9754c3..16a0b34 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -1630,6 +1630,7 @@ static struct drm_driver driver = {
>> .debugfs_init = i915_debugfs_init,
>> .debugfs_cleanup = i915_debugfs_cleanup,
>> #endif
>> + .gem_close_object = i915_gem_close_object,
>> .gem_free_object = i915_gem_free_object,
>> .gem_vm_ops = &i915_gem_vm_ops,
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 6a2528c..e82790b 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2635,6 +2635,8 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
>> size_t size);
>> void i915_init_vm(struct drm_i915_private *dev_priv,
>> struct i915_address_space *vm);
>> +void i915_gem_close_object(struct drm_gem_object *gem_obj,
>> + struct drm_file *file);
>> void i915_gem_free_object(struct drm_gem_object *obj);
>> void i915_gem_vma_destroy(struct i915_vma *vma);
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index f7b8766..a720154 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -4523,6 +4523,55 @@ static bool discard_backing_storage(struct drm_i915_gem_object *obj)
>> return atomic_long_read(&obj->base.filp->f_count) == 1;
>> }
>>
>> +static void i915_gem_unbind_vma(struct drm_i915_private *dev_priv,
>> + struct i915_vma *vma)
>> +{
>> + if (WARN_ON(i915_vma_unbind(vma) == -ERESTARTSYS)) {
>> + bool was_interruptible;
>> +
>> + was_interruptible = dev_priv->mm.interruptible;
>> + dev_priv->mm.interruptible = false;
>> +
>> + WARN_ON(i915_vma_unbind(vma));
>> +
>> + dev_priv->mm.interruptible = was_interruptible;
>> + }
>> +}
>> +
>> +void i915_gem_close_object(struct drm_gem_object *gem_obj,
>> + struct drm_file *file)
>> +{
>> + struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
>> + struct drm_device *dev = obj->base.dev;
>> + struct drm_i915_file_private *file_priv = file->driver_priv;
>> + struct i915_vma *vma, *next;
>> + struct i915_hw_ppgtt *ppgtt;
>> +
>> + mutex_lock(&dev->struct_mutex);
>> +
>> + /*
>> + * Release all VMAs associated with this client's PPGTT.
>> + *
>> + * This is to avoid potentially unreachable VMAs since contexts can have
>> + * shorter lifetime than objects. Meaning if a client has a reference to
>> + * an object (flink) and an instantiated VMA, when it exists neither VMA
>> + * will be unbound (since object free won't run), nor the PPGTT VM
>> + * freed (since VMA holds a reference to it).
>> + */
>> + list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
>> + if (i915_is_ggtt(vma->vm))
>> + continue;
>> +
>> + ppgtt = (struct i915_hw_ppgtt *)vma->vm;
>> + if (ppgtt->file_priv != file_priv)
>> + continue;
>> +
>> + i915_gem_unbind_vma(dev->dev_private, vma);
>
> No we can't do this, as it makes close sync and so can have disasterous
> effects on performance (though mitigated chiefly by userspace
> agressively caching bo) and also the unbind is very likely to fail,
> though admittedly the fbcon copy should be before X starts (ab)using
> signals - hence nasty WARN_ON.
>
> Plus also walking this linear list is quite painful in certain abusive
> tests. My preference for fixing this bug would be via vma active
> references and auto-unbinding on retirement after a close.
Why this is any different today with GEM_CLOSE having pretty similar VMA
unbind loop? Is the typical (and interesting) case not that GEM_CLOSE
will trigger gem_object_close and gem_object_free at the same invocation?
> Expect some fun patches in your inbox Tvrtko :-p
Ok, just don't make them depend on everything. :)
Regards,
Tvrtko
More information about the Intel-gfx
mailing list