[Intel-gfx] [PATCH v3 08/17] drm/i915: Call i915_gem_evict_vm in vm_fault_gtt to prevent new ENOSPC errors

Maarten Lankhorst maarten.lankhorst at linux.intel.com
Fri Dec 17 15:29:34 UTC 2021


On 17-12-2021 12:58, Matthew Auld wrote:
> On Thu, 16 Dec 2021 at 14:28, Maarten Lankhorst
> <maarten.lankhorst at linux.intel.com> wrote:
>> Now that we cannot unbind kill the currently locked object directly
> "unbind kill"
>
>> because we're removing short term pinning, we may have to unbind the
>> object from gtt manually, using an i915_gem_evict_vm() call.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> Maybe mention that this is only in preparation for some future patches,
> once the actual eviction is trylock and evict_for_vm can also handle
> shared dma-resv? At this point in the series we shouldn't expect to
> hit -ENOSPC, right?
>
>> ---
>>  drivers/gpu/drm/i915/gem/i915_gem_mman.c | 18 ++++++++++++++++--
>>  1 file changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> index af81d6c3332a..00cd9642669a 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> @@ -358,8 +358,22 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
>>                         vma = i915_gem_object_ggtt_pin_ww(obj, &ww, &view, 0, 0, flags);
>>                 }
>>
>> -               /* The entire mappable GGTT is pinned? Unexpected! */
>> -               GEM_BUG_ON(vma == ERR_PTR(-ENOSPC));
>> +               /*
>> +                * The entire mappable GGTT is pinned? Unexpected!
>> +                * Try to evict the object we locked too, as normally we skip it
>> +                * due to lack of short term pinning inside execbuf.
>> +                */
>> +               if (vma == ERR_PTR(-ENOSPC)) {
>> +                       ret = mutex_lock_interruptible(&ggtt->vm.mutex);
>> +                       if (!ret) {
>> +                               ret = i915_gem_evict_vm(&ggtt->vm);
>> +                               mutex_unlock(&ggtt->vm.mutex);
>> +                       }
>> +                       if (ret)
>> +                               goto err_reset;
>> +                       vma = i915_gem_object_ggtt_pin_ww(obj, &ww, &view, 0, 0, flags);
>> +               }
>> +               GEM_WARN_ON(vma == ERR_PTR(-ENOSPC));
> Looks like this is being triggered in CI, I assume because the trylock
> could easily fail, due to contention? Is this expected for now? Do we
> keep the WARN and track it as a known issue?

I think that makes sense. I can probably fix i915_gem_evict_vm() to attempt to lock all objects in a blocking way.

I had some primitives that could lock an object for eviction and keep a refcount on it. i915_gem_evict_vm() could probably be changed to use them.
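
To illustrate why the WARN can still fire after the retry, here is a minimal userspace sketch of the evict-and-retry flow from the hunk above. It is not the i915 code: toy_vm, toy_obj, toy_pin(), toy_evict_vm_trylock() and toy_pin_with_evict() are hypothetical names, object locks are modelled as a plain flag, and it assumes (as suggested in the review) that eviction only trylocks objects and skips the contended ones.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define APERTURE_SLOTS 4 /* hypothetical size of the toy mappable aperture */

/* Hypothetical stand-in for an object bound into the aperture. */
struct toy_obj {
	bool bound;     /* currently occupies a slot */
	bool lock_held; /* someone else holds its lock, so a trylock would fail */
};

/* Hypothetical stand-in for the address space being filled. */
struct toy_vm {
	struct toy_obj *slots[APERTURE_SLOTS];
};

/* Bind obj into the first free slot, or return -ENOSPC if none is free. */
static int toy_pin(struct toy_vm *vm, struct toy_obj *obj)
{
	assert(vm && obj);

	for (size_t i = 0; i < APERTURE_SLOTS; i++) {
		if (!vm->slots[i]) {
			vm->slots[i] = obj;
			obj->bound = true;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Evict every object whose lock is free; skip contended ones. */
static int toy_evict_vm_trylock(struct toy_vm *vm)
{
	int evicted = 0;

	for (size_t i = 0; i < APERTURE_SLOTS; i++) {
		struct toy_obj *obj = vm->slots[i];

		if (!obj || obj->lock_held)
			continue; /* empty slot, or the "trylock" failed */

		obj->bound = false;
		vm->slots[i] = NULL;
		evicted++;
	}
	return evicted;
}

/* Pin; on -ENOSPC evict once and retry, mirroring the flow in the hunk. */
static int toy_pin_with_evict(struct toy_vm *vm, struct toy_obj *obj)
{
	int ret = toy_pin(vm, obj);

	if (ret == -ENOSPC) {
		toy_evict_vm_trylock(vm);
		ret = toy_pin(vm, obj);
	}
	return ret; /* still -ENOSPC if every object was contended */
}
```

In this model, when every bound object is locked by someone else the eviction pass frees nothing and the second pin attempt returns -ENOSPC again, which is the case the GEM_WARN_ON would catch. A blocking variant would wait for each lock instead of skipping the object, so the retry could only fail if the objects were genuinely pinned.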

>>         }
>>         if (IS_ERR(vma)) {
>>                 ret = PTR_ERR(vma);
>> --
>> 2.34.1
>>


