[Intel-gfx] [PATCH 7/7] drm/i915/gem: Acquire all vma/objects under reservation_ww_class

Thomas Hellström (Intel) thomas_os at shipmail.org
Tue Jun 23 12:57:06 UTC 2020


On 6/23/20 1:22 PM, Thomas Hellström (Intel) wrote:
> Hi, Chris,
>
> On 6/22/20 11:59 AM, Chris Wilson wrote:
>> In order to actually handle eviction and what not, we need to process
>> all the objects together under a common lock, reservation_ww_class. As
>> such, do a memory reservation pass after looking up the object/vma,
>> which then feeds into the rest of execbuf [relocation, cmdparsing,
>> flushing and ofc execution].
>>
>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> ---
>>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 91 ++++++++++++++-----
>>   1 file changed, 70 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> index 46fcbdf8161c..8db2e013465f 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> @@ -53,10 +53,9 @@ struct eb_vma_array {
>>     #define __EXEC_OBJECT_HAS_PIN        BIT(31)
>>   #define __EXEC_OBJECT_HAS_FENCE        BIT(30)
>> -#define __EXEC_OBJECT_HAS_PAGES        BIT(29)
>> -#define __EXEC_OBJECT_NEEDS_MAP        BIT(28)
>> -#define __EXEC_OBJECT_NEEDS_BIAS    BIT(27)
>> -#define __EXEC_OBJECT_INTERNAL_FLAGS    (~0u << 27) /* all of the above */
>> +#define __EXEC_OBJECT_NEEDS_MAP        BIT(29)
>> +#define __EXEC_OBJECT_NEEDS_BIAS    BIT(28)
>> +#define __EXEC_OBJECT_INTERNAL_FLAGS    (~0u << 28) /* all of the above */
>>     #define __EXEC_HAS_RELOC    BIT(31)
>>   #define __EXEC_INTERNAL_FLAGS    (~0u << 31)
>> @@ -241,6 +240,8 @@ struct i915_execbuffer {
>>       struct intel_context *context; /* logical state for the request */
>>       struct i915_gem_context *gem_context; /** caller's context */
>>   +    struct dma_fence *mm_fence;
>> +
>>       struct i915_request *request; /** our request to build */
>>       struct eb_vma *batch; /** identity of the batch obj/vma */
>>       struct i915_vma *trampoline; /** trampoline used for chaining */
>> @@ -331,12 +332,7 @@ static inline void eb_unreserve_vma(struct eb_vma *ev)
>>       if (ev->flags & __EXEC_OBJECT_HAS_PIN)
>>           __i915_vma_unpin(vma);
>>   -    if (ev->flags & __EXEC_OBJECT_HAS_PAGES)
>> -        i915_gem_object_unpin_pages(vma->obj);
>> -
>> -    ev->flags &= ~(__EXEC_OBJECT_HAS_PIN |
>> -               __EXEC_OBJECT_HAS_FENCE |
>> -               __EXEC_OBJECT_HAS_PAGES);
>> +    ev->flags &= ~(__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_FENCE);
>>   }
>>     static void eb_vma_array_destroy(struct kref *kref)
>> @@ -667,6 +663,55 @@ eb_add_vma(struct i915_execbuffer *eb,
>>       list_add_tail(&ev->lock_link, &eb->lock);
>>   }
>>   +static int eb_vma_get_pages(struct i915_execbuffer *eb,
>> +                struct eb_vma *ev,
>> +                u64 idx)
>> +{
>> +    struct i915_vma *vma = ev->vma;
>> +    int err;
>> +
>> +    /* XXX also preallocate PD for vma */
>> +
>> +    err = ____i915_gem_object_get_pages_async(vma->obj);
>> +    if (err)
>> +        return err;
>> +
>> +    return i915_active_ref(&vma->obj->mm.active, idx, eb->mm_fence);
>> +}
>> +
>> +static int eb_reserve_mm(struct i915_execbuffer *eb)
>> +{
>> +    const u64 idx = eb->context->timeline->fence_context;
>> +    struct ww_acquire_ctx acquire;
>> +    struct eb_vma *ev;
>> +    int err;
>> +
>> +    eb->mm_fence = __dma_fence_create_proxy(0, 0);
>> +    if (!eb->mm_fence)
>> +        return -ENOMEM;
>
> Question: eb is local to this thread, right, so eb->mm_fence is not 
> considered "published" yet?
>
>> +
>> +    ww_acquire_init(&acquire, &reservation_ww_class);
>> +
>> +    err = eb_lock_vma(eb, &acquire);
>> +    if (err)
>> +        goto out;
>> +
>> +    ww_acquire_done(&acquire);
>> +
>> +    list_for_each_entry(ev, &eb->lock, lock_link) {
>> +        struct i915_vma *vma = ev->vma;
>> +
>> +        if (err == 0)
>> +            err = eb_vma_get_pages(eb, ev, idx);
>
> I figure this is where you publish the proxy fence? If so, the fence
> signalling critical path starts with this loop, and that means any code
> we call between here and submission completion (including spawned work
> we need to wait for before submission) may neither lock the
> reservation_ww_class nor (still being discussed) allocate memory. It
> looks like i915_pin_vma takes a reservation_ww_class lock, and all
> memory pinning seems to be in the fence critical path as well?
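
To make that concrete: with the dma-fence lockdep annotations currently
being discussed on dri-devel, the constraint would look roughly like the
sketch below. The function and call sites are made up for illustration
only (not the actual execbuf helpers), and it assumes those annotations
land in their proposed form:

/* Illustration only; needs <linux/dma-fence.h>, <linux/dma-resv.h>, <linux/slab.h>. */
static void example_signalling_section(struct dma_resv *resv)
{
        bool cookie;
        void *ptr;

        /* Everything until dma_fence_end_signalling() is in the fence
         * signalling critical path of the published proxy fence. */
        cookie = dma_fence_begin_signalling();

        /* Lockdep priming records dma_resv_lock() -> dma_fence_wait(),
         * so taking a reservation lock here is reported as an inversion. */
        if (!dma_resv_lock(resv, NULL))
                dma_resv_unlock(resv);

        /* And with the stricter allocation annotation, a GFP_KERNEL
         * allocation is reported too, since direct reclaim may wait on
         * fences. */
        ptr = kmalloc(64, GFP_KERNEL);
        kfree(ptr);

        dma_fence_end_signalling(cookie);
}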

And I think even if we at some point end up with the allocation
annotation the other way around, allowing memory allocations in fence
signalling critical paths, both relocations and userptr would still run
into lockdep problems because of

mmap_sem->reservation_object->fence_wait (fault handlers, lockdep priming)
vs
fence_critical->gup/copy_from_user->mmap_sem
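
The second chain is, for example, the userptr get_pages or the relocation
copy_from_user path once it runs after the proxy fence is published. A
rough sketch, again with made-up names and no error handling, just to show
where mmap_sem enters the picture:

static long example_gup_in_signalling_path(unsigned long uptr,
                                           unsigned long npages,
                                           struct page **pages)
{
        bool cookie;
        long pinned;

        cookie = dma_fence_begin_signalling();

        /* gup needs mmap_sem, but a fault handler that already holds
         * mmap_sem may be sitting behind the reservation lock waiting
         * for our fences -> cycle. */
        down_read(&current->mm->mmap_sem);
        pinned = pin_user_pages(uptr, npages, FOLL_WRITE, pages, NULL);
        up_read(&current->mm->mmap_sem);

        dma_fence_end_signalling(cookie);

        return pinned; /* unpinning of any pinned pages omitted */
}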

/Thomas


>
> /Thomas
>
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


More information about the Intel-gfx mailing list