[Intel-gfx] [PATCH 13/66] drm/i915/gem: Don't drop the timeline lock during execbuf
Thomas Hellström (Intel)
thomas_os at shipmail.org
Tue Jul 28 14:46:27 UTC 2020
On 7/23/20 6:09 PM, Thomas Hellström (Intel) wrote:
>
> On 2020-07-15 13:50, Chris Wilson wrote:
>> Our timeline lock is our defence against a concurrent execbuf
>> interrupting our request construction. We need to hold it throughout or,
>> for example, a second thread may interject a relocation request in
>> between our own relocation request and execution in the ring.
>>
>> A second major benefit is that it allows us to preserve a large chunk
>> of the ringbuffer for our exclusive use; which should virtually
>> eliminate the threat of hitting a wait_for_space during request
>> construction -- although we should have already dropped other
>> contentious locks at that point.
>>
>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> ---
>> .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 413 +++++++++++-------
>> .../i915/gem/selftests/i915_gem_execbuffer.c | 24 +-
>> 2 files changed, 281 insertions(+), 156 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> index 719ba9fe3e85..af3499aafd22 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> @@ -259,6 +259,8 @@ struct i915_execbuffer {
>> bool has_fence : 1;
>> bool needs_unfenced : 1;
>> + struct intel_context *ce;
>> +
>> struct i915_vma *target;
>> struct i915_request *rq;
>> struct i915_vma *rq_vma;
>> @@ -639,6 +641,35 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
>> return 0;
>> }
>> +static void retire_requests(struct intel_timeline *tl)
>> +{
>> + struct i915_request *rq, *rn;
>> +
>> + list_for_each_entry_safe(rq, rn, &tl->requests, link)
>> + if (!i915_request_retire(rq))
>> + break;
>> +}
>> +
>> +static int wait_for_timeline(struct intel_timeline *tl)
>> +{
>> + do {
>> + struct dma_fence *fence;
>> + int err;
>> +
>> + fence = i915_active_fence_get(&tl->last_request);
>> + if (!fence)
>> + return 0;
>> +
>> + err = dma_fence_wait(fence, true);
>> + dma_fence_put(fence);
>> + if (err)
>> + return err;
>> +
>> + /* Retiring may trigger a barrier, requiring an extra pass */
>> + retire_requests(tl);
>> + } while (1);
>> +}
>> +
>> static int eb_reserve(struct i915_execbuffer *eb)
>> {
>> const unsigned int count = eb->buffer_count;
>> @@ -646,7 +677,6 @@ static int eb_reserve(struct i915_execbuffer *eb)
>> struct list_head last;
>> struct eb_vma *ev;
>> unsigned int i, pass;
>> - int err = 0;
>> /*
>> * Attempt to pin all of the buffers into the GTT.
>> @@ -662,18 +692,37 @@ static int eb_reserve(struct i915_execbuffer *eb)
>> * room for the earlier objects *unless* we need to defragment.
>> */
>> - if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
>> - return -EINTR;
>> -
>> pass = 0;
>> do {
>> + int err = 0;
>> +
>> + /*
>> + * We need to hold one lock as we bind all the vma so that
>> + * we have a consistent view of the entire vm and can plan
>> + * evictions to fill the whole GTT. If we allow a second
>> + * thread to run as we do this, it will either unbind
>> + * everything we want pinned, or steal space that we need for
>> + * ourselves. The closer we are to a full GTT, the more likely
>> + * such contention will cause us to fail to bind the workload
>> + * for this batch. Since we know at this point we need to
>> + * find space for new buffers, we know that extra pressure
>> + * from contention is likely.
>> + *
>> + * In lieu of being able to hold vm->mutex for the entire
>> + * sequence (it's complicated!), we opt for struct_mutex.
>> + */
>> + if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
>> + return -EINTR;
>> +
>
> With TTM, an idea that has been around for a long time is to let the
> reservations resolve this. I don't think that's in place yet, though,
> because eviction / unbinding still requires a trylock on the
> reservation, and because evictions are not batched but performed one
> by one, with each evicted object's reservation dropped immediately
> after eviction. Could having the reservations resolve this be
> something we aim for in the long run as well? Unrelated batches would
> then never contend.
>
> In the meantime, would it make sense to introduce a new device-wide
> mutex to avoid completely unrelated contention with the struct_mutex?
>
Actually I see this is changed later in the series...
/Thomas