[Intel-gfx] [PATCH 13/66] drm/i915/gem: Don't drop the timeline lock during execbuf
Chris Wilson
chris at chris-wilson.co.uk
Tue Jul 28 14:51:50 UTC 2020
Quoting Thomas Hellström (Intel) (2020-07-23 17:09:15)
>
> On 2020-07-15 13:50, Chris Wilson wrote:
> > Our timeline lock is our defence against a concurrent execbuf
> > interrupting our request construction. We need to hold it throughout or,
> > for example, a second thread may interject a relocation request in
> > between our own relocation request and execution in the ring.
> >
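For illustration, the construction pattern this protects looks roughly
like the sketch below, given an intel_context *ce. It is not the execbuf
code itself: emit_relocations() and emit_batch() are hypothetical
stand-ins for the work done while the lock is held, and request queueing
is omitted for brevity.

	struct intel_timeline *tl;
	struct i915_request *rq;
	int err = 0;

	/* Serialise all request construction on this context's timeline. */
	tl = intel_context_timeline_lock(ce);
	if (IS_ERR(tl))
		return PTR_ERR(tl);

	/* The timeline mutex is already held, so use the lock-held variant. */
	rq = __i915_request_create(ce, GFP_KERNEL);
	if (IS_ERR(rq)) {
		err = PTR_ERR(rq);
		goto unlock;
	}

	err = emit_relocations(rq);	/* hypothetical */
	if (err == 0)
		err = emit_batch(rq);	/* hypothetical */

	/*
	 * Until we unlock, no other thread can add to tl->requests, so
	 * nothing can interject between our relocations and our batch.
	 */
	__i915_request_commit(rq);
unlock:
	intel_context_timeline_unlock(tl);
	return err;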
> > A second, major benefit is that it allows us to preserve a large chunk
> > of the ringbuffer for our exclusive use, which should virtually
> > eliminate the threat of hitting a wait_for_space during request
> > construction -- although we should have already dropped other
> > contentious locks at that point.
> >
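Ring space is claimed with intel_ring_begin(), which can stall in
wait_for_space() when the ring is full; that is the stall this change
aims to keep out of the middle of request construction. A minimal
sketch, given a request rq (the dword count is arbitrary):

	u32 *cs;

	/* May block until earlier requests retire and free ring space. */
	cs = intel_ring_begin(rq, 2);
	if (IS_ERR(cs))
		return PTR_ERR(cs);

	*cs++ = MI_NOOP;
	*cs++ = MI_NOOP;
	intel_ring_advance(rq, cs);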
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> > .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 413 +++++++++++-------
> > .../i915/gem/selftests/i915_gem_execbuffer.c | 24 +-
> > 2 files changed, 281 insertions(+), 156 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index 719ba9fe3e85..af3499aafd22 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -259,6 +259,8 @@ struct i915_execbuffer {
> >  		bool has_fence : 1;
> >  		bool needs_unfenced : 1;
> >  
> > +		struct intel_context *ce;
> > +
> >  		struct i915_vma *target;
> >  		struct i915_request *rq;
> >  		struct i915_vma *rq_vma;
> > @@ -639,6 +641,35 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
> >  	return 0;
> >  }
> >  
> > +static void retire_requests(struct intel_timeline *tl)
> > +{
> > +	struct i915_request *rq, *rn;
> > +
> > +	list_for_each_entry_safe(rq, rn, &tl->requests, link)
> > +		if (!i915_request_retire(rq))
> > +			break;
> > +}
> > +
> > +static int wait_for_timeline(struct intel_timeline *tl)
> > +{
> > +	do {
> > +		struct dma_fence *fence;
> > +		int err;
> > +
> > +		fence = i915_active_fence_get(&tl->last_request);
> > +		if (!fence)
> > +			return 0;
> > +
> > +		err = dma_fence_wait(fence, true);
> > +		dma_fence_put(fence);
> > +		if (err)
> > +			return err;
> > +
> > +		/* Retiring may trigger a barrier, requiring an extra pass */
> > +		retire_requests(tl);
> > +	} while (1);
> > +}
> > +
> >  static int eb_reserve(struct i915_execbuffer *eb)
> >  {
> >  	const unsigned int count = eb->buffer_count;
> > @@ -646,7 +677,6 @@ static int eb_reserve(struct i915_execbuffer *eb)
> >  	struct list_head last;
> >  	struct eb_vma *ev;
> >  	unsigned int i, pass;
> > -	int err = 0;
> >  
> >  	/*
> >  	 * Attempt to pin all of the buffers into the GTT.
> > @@ -662,18 +692,37 @@ static int eb_reserve(struct i915_execbuffer *eb)
> >  	 * room for the earlier objects *unless* we need to defragment.
> >  	 */
> >  
> > -	if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
> > -		return -EINTR;
> > -
> >  	pass = 0;
> >  	do {
> > +		int err = 0;
> > +
> > +		/*
> > +		 * We need to hold one lock as we bind all the vma so that
> > +		 * we have a consistent view of the entire vm and can plan
> > +		 * evictions to fill the whole GTT. If we allow a second
> > +		 * thread to run as we do this, it will either unbind
> > +		 * everything we want pinned, or steal space that we need for
> > +		 * ourselves. The closer we are to a full GTT, the more likely
> > +		 * such contention will cause us to fail to bind the workload
> > +		 * for this batch. Since we know at this point we need to
> > +		 * find space for new buffers, we know that extra pressure
> > +		 * from contention is likely.
> > +		 *
> > +		 * In lieu of being able to hold vm->mutex for the entire
> > +		 * sequence (it's complicated!), we opt for struct_mutex.
> > +		 */
> > +		if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
> > +			return -EINTR;
> > +
>
> With TTM, an idea that has been around for a long time is to let the
> reservations resolve this. I don't think that's in place yet, though,
> because eviction / unbinding still requires a trylock of the
> reservation, and because evictions are not batched but performed
> one by one, with each evicted object's reservation dropped immediately
> after eviction. Having reservations resolve this could perhaps be
> something we could aim for in the long run as well? Unrelated batches
> would then never contend.
>
> In the meantime would it make sense to introduce a new device-wide mutex
> to avoid completely unrelated contention with the struct_mutex?
No.
The vma are not related to reservations.
-Chris
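For context, the reservation-based serialization Thomas describes above
is the dma_resv/ww_mutex pattern used by TTM, in which every object in a
batch is locked under one acquire context so that unrelated batches only
contend on the objects they actually share. A rough sketch of that
pattern, not i915 code; objects[], nobj and the resv field are
illustrative:

	struct ww_acquire_ctx ctx;
	struct dma_resv *contended = NULL;
	int n, err = 0;

	ww_acquire_init(&ctx, &reservation_ww_class);
retry:
	for (n = 0; n < nobj; n++) {
		struct dma_resv *resv = objects[n]->resv;

		if (resv == contended) {
			/* Already locked by the slowpath below. */
			contended = NULL;
			continue;
		}

		err = dma_resv_lock_interruptible(resv, &ctx);
		if (err) {
			/* Drop everything acquired so far. */
			while (n--)
				dma_resv_unlock(objects[n]->resv);
			if (contended)
				dma_resv_unlock(contended);
			contended = NULL;

			if (err == -EDEADLK) {
				/* Sleep on the contended lock, then restart. */
				err = dma_resv_lock_slow_interruptible(resv, &ctx);
				if (!err) {
					contended = resv;
					goto retry;
				}
			}
			goto err_fini;
		}
	}
	ww_acquire_done(&ctx);

	/* ... plan evictions / bind vma / submit under the locks ... */

	for (n = 0; n < nobj; n++)
		dma_resv_unlock(objects[n]->resv);
err_fini:
	ww_acquire_fini(&ctx);
	return err;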