[Intel-gfx] [PATCH 2/3] drm/i915: Fix some invalid requests cancellations
Daniel Vetter
daniel at ffwll.ch
Mon Feb 15 09:47:26 UTC 2016
On Thu, Feb 11, 2016 at 01:01:34PM +0000, Tvrtko Ursulin wrote:
> On 29/01/16 16:49, Chris Wilson wrote:
> >As we add the VMA to the request early, it may be cancelled during
> >execbuf reservation. This will leave the context object pointing to a
>
> I don't get it, request is created after the reservation.
That's the problem - we add vma for a given request (well ring/seqno pair)
before that request exists.
-Daniel
>
> >dangling request; i915_wait_request() simply skips the wait and so we
> >may unbind the object whilst it is still active.
> >
> >However, if at any point we make a change to the hardware (and equally
> >importantly our bookkeeping in the driver), we cannot cancel the request
> >as what has already been written must be submitted. Submitting a partial
> >request is far easier than trying to unwind the incomplete change.
> >
> >Unfortunately this patch undoes the excess breadcrumb usage that olr
> >prevented, e.g. if we interrupt batchbuffer submission then we submit
> >the requests along with the memory writes and interrupt (even though we
> >do no real work). Disassociating requests from breadcrumbs (and
> >semaphores) is a topic for a past/future series, but now much more
> >important.
> >
> >v2: Rebase
> >
> >Note that igt/gem_concurrent_blit tiggers both misrendering and a GPF
> >that is fixed by this patch.
> >
> >Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
> >Testcase: igt/gem_concurrent_blit
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> >Cc: stable at vger.kernel.org
> >---
> > drivers/gpu/drm/i915/i915_drv.h | 1 -
> > drivers/gpu/drm/i915/i915_gem.c | 7 ++-----
> > drivers/gpu/drm/i915/i915_gem_context.c | 21 +++++++++------------
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 16 +++++-----------
> > drivers/gpu/drm/i915/intel_display.c | 2 +-
> > drivers/gpu/drm/i915/intel_lrc.c | 1 -
> > 6 files changed, 17 insertions(+), 31 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >index a2d2f08b7515..f828a7ffed37 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -2823,7 +2823,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file_priv);
> > void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
> > struct drm_i915_gem_request *req);
> >-void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
> > int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
> > struct drm_i915_gem_execbuffer2 *args,
> > struct list_head *vmas);
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index e9b19bca1383..f764f33580fc 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -3407,12 +3407,9 @@ int i915_gpu_idle(struct drm_device *dev)
> > return PTR_ERR(req);
> >
> > ret = i915_switch_context(req);
> >- if (ret) {
> >- i915_gem_request_cancel(req);
> >- return ret;
> >- }
> >-
> > i915_add_request_no_flush(req);
> >+ if (ret)
> >+ return ret;
> > }
> >
> > ret = intel_ring_idle(ring);
> >diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> >index 83a097c94911..5da7adc3f7b2 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_context.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_context.c
> >@@ -652,7 +652,6 @@ static int do_switch(struct drm_i915_gem_request *req)
> > struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > struct intel_context *from = ring->last_context;
> > u32 hw_flags = 0;
> >- bool uninitialized = false;
> > int ret, i;
> >
> > if (from != NULL && ring == &dev_priv->ring[RCS]) {
> >@@ -759,6 +758,15 @@ static int do_switch(struct drm_i915_gem_request *req)
> > to->remap_slice &= ~(1<<i);
> > }
> >
> >+ if (!to->legacy_hw_ctx.initialized) {
> >+ if (ring->init_context) {
> >+ ret = ring->init_context(req);
> >+ if (ret)
> >+ goto unpin_out;
> >+ }
> >+ to->legacy_hw_ctx.initialized = true;
> >+ }
> >+
> > /* The backing object for the context is done after switching to the
> > * *next* context. Therefore we cannot retire the previous context until
> > * the next context has already started running. In fact, the below code
> >@@ -782,21 +790,10 @@ static int do_switch(struct drm_i915_gem_request *req)
> > i915_gem_context_unreference(from);
> > }
> >
> >- uninitialized = !to->legacy_hw_ctx.initialized;
> >- to->legacy_hw_ctx.initialized = true;
> >-
> > done:
> > i915_gem_context_reference(to);
> > ring->last_context = to;
> >
> >- if (uninitialized) {
> >- if (ring->init_context) {
> >- ret = ring->init_context(req);
> >- if (ret)
> >- DRM_ERROR("ring init context: %d\n", ret);
> >- }
> >- }
> >-
> > return 0;
> >
> > unpin_out:
> >diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >index 8fd00d279447..61bb15507b30 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >@@ -1133,7 +1133,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
> > }
> > }
> >
> >-void
> >+static void
> > i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
> > {
> > /* Unconditionally force add_request to emit a full flush. */
> >@@ -1318,7 +1318,6 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
> > trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
> >
> > i915_gem_execbuffer_move_to_active(vmas, params->request);
> >- i915_gem_execbuffer_retire_commands(params);
>
> Hm this whole block block from trace_... to
> i915_gem_execbuffer_retire_commands could be moved to do_execbuffer as and
> independent cleanup patch if under "if (ret == 0)".
>
> So the concern is that something has been written to the ringbuffer but not
> all of it?
>
> Why do we have to submit that, I am assuming whatever was written is not
> interesting to be executed unless the whole bb went in.
>
> So could we just rewind the pointers? Store them at the beginning and rewind
> if something failed.
>
> >
> > return 0;
> > }
> >@@ -1616,8 +1615,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > }
> >
> > ret = i915_gem_request_add_to_client(req, file);
> >- if (ret)
> >+ if (ret) {
> >+ i915_gem_request_cancel(req);
> > goto err_batch_unpin;
> >+ }
>
> Doesn't look like this can fail so a side cleanup only?
>
> Regards,
>
> Tvrtko
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the Intel-gfx
mailing list