[Intel-gfx] [PATCH v2 4/5] drm/i915/guc: rework guc_add_workqueue_item()

Chris Wilson chris at chris-wilson.co.uk
Fri Apr 29 16:10:19 UTC 2016


On Fri, Apr 29, 2016 at 04:44:24PM +0100, Tvrtko Ursulin wrote:
> 
> On 27/04/16 19:03, Dave Gordon wrote:
> >Mostly little optimisations; for instance, if the driver is correctly
> >following the submission protocol, the "out of space" condition is
> >impossible, so the previous runtime WARN_ON() is promoted to a
> >GEM_BUG_ON() for a more dramatic effect in development and less impact
> >in end-user systems.
> >
> >Similarly we can replace other WARN_ON() conditions that don't relate to
> >the hardware state with either BUILD_BUG_ON() for compile-time-
> >detectable issues, or GEM_BUG_ON() for logical "can't happen" errors.
> >
> >With those changes, we can convert it to void, as suggested by Chris
> >Wilson, and update the calling code appropriately.
> >
> >Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
> >Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >
> >---
> >  drivers/gpu/drm/i915/i915_guc_submission.c | 69 +++++++++++++++---------------
> >  drivers/gpu/drm/i915/intel_guc.h           |  2 +-
> >  drivers/gpu/drm/i915/intel_guc_fwif.h      |  3 +-
> >  3 files changed, 37 insertions(+), 37 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >index 6626eff..4d2ea84 100644
> >--- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >@@ -470,23 +470,28 @@ int i915_guc_wq_check_space(struct drm_i915_gem_request *request)
> >  	return -EAGAIN;
> >  }
> >
> >-static int guc_add_workqueue_item(struct i915_guc_client *gc,
> >-				  struct drm_i915_gem_request *rq)
> >+static void guc_add_workqueue_item(struct i915_guc_client *gc,
> >+				   struct drm_i915_gem_request *rq)
> >  {
> >+	/* wqi_len is in DWords, and does not include the one-word header */
> >+	const size_t wqi_size = sizeof(struct guc_wq_item);
> 
> Again, u32 is correct I think.
> 
> >+	const u32 wqi_len = wqi_size/sizeof(u32) - 1;
> >  	struct guc_process_desc *desc;
> >  	struct guc_wq_item *wqi;
> >  	void *base;
> >-	u32 tail, wq_len, wq_off, space;
> >+	u32 space, tail, wq_off, wq_page;
> >
> >  	desc = gc->client_base + gc->proc_desc_offset;
> >+
> >+	/* Space was checked earlier, in i915_guc_wq_check_space() above */
> 
> It may be above in the file, but the two do not call one another so
> I recommend saying exactly who called it.
> 
> >  	space = CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size);
> >-	if (WARN_ON(space < sizeof(struct guc_wq_item)))
> >-		return -ENOSPC; /* shouldn't happen */
> >+	GEM_BUG_ON(space < wqi_size);
> 
> It is impossible to hit this only because of the struct_mutex
> guarding the whole time window from request creation to submission.
> If in the future, near or far, that gets fixed, then this will need
> reworking.

Request submission will still have to serialised by a "ring" mutex, 
from the time we allocate the request to the time we add it to whatever
submission queue. It should still hold that we can pin all the required
resources (ringbuffer, context state, vm page tables, workqueues) up
front and take any errors early and then rely on our preallocation when
submitting the request.

> I don't have any better ideas though.
> 
> But a WARN_ON and return would be almost as good. Everything is
> better than a dead machine one can't ssh into...
> 
> So I appeal to make this a WARN_ON and return. Nothing bad would
> happen apart from software thinking GPU has hung.

Hence why not make it a bug? If you can't ssh in because the driver died
inside GEM, something is very wrong.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list