[Intel-gfx] [PATCH v2 38/38] drm/i915: Add initial selftests for hang detection and resets

Chris Wilson chris at chris-wilson.co.uk
Wed Feb 1 13:31:43 UTC 2017


On Wed, Feb 01, 2017 at 01:43:41PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> > +static u64 hws_address(const struct i915_vma *hws,
> > +		       const struct drm_i915_gem_request *rq)
> > +{
> > +	return hws->node.start + offset_in_page(sizeof(u32)*rq->fence.context);
> 
> fence.context is something unique returned by dma_fence_context_alloc()
> and we assume we don't collide in the scope of this test?

Correct, fence.context is unique to a timeline/engine.
(dma_fence_context_alloc() doesn't prevent collisions, but that is a
topic for another day.)

> > +static struct drm_i915_gem_request *
> > +hang_create_request(struct hang *h,
> > +		    struct intel_engine_cs *engine,
> > +		    struct i915_gem_context *ctx)
> > +{
> > +	struct drm_i915_gem_request *rq;
> > +	int err;
> > +
> > +	if (i915_gem_object_is_active(h->obj)) {
> > +		struct drm_i915_gem_object *obj;
> > +		void *vaddr;
> > +
> > +		obj = i915_gem_object_create_internal(h->i915, PAGE_SIZE);
> > +		if (IS_ERR(obj))
> > +			return ERR_CAST(obj);
> > +
> > +		vaddr = i915_gem_object_pin_map(obj,
> > +						HAS_LLC(h->i915) ? I915_MAP_WB : I915_MAP_WC);
> > +		if (IS_ERR(vaddr)) {
> > +			i915_gem_object_put(obj);
> > +			return ERR_CAST(vaddr);
> > +		}
> > +
> > +		i915_gem_object_unpin_map(h->obj);
> > +		__i915_gem_object_release_unless_active(h->obj);
> > +
> > +		h->obj = obj;
> > +		h->batch = vaddr;
> 
> This whole block confuses me. Is it about the reset queue test
> if something went wrong with the previous request?

I was trying to write a generic struct hang to handle tests I haven't
thought of yet. In this case, I want to create a new request whilst the
old one is inflight, and so need to replace the pointers with a fresh
bo. It can be more fancy, we could just reuse the same bo until it is
full, but I was trying to get something up and running.

> > +static int igt_wait_reset(void *arg)
> > +{
> > +	struct drm_i915_private *i915 = arg;
> > +	struct drm_i915_gem_request *rq;
> > +	unsigned int reset_count;
> > +	struct hang h;
> > +	long timeout;
> > +	int err;
> > +
> > +	/* Check that we detect a stuck waiter and issue a reset */
> > +
> > +	if (!intel_has_gpu_reset(i915))
> > +		return 0;
> > +
> > +	set_bit(I915_RESET_IN_PROGRESS, &i915->gpu_error.flags);
> 
> Noticed that you do this early. In this test the fake_hangcheck
> would do it for you also but I suspect you want this to gain
> exclusive access after this point?

Yes, I was mostly trying to apply a common pattern, claiming exclusivity
was a bonus. The difference is more apparent later.

What is missing to complete this set of tests are per-engine resets.
Hooking those in will be a fun exercise in improving both.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list