[Intel-gfx] [PATCH v2 1/2] drm/i915: signal first fence from irq handler if complete

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Feb 24 10:47:29 UTC 2017


On 24/02/2017 10:19, Chris Wilson wrote:
> On Fri, Feb 24, 2017 at 10:04:23AM +0000, Tvrtko Ursulin wrote:
>>
>> On 23/02/2017 15:42, Chris Wilson wrote:
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
>>> index 5f73d8c0a38a..0efee879df23 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_request.h
>>> +++ b/drivers/gpu/drm/i915/i915_gem_request.h
>>> @@ -32,10 +32,12 @@
>>>
>>> struct drm_file;
>>> struct drm_i915_gem_object;
>>> +struct drm_i915_gem_request;
>>>
>>> struct intel_wait {
>>> 	struct rb_node node;
>>> 	struct task_struct *tsk;
>>> +	struct drm_i915_gem_request *request;
>>> 	u32 seqno;
>>
>> Hm, do we now have a situation where both waiters and signallers
>> hold a reference to the request so could drop the seqno field from
>> here?
>
> Keeping the seqno field helps with the low-level testcase that works
> without requests. Not scientific, but it was one less pointer to chase
> for the irq handler.

Ok.

>>> };
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>>> index bc70e2c451b2..3c79753edf0e 100644
>>> --- a/drivers/gpu/drm/i915/i915_irq.c
>>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>>> @@ -1033,12 +1033,41 @@ static void ironlake_rps_change_irq_handler(struct drm_i915_private *dev_priv)
>>>
>>> static void notify_ring(struct intel_engine_cs *engine)
>>> {
>>> -	bool waiters;
>>> +	struct drm_i915_gem_request *rq = NULL;
>>> +	struct intel_wait *wait;
>>> +
>>> +	if (!intel_engine_has_waiter(engine))
>>> +		return;
>>
>> I think we need the tracepoint before the bail out.
>>
>>>
>>> 	atomic_inc(&engine->irq_count);
>>
>> And the increment as well.
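
To spell it out, I mean roughly this ordering at the top of the function
(just a sketch on top of this patch, untested):

	static void notify_ring(struct intel_engine_cs *engine)
	{
		struct drm_i915_gem_request *rq = NULL;
		struct intel_wait *wait;

		/* Count every user interrupt, even when no one is waiting. */
		atomic_inc(&engine->irq_count);

		if (!intel_engine_has_waiter(engine))
			return;
		...
	}

The tracepoint wants the same treatment; more on that below.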
>>
>>> 	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
>>> -	waiters = intel_engine_wakeup(engine);
>>> -	trace_intel_engine_notify(engine, waiters);
>>> +
>>> +	spin_lock(&engine->breadcrumbs.lock);
>>> +	wait = engine->breadcrumbs.first_wait;
>>> +	if (wait) {
>>> +		/* We use a callback from the dma-fence to submit
>>> +		 * requests after waiting on our own requests. To
>>> +		 * ensure minimum delay in queuing the next request to
>>> +		 * hardware, signal the fence now rather than wait for
>>> +		 * the signaler to be woken up. We still wake up the
>>> +		 * waiter in order to handle the irq-seqno coherency
>>> +		 * issues (we may receive the interrupt before the
>>> +		 * seqno is written, see __i915_request_irq_complete())
>>> +		 * and to handle coalescing of multiple seqno updates
>>> +		 * and many waiters.
>>> +		 */
>>> +		if (i915_seqno_passed(intel_engine_get_seqno(engine),
>>> +				      wait->seqno))
>>> +			rq = wait->request;
>>> +
>>> +		wake_up_process(wait->tsk);
>>> +	}
>>> +	spin_unlock(&engine->breadcrumbs.lock);
>>> +
>>> +	if (rq) /* rq is protected by RCU, so safe from hard-irq context */
>>> +		dma_fence_signal(&rq->fence);
>>> +
>>> +	trace_intel_engine_notify(engine, wait);
>
> Hmm, I put the trace here for access to wait. I guess we only need a
> READ_ONCE(engine->breadcrumbs.first_wait) for tracking?

So far I don't think I have even used that field from the tracepoint, but 
I think we just agreed it was a useful thing to have.

The tracepoint itself is useful for deducing individual request boundaries 
when seqno updates are coalesced, which is why I don't think making it 
conditional on waiters is correct. But I do agree that might be 
questionable, since it won't be reliable until the GuC scheduling backend 
lands and GuC becomes the default, and at that point it will make no 
difference where the exact tracepoint site is.
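
For illustration, the unconditional placement could then look roughly
like this (sketch only, assuming the tracepoint merely records the
pointer and never dereferences it, so the racy sample is harmless):

	atomic_inc(&engine->irq_count);
	trace_intel_engine_notify(engine,
				  READ_ONCE(engine->breadcrumbs.first_wait));

	if (!intel_engine_has_waiter(engine))
		return;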

> Note in the next patch, it's different again. And yeah I should massage
> it such that it doesn't introduce a fluctuation.

Yes, that would be better. :)

>>> }
>>>
>>> static void vlv_c0_read(struct drm_i915_private *dev_priv,
>>> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>>> index 027c93e34c97..5f2665aa817c 100644
>>> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
>>> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>>> @@ -81,8 +81,11 @@ static void intel_breadcrumbs_fake_irq(unsigned long data)
>>> 	 * every jiffie in order to kick the oldest waiter to do the
>>> 	 * coherent seqno check.
>>> 	 */
>>> -	if (intel_engine_wakeup(engine))
>>> -		mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
>>> +	if (!intel_engine_has_waiter(engine))
>>> +		return;
>>> +
>>> +	intel_engine_wakeup(engine);
>>> +	mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
>>
>> Not sure it is worth splitting this in two instead of just leaving
>> it as it was.
>
> The problem with the previous code was that the timer stopped if the waiter
> was running at that moment. I am still playing around to find something that
> doesn't look horrible, short of just going back to the bool-returning
> intel_engine_wakeup(), which is now far too large to be an inline for such
> an unimportant function.

[thread merge]

> Ah, found the complication. Here we want intel_engine_wakeup() to
> report whether there was a waiter, but in intel_breadcrumbs_hangcheck we
> want to know whether we failed to wake up the waiter.
>
> Just grin and bear it for the moment. :|

I'd missed that detail. Makes sense.
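
For the record, one way to eventually disambiguate the two callers could
be something along these lines (hypothetical sketch, names invented, not
proposing it for this series):

	/* Let callers tell "no waiter" apart from "waiter already awake". */
	enum wakeup_status {
		WAKEUP_NONE,	/* no first_wait at all */
		WAKEUP_WOKEN,	/* waiter was asleep and we woke it */
		WAKEUP_RUNNING,	/* waiter exists but was already running */
	};

	static enum wakeup_status
	intel_engine_wakeup(struct intel_engine_cs *engine)
	{
		enum wakeup_status ret = WAKEUP_NONE;
		struct intel_wait *wait;

		spin_lock_irq(&engine->breadcrumbs.lock);
		wait = engine->breadcrumbs.first_wait;
		if (wait)
			/* wake_up_process() returns 1 only if it had to
			 * transition the task out of sleep.
			 */
			ret = wake_up_process(wait->tsk) ?
			      WAKEUP_WOKEN : WAKEUP_RUNNING;
		spin_unlock_irq(&engine->breadcrumbs.lock);

		return ret;
	}

The fake irq timer would then re-arm on ret != WAKEUP_NONE, while
hangcheck would only care about WAKEUP_RUNNING. But as you say, grin and
bear it for now.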

Regards,

Tvrtko

