[Intel-gfx] [PATCH v5] drm/i915/scheduler: emulate a scheduler for guc

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Feb 1 13:15:49 UTC 2017


On 01/02/2017 12:46, Chris Wilson wrote:
> On Wed, Feb 01, 2017 at 12:09:40PM +0000, Tvrtko Ursulin wrote:
>>
>> On 01/02/2017 10:24, Chris Wilson wrote:
>>> This emulates execlists on top of the GuC in order to defer submission of
>>> requests to the hardware. This deferral allows time for high priority
>>> requests to gazump their way to the head of the queue, however it nerfs
>>> the GuC by converting it back into a simple execlist (where the CPU has
>>> to wake up after every request to feed new commands into the GuC).
>>
>> Not every request since it does have two virtual ports, or even a
>> bit better with request merging. Just saying so it sounds less
>> dramatic for people reading commit messages.
>
> We still wakeup after the first virtual port is drained, with the hope
> that the second is full to prevent the stall. What you meant is that we
> coalesce requests from the same context into a port, so not every
> request need result in an extra interrupt. Almost every request.

I was just after a milder description so it doesn't sound like we never 
submit more than one request to the GuC.

>>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>>> index 0ff75f2282fa..e0cec254faa5 100644
>>> --- a/drivers/gpu/drm/i915/i915_irq.c
>>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>>> @@ -1350,8 +1350,10 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
>>> static __always_inline void
>>> gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>>> {
>>> -	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift))
>>> +	if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
>>> +		tasklet_schedule(&engine->irq_tasklet);
>>> 		notify_ring(engine);
>>> +	}
>>
>> Would this be better:
>>
>> if (iir & (CTX | USR)) {
>> 	set_bit();
>
> No because ^.
>
>> 	tasklet_hi_schedule();
>> 	
>> 	if (iir & USR) {
>> 		notify_ring();
>> 	}
>> }
>>
>> ?
>
> If we set the context-switch bit too early (and process the
> user-interrupt and the context-switch as two separate passes through the
> tasklet), we encounter the error from before that the CSB register may
> be undefined. Possibly not since the user interrupt should mean the ring
> is powered up (and so the register restored from the power context, one
> hopes!), but there is no reason for us to read back the registers until
> we see the interrupt telling us they have been updated.
>
> Note sure the best way to write that for simplicity, so I kept the patch
> clean.
>
>         if (!(iir & ((GT_RENDER_USER_INTERRUPT | GT_CONTEXT_SWITCH_INTERRUPT) << test_shift)))
>                 return;
>
>         if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift))
>                 set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>
>         tasklet_hi_schedule(&engine->irq_tasklet);
>
>         if (iir & (GT_RENDER_USER_INTERRUPT << test_shift))
>                 notify_ring(engine);
>
> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-85 (-85)
> function                                     old     new   delta
> gen8_gt_irq_handler                          756     671     -85
> Total: Before=1093982, After=1093897, chg -0.01%
>
> Ok, gcc likes that a lot more.

What's this add/remove grow/shrink up/down business? Something from some 
standard tool?

Back to the topic - should we be concerned here that in execlist mode we 
might wake up the tasklet prematurely? If it manages to run and exit in 
the window between the user interrupt and the context interrupt it would 
just waste cycles. I can easily see ~150us between the two here. What do 
you think?

Regards,

Tvrtko


More information about the Intel-gfx mailing list