[Intel-gfx] [PATCH v4] drm/i915: Optimistically spin for the request completion

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Mar 20 09:31:29 PDT 2015


On 03/20/2015 04:19 PM, Chris Wilson wrote:
> On Fri, Mar 20, 2015 at 04:01:52PM +0000, Tvrtko Ursulin wrote:
>>
>> On 03/20/2015 02:36 PM, Chris Wilson wrote:
>>> This provides a nice boost to mesa in swap bound scenarios (as mesa
>>> throttles itself to the previous frame and given the scenario that will
>>> complete shortly). It will also provide a good boost to systems running
>>> with semaphores disabled and so frequently waiting on the GPU as it
>>> switches rings. In the most favourable of microbenchmarks, this can
>>> increase performance by around 15% - though in practice improvements
>>> will be marginal and rarely noticeable.
>>>
>>> v2: Account for user timeouts
>>> v3: Limit the spinning to a single jiffie (~1us) at most. On an
>>> otherwise idle system, there is no scheduler contention and so without a
>>> limit we would spin until the GPU is ready.
>>> v4: Drop forcewake - the lazy coherent access doesn't require it, and we
>>> have no reason to believe that the forcewake itself improves seqno
>>> coherency - it only adds delay.
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>>> Cc: Eero Tamminen <eero.t.tamminen at intel.com>
>>> Cc: "Rantala, Valtteri" <valtteri.rantala at intel.com>
>>
>> Still against a toggle switch like a simple module parameter?
>
> Yes. I'd much rather tackle the corner cases than ignore them.

I'll say it once more then leave it be - my point of view is that a 
module param does not mean ignoring any issues. It rather means that, if 
a pathological use case is found in the field, you can provide a better 
user experience straight away and work in parallel on coming up with 
improvements.

Your view is probably that if there is a toggle, someone somewhere will 
put on some wiki "yeah if that happens just put this in modprobe.conf" 
and there won't even be a bug report.

It is a downside, yes, but to me it is much better than finding a bad 
corner case and then saying "yeah, please recompile your kernel", or 
"downgrade your kernel and wait for X weeks/months until the fix 
propagates".

>>> ---
>>>   drivers/gpu/drm/i915/i915_gem.c | 44 +++++++++++++++++++++++++++++++++++------
>>>   1 file changed, 38 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index 2e17a254aac1..9988e65c1440 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -1181,6 +1181,29 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
>>>   	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
>>>   }
>>>
>>> +static int __i915_spin_request(struct drm_i915_gem_request *rq)
>>> +{
>>> +	unsigned long timeout;
>>> +
>>> +	if (i915_gem_request_get_ring(rq)->irq_refcount)
>>> +		return -EBUSY;
>>
>> So if someone else is already waiting on this ring, skip the spin and
>> do the sleep-wait.
>>
>> That would mean the earlier waiter didn't manage to spin to
>> completion, so does it make sense for subsequent ones to try to spin?
>> If we assume waiters arrive here in submission order then no, they
>> should proceed to sleep-wait. But if waiters arrive here in random
>> order, and that is purely up to userspace I think, then I am not
>> sure.
>
> They arrive pretty much in random order.
>
>> On the other hand if we allowed this "out-of-order waiters" that
>> would potentially be too much spinning so maybe it is better like it
>> is.
>
> Also part of my premise is that it is the cost of the irq (setting it up and
> handling all the intermediate ones) that is the crux of the issue. Once
> we have enabled the irq for one, we may as well then use it for the
> herd. Also with a herd we will want to err on the side of sleeping more,
> or so my intuition says.

My gut feeling is the same. Waking a thundering herd sounds safer than a 
spinning one.
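
For illustration, the pattern under discussion can be sketched in plain 
user-space C as below. This is only a sketch of the idea, not the i915 
code: a waiter first busy-waits on a completion flag for a short, 
bounded interval, and only falls back to the (more expensive) blocking 
wait if the spin times out. The helper names and the budget value are 
made up for the example.

	#include <stdatomic.h>
	#include <stdbool.h>
	#include <time.h>

	/* Arbitrary spin budget for the sketch; the patch bounds it to one jiffy. */
	#define SPIN_BUDGET_NS 1000000ULL

	static unsigned long long now_ns(void)
	{
		struct timespec ts;

		clock_gettime(CLOCK_MONOTONIC, &ts);
		return (unsigned long long)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
	}

	/* Returns true if the request completed while spinning. */
	static bool try_spin_for_completion(atomic_bool *completed)
	{
		unsigned long long deadline = now_ns() + SPIN_BUDGET_NS;

		while (now_ns() < deadline) {
			if (atomic_load_explicit(completed, memory_order_acquire))
				return true;
			/* a cpu pause/relax hint would go here in real code */
		}

		/* Spin budget exhausted; the caller falls back to sleep-wait. */
		return false;
	}

In the quoted patch the spin is additionally skipped up front when the 
ring already has an interrupt-based waiter (irq_refcount != 0): once the 
irq has been set up for one waiter it may as well serve the whole herd, 
which is the trade-off discussed above.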

Regards,

Tvrtko
