[Intel-gfx] [PATCH] drm/i915: Use exponential backoff for wait_for()

Tue Nov 21 16:49:37 UTC 2017


On 11/21/2017 10:03 PM, Chris Wilson wrote:
> Quoting Sagar Arun Kamble (2017-11-21 16:29:57)
>>
>> On 11/21/2017 8:54 PM, Chris Wilson wrote:
>>> Instead of sleeping for a fixed 1ms (roughly, depending on timer slack),
>>> start with a small sleep and exponentially increase the sleep on each
>>> cycle.
>>>
>>> A good example of a beneficiary is the guc mmio communication channel.
>>> Typically we expect (and so spin) for 10us for a quick response, but this
>>> doesn't cover everything and so sometimes we fallback to the millisecond+
>>> sleep. This incurs a significant delay in time-critical operations like
>>> preemption (igt/gem_exec_latency), which can be improved significantly by
>>> using a small sleep after the spin fails.
>>>
>>> We've made this suggestion many times, but had little experimental data
>>> to support adding the complexity.
>>>
>>> References: 1758b90e38f5 ("drm/i915: Use a hybrid scheme for fast register waits")
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: John Harrison <John.C.Harrison at intel.com>
>>> Cc: Michał Winiarski <michal.winiarski at intel.com>
>>> Cc: Ville Syrjala <ville.syrjala at linux.intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/intel_drv.h | 5 ++++-
>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
>>> index 69aab324aaa1..c1ea9a009eb4 100644
>>> --- a/drivers/gpu/drm/i915/intel_drv.h
>>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>>> @@ -50,6 +50,7 @@
>>>     */
>>>    #define _wait_for(COND, US, W) ({ \
>>>        unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1;   \
>>> +     long wait__ = 1;                                                \
>>>        int ret__;                                                      \
>>>        might_sleep();                                                  \
>>>        for (;;) {                                                      \
>>> @@ -62,7 +63,9 @@
>>>                        ret__ = -ETIMEDOUT;                             \
>>>                        break;                                          \
>>>                }                                                       \
>>> -             usleep_range((W), (W) * 2);                             \
>>> +             usleep_range(wait__, wait__ * 2);                       \
>>> +             if (wait__ < (W))                                       \
>> This should be "wait__ < (US)" ?
> US is the total timeout, W is the sleep, now used to provide a cap
> (minimum frequency of polling).
Oh. right. I thought we want to exponentially run down the total 
timeout. This might not be correct approach.
Hybrid polling might be useful here but we need to precisely know for 
each use of wait_for the expected wait time.
Then we could sleep half the total time and then poll :)
Users needing faster completion should use wait_for_atomic with higher 
timeout in uS?
> -Chris