[Intel-gfx] [PATCH v3] drm/i915: Use exponential backoff for wait_for()

Sagar Arun Kamble sagar.a.kamble at intel.com
Wed Nov 22 10:03:58 UTC 2017



On 11/22/2017 3:06 PM, Chris Wilson wrote:
> Quoting Sagar Arun Kamble (2017-11-22 07:41:02)
>>
>> On 11/22/2017 2:29 AM, Chris Wilson wrote:
>>> Instead of sleeping for a fixed 1ms (roughly, depending on timer slack),
>>> start with a small sleep and exponentially increase the sleep on each
>>> cycle.
>>>
>>> A good example of a beneficiary is the guc mmio communication channel.
>> As Tvrtko said, for the current GuC communication (guc_send_mmio) we
>> will need to update the fast timeout of __intel_wait_for_register to
>> 20us. The improvement this patch proposes through wait_for will
>> certainly be seen once we switch over to GuC CT. Maybe specifying
>> "GuC CT channel" here would be apt.
> guc mmio comm falls off the fastpath hitting the sleeping wait_for, and
> *is* improved by this patch.
Thanks for the clarification. I overlooked the sleeping wait of 10ms in
__intel_wait_for_register.
>   As far as the latency experienced by
> gem_exec_latency, there is no difference between a 10us sleep and
> spinning for 20us. Changing the spin length to 20us is something that
> you should talk to the guc team about; at that point we really need an
> asynchronous communication channel (CT is still being used
> synchronously).
>
> Changing the intel_guc_send_mmio fast timeout is a different
> conversation, that is of more dubious merit because of this patch (i.e.
> if we can achieve the same latency with a sleep should we spin at all?).
Agree.
>>> Typically we expect (and so spin) for 10us for a quick response, but this
>>> doesn't cover everything and so sometimes we fallback to the millisecond+
>>> sleep. This incurs a significant delay in time-critical operations like
>>> preemption (igt/gem_exec_latency), which can be improved significantly by
>>> using a small sleep after the spin fails.
>>>
>>> We've made this suggestion many times, but had little experimental data
>>> to support adding the complexity.
>>>
>>> v2: Bump the minimum usleep to 10us on the advice of
>>> Documentation/timers/timers-howto.txt (Tvrtko)
>>> v3: Specify min, max range for usleep intervals -- some code may
>>> crucially depend upon the sleep pattern and so want to specify it.
>>>
>>> References: 1758b90e38f5 ("drm/i915: Use a hybrid scheme for fast register waits")
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: John Harrison <John.C.Harrison at intel.com>
>>> Cc: MichaƂ Winiarski <michal.winiarski at intel.com>
>>> Cc: Ville Syrjala <ville.syrjala at linux.intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/intel_drv.h | 11 +++++++----
>>>    drivers/gpu/drm/i915/intel_pm.c  |  2 +-
>>>    2 files changed, 8 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
>>> index 635a96fcd788..c00441a3d649 100644
>>> --- a/drivers/gpu/drm/i915/intel_drv.h
>>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>>> @@ -48,8 +48,9 @@
>>>     * having timed out, since the timeout could be due to preemption or similar and
>>>     * we've never had a chance to check the condition before the timeout.
>>>     */
>>> -#define _wait_for(COND, US, W) ({ \
>>> +#define _wait_for(COND, US, Wmin, Wmax) ({ \
>>>        unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1;   \
>>> +     long wait__ = (Wmin); /* recommended min for usleep is 10 us */ \
>>>        int ret__;                                                      \
>>>        might_sleep();                                                  \
>>>        for (;;) {                                                      \
>>> @@ -62,12 +63,14 @@
>>>                        ret__ = -ETIMEDOUT;                             \
>>>                        break;                                          \
>>>                }                                                       \
>>> -             usleep_range((W), (W) * 2);                             \
>>> +             usleep_range(wait__, wait__ * 2);                       \
>>> +             if (wait__ < (Wmax))                                    \
>>> +                     wait__ <<= 1;                                   \
>> I think we need to keep track of the total time we have waited,
>> otherwise we might wait for longer than necessary.
> This is not a precise wait, it's a sleep with a rough upper bound.
> Sleeps by their very nature are very rough, you are giving up the
> processor and telling it not to wake you before a certain time. We are
> not asking to be woken at that time, just not before.
>
>> E.g. for wait_for_us(COND, 900) this approach might actually lead to
>> a total sleep of 1270us.
> The numbers are irrelevant, they are merely round numbers from once upon
> a time using msleep(1).
Ok. Thanks for clarification.
> -Chris


