[Intel-gfx] [PATCH 2/2] drm/i915: Use atomic waits for short non-atomic ones

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Tue Jun 28 14:38:01 UTC 2016


On 28/06/16 14:53, Imre Deak wrote:
> On ti, 2016-06-28 at 14:29 +0100, Tvrtko Ursulin wrote:
>> On 28/06/16 13:19, Imre Deak wrote:
>>> On ti, 2016-06-28 at 12:51 +0100, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>
>>>> usleep_range is not recommended for waits shorter than 10us.
>>>>
>>>> Make wait_for_us use the atomic variant for such waits.
>>>>
>>>> To do so we need to disable the !in_atomic warning for such uses
>>>> and also disable preemption, since the macro is written in a way
>>>> that is only safe in atomic context (local_clock() and no second
>>>> COND check after the timeout).
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>>> Cc: Imre Deak <imre.deak at intel.com>
>>>> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/intel_drv.h | 29 +++++++++++++++++++++--------
>>>>    1 file changed, 21 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
>>>> index 3156d8df7921..e21bf6e6f119 100644
>>>> --- a/drivers/gpu/drm/i915/intel_drv.h
>>>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>>>> @@ -69,20 +69,21 @@
>>>>    })
>>>>
>>>>    #define wait_for(COND, MS)	  	_wait_for((COND), (MS) * 1000, 1000)
>>>> -#define wait_for_us(COND, US)	  	_wait_for((COND), (US), 1)
>>>>
>>>>    /* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. */
>>>>    #if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT)
>>>> -# define _WAIT_FOR_ATOMIC_CHECK WARN_ON_ONCE(!in_atomic())
>>>> +# define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) WARN_ON_ONCE((ATOMIC) && !in_atomic())
>>>>    #else
>>>> -# define _WAIT_FOR_ATOMIC_CHECK do { } while (0)
>>>> +# define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) do { } while (0)
>>>>    #endif
>>>>
>>>> -#define _wait_for_atomic(COND, US) ({ \
>>>> +#define _wait_for_atomic(COND, US, ATOMIC) ({ \
>>>>    	unsigned long end__; \
>>>>    	int ret__ = 0; \
>>>> -	_WAIT_FOR_ATOMIC_CHECK; \
>>>> -	BUILD_BUG_ON((US) > 50000); \
>>>> +	_WAIT_FOR_ATOMIC_CHECK(ATOMIC); \
>>>> +	BUILD_BUG_ON((ATOMIC) && (US) > 50000); \
>>>> +	if (!(ATOMIC)) \
>>>> +		preempt_disable(); \
>>>
>>> Disabling preemption for this purpose (scheduling a timeout) could be
>>> frowned upon, although for 10us it may not be an issue. Another
>>
>> Possibly, but I don't see how to otherwise do it.
>>
>> And about the number itself - I chose 10us just because usleep_range is
>> not recommended for <10us due to setup overhead.
>>
>>> possibility would be to use cpu_clock() instead which would have some
>>> overhead in case of scheduling away from the initial CPU, but we'd only
>>> incur it for the non-atomic <10us case, so would be negligible imo.
>>> You'd also have to re-check the condition with that solution.
>>
>> How would you implement it with cpu_clock? What would you do when
>> re-scheduled?
>
> By calculating the expiry at the beginning with cpu_clock()
> using raw_smp_processor_id(), and then calling cpu_clock() in
> time_after() with the same CPU id. cpu_clock() would then internally
> handle the scheduled-away scenario.
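
[Editor's note: for illustration, a minimal sketch of that cpu_clock()
approach, assuming the usual kernel headers are in scope. It is
hypothetical and not part of the posted patch; the name
_wait_for_us_cpu_clock and the ns-to-us conversion via ">> 10" are
assumptions borrowed from the style of the existing _wait_for_atomic().]

#define _wait_for_us_cpu_clock(COND, US) ({				\
	int cpu__ = raw_smp_processor_id();				\
	unsigned long end__ = (unsigned long)(cpu_clock(cpu__) >> 10)	\
			      + (US) + 1;				\
	int ret__ = 0;							\
	/* The deadline is pinned to cpu__; cpu_clock() compensates	\
	 * internally if we get scheduled away to another CPU.		\
	 */								\
	while (!(COND)) {						\
		if (time_after((unsigned long)(cpu_clock(cpu__) >> 10),	\
			       end__)) {				\
			/* Re-check COND once after the deadline, since	\
			 * we may have been scheduled out for a while.	\
			 */						\
			if (!(COND))					\
				ret__ = -ETIMEDOUT;			\
			break;						\
		}							\
		cpu_relax();						\
	}								\
	ret__;								\
})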

Right, but that is also not ideal since, if the two cpu_clocks differ, the 
running time domain is not identical to the timeout one. It probably 
would not matter, but it feels hacky.

>>> Also, could you explain how we can ignore hard IRQs, as hinted by the
>>> comment in _wait_for_atomic()?
>>
>> Hm, in retrospect it does not look safe. The upside is that after your
>> fixes from today it will be, since all remaining callers run with
>> interrupts disabled.
>
> Well, except for the GuC path, but that's for a 10ms timeout, so
> probably doesn't matter (or else we have a bigger problem).

I've just sent a patch for that.

>> And the downside is that the patch from this thread is then not safe
>> and would need the condition put back in. Possibly only in the !ATOMIC
>> case, but that might be too fragile for the future.
>
> I'd say we'd need the extra check at least whenever hard IRQs are not
> disabled. Even then there could be NMIs or some other background stuff
> (ME) that could be a problem. OTOH we'd incur the overhead from the
> extra check only in the exceptional timeout case, so I think doing it
> in all cases wouldn't be a big problem.

Yeah, I'll put it in.
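
[Editor's note: a rough sketch of how the agreed extra check could slot
into _wait_for_atomic(), illustrative only and not the committed
version. The parts of the macro body not shown in the diff above (the
local_clock()-based timing loop) are assumed from the pre-existing
macro.]

#define _wait_for_atomic(COND, US, ATOMIC) ({				\
	unsigned long end__;						\
	int ret__ = 0;							\
	_WAIT_FOR_ATOMIC_CHECK(ATOMIC);					\
	BUILD_BUG_ON((ATOMIC) && (US) > 50000);				\
	if (!(ATOMIC))							\
		preempt_disable();					\
	end__ = (unsigned long)(local_clock() >> 10) + (US) + 1;	\
	while (!(COND)) {						\
		if (time_after((unsigned long)(local_clock() >> 10),	\
			       end__)) {				\
			/* COND may have become true while we were	\
			 * interrupted, so check it once more before	\
			 * declaring a timeout.				\
			 */						\
			if (!(COND))					\
				ret__ = -ETIMEDOUT;			\
			break;						\
		}							\
		cpu_relax();						\
	}								\
	if (!(ATOMIC))							\
		preempt_enable();					\
	ret__;								\
})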

Regards,

Tvrtko
