[Intel-gfx] [PATCH 2/2] drm/i915: Use atomic waits for short non-atomic ones

Imre Deak imre.deak at intel.com
Tue Jun 28 13:53:02 UTC 2016


On ti, 2016-06-28 at 14:29 +0100, Tvrtko Ursulin wrote:
> On 28/06/16 13:19, Imre Deak wrote:
> > On ti, 2016-06-28 at 12:51 +0100, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > 
> > > usleep_range is not recommended for waits shorter than 10us.
> > > 
> > > Make the wait_for_us use the atomic variant for such waits.
> > > 
> > > To do so we need to disable the !in_atomic warning for such uses
> > > and also disable preemption since the macro is written in a way
> > > to only be safe to be used in atomic context (local_clock() and
> > > no second COND check after the timeout).
> > > 
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Imre Deak <imre.deak at intel.com>
> > > Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> > > ---
> > >   drivers/gpu/drm/i915/intel_drv.h | 29 +++++++++++++++++++++--------
> > >   1 file changed, 21 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > > index 3156d8df7921..e21bf6e6f119 100644
> > > --- a/drivers/gpu/drm/i915/intel_drv.h
> > > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > > @@ -69,20 +69,21 @@
> > >   })
> > > 
> > >   #define wait_for(COND, MS)	  	_wait_for((COND), (MS) * 1000, 1000)
> > > -#define wait_for_us(COND, US)	  	_wait_for((COND), (US), 1)
> > > 
> > >   /* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. */
> > >   #if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT)
> > > -# define _WAIT_FOR_ATOMIC_CHECK WARN_ON_ONCE(!in_atomic())
> > > +# define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) WARN_ON_ONCE((ATOMIC) && !in_atomic())
> > >   #else
> > > -# define _WAIT_FOR_ATOMIC_CHECK do { } while (0)
> > > +# define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) do { } while (0)
> > >   #endif
> > > 
> > > -#define _wait_for_atomic(COND, US) ({ \
> > > +#define _wait_for_atomic(COND, US, ATOMIC) ({ \
> > >   	unsigned long end__; \
> > >   	int ret__ = 0; \
> > > -	_WAIT_FOR_ATOMIC_CHECK; \
> > > -	BUILD_BUG_ON((US) > 50000); \
> > > +	_WAIT_FOR_ATOMIC_CHECK(ATOMIC); \
> > > +	BUILD_BUG_ON((ATOMIC) && (US) > 50000); \
> > > +	if (!(ATOMIC)) \
> > > +		preempt_disable(); \
> > 
> > Disabling preemption for this purpose (scheduling a timeout) could be
> > frowned upon, although for 10us may be not an issue. Another
> 
> Possibly, but I don't see how to otherwise do it.
> 
> And about the number itself - I chose 10us just because usleep_range is 
> not recommended for <10us due to setup overhead.
> 
> > possibility would be to use cpu_clock() instead which would have some
> > overhead in case of scheduling away from the initial CPU, but we'd only
> > incur it for the non-atomic <10us case, so would be negligible imo.
> > You'd also have to re-check the condition with that solution.
> 
> How would you implement it with cpu_clock? What would you do when 
> re-scheduled?

By calculating the expiry in the beginning with cpu_clock()
using raw_smp_processor_id() and then calling cpu_clock() in
time_after() with the same CPU id. cpu_clock() would then internally
handle the scheduling away scenario.

> > Also could you explain how can we ignore hard IRQs as hinted by the
> > comment in _wait_for_atomic()?
> 
> Hm, in retrospect it does not look safe. The upside is that after your 
> fixes from today it will be, since all the remaining callers run with 
> interrupts disabled.

Well, except for the GuC path, but that's for a 10ms timeout, so
probably doesn't matter (or else we have a bigger problem).

> And the downside is that the patch from this thread is not safe then 
> and would need the condition put back in. Possibly only in the !ATOMIC 
> case, but that might be too fragile for the future.

I'd say we'd need the extra check at least whenever hard IRQs are not
disabled. Even then there could be NMIs or some other background stuff
(ME) that could be a problem. OTOH we'd incur the overhead from the
extra check only in the exceptional timeout case, so I think doing it
in all cases wouldn't be a big problem.

--Imre