[Intel-gfx] [PATCH 4/5] drm/i915: Do not lie about atomic wait granularity

Chris Wilson chris at chris-wilson.co.uk
Thu Mar 3 15:11:45 UTC 2016


On Thu, Mar 03, 2016 at 02:36:44PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> Currently the wait_for_atomic_us only allows for a millisecond
> granularity which is not nice towards callers requesting small
> micro-second waits.

Hmm, by granularity I think of the interval between COND reads. That
should be the limiting factor in how fast we respond to the change in
event (and so for the atomic variants should be virtually the same).

Your suggestion that it is icache or similar is probably along the right
path. A quick code size measurement of a test function would be a good
supporting argument.
 
> Re-implement it so micro-second granularity is really supported
> and not just in the name of the macro.
> 
> This has another beneficial side effect that it improves
> "gem_latency -n 100" results by approximately 2.5% (throughput
> and latencies) and 3% (CPU usage). (Note this improvement is
> relative to not yet merged execlist lock uncontention patch
> which moves the CSB MMIO outside this lock.)
> 
> v2:
>   * Warn when used from non-atomic context (if possible).
>   * Warn on too long atomic waits.
> 
> v3:
>   * Added comment explaining CONFIG_PREEMPT_COUNT.
>   * Fixed pre-processor indentation.
>   (Chris Wilson)
> 
> v4:
>  * Commit msg update (gem_latency) and rebase.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_drv.h | 31 +++++++++++++++++++++++++++++--
>  1 file changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index c2a62e9554b3..b6fcb5c800e7 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -44,6 +44,10 @@
>   * contexts. Note that it's important that we check the condition again after
>   * having timed out, since the timeout could be due to preemption or similar and
>   * we've never had a chance to check the condition before the timeout.
> + *
> + * TODO: When modesetting has fully transitioned to atomic, the below
> + * drm_can_sleep() can be removed and in_atomic()/!in_atomic() asserts
> + * added.
>   */
>  #define _wait_for(COND, US, W) ({ \
>  	unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1;	\
> @@ -66,8 +70,31 @@
>  #define wait_for(COND, MS)	  	_wait_for((COND), (MS) * 1000, 1000)
>  #define wait_for_us(COND, US)	  	_wait_for((COND), (US), 1)
>  
> -#define wait_for_atomic(COND, MS) 	_wait_for((COND), (MS) * 1000, 0)
> -#define wait_for_atomic_us(COND, US)	_wait_for((COND), (US), 0)
> +/* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. */
> +#if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT)
> +# define _WAIT_FOR_ATOMIC_CHECK WARN_ON_ONCE(!in_atomic())
> +#else
> +# define _WAIT_FOR_ATOMIC_CHECK do { } while(0)
> +#endif
> +
> +#define _wait_for_atomic(COND, US) ({ \
> +	unsigned long end__; \
> +	int ret__ = 0; \
> +	_WAIT_FOR_ATOMIC_CHECK; \
> +	BUILD_BUG_ON((US) > 50000); \
> +	end__ = (local_clock() >> 10) + (US) + 1; \
> +	while (!(COND)) { \
> +		if (time_after((unsigned long)(local_clock() >> 10), end__)) { \

A comment here about why we can forgo the COND recheck would be
worthwhile.

/* Unlike the regular wait_for(), this atomic variant cannot be
 * preempted (and we'll just ignore the issue of irq interruptions) and
 * so we know that no time has passed since the last check of COND and
 * can immediately report the timeout.
 */
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list