[Intel-gfx] [PATCH] drm/i915: Restore waitboost credit to the synchronous waiter

Mon Dec 7 09:38:43 PST 2015

On 12/01/2015 02:48 PM, Chris Wilson wrote:
> Ideally, we want to automagically have the GPU respond to the
> instantaneous load by reclocking itself. However, reclocking occurs
> relatively slowly, and to the client waiting for a result from the GPU,
> too late. To compensate and reduce the client latency, we allow the
> first wait from a client to boost the GPU clocks to maximum. This
> overcomes the lag in autoreclocking, at the expense of forcing the GPU
> clocks too high. So to offset the excessive power usage, we currently
> allow a client to only boost the clocks once before we detect the GPU
> is idle again. This works reasonably for, say, the first frame in a
> benchmark, but for many more synchronous workloads (like OpenCL) we find
> the GPU clocks remain too low. By noting a wait which would idle the GPU
> (i.e. we just waited upon the last known request), we can give that
> client the idle boost credit (for their next wait) without the 100ms
> delay required for us to detect the GPU idle state. The intention is to
> boost clients that are stalling in the process of feeding the GPU more
> work (and who in doing so let the GPU idle), without granting boost
> credits to clients that are throttling themselves (such as compositors).
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: "Zou, Nanhai" <nanhai.zou at intel.com>
> Cc: Jesse Barnes <jesse.barnes at intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 92598601a232..f5aef48b93db 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1312,6 +1312,22 @@ out:
>  			*timeout = 0;
>  	}
>  
> +	if (ret == 0 && rps && req->seqno == req->ring->last_submitted_seqno) {
> +		/* The GPU is now idle and this client has stalled.
> +		 * Since no other client has submitted a request in the
> +		 * meantime, assume that this client is the only one
> +		 * supplying work to the GPU but is unable to keep that
> +		 * work supplied because it is waiting. Since the GPU is
> +		 * then never kept fully busy, RPS autoclocking will
> +		 * keep the clocks relatively low, causing further delays.
> +		 * Compensate by giving the synchronous client credit for
> +		 * a waitboost next time.
> +		 */
> +		spin_lock(&req->i915->rps.client_lock);
> +		list_del_init(&rps->link);
> +		spin_unlock(&req->i915->rps.client_lock);
> +	}
> +
>  	return ret;
>  }
>  
> 
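
The credit accounting described in the commit message can be modelled
roughly as below. This is a minimal sketch under stated assumptions, not
the driver code: struct client, struct gpu, credit_used, wait_boost() and
wait_request() are hypothetical names, and a boolean stands in for the
rps->link list membership that the patch clears under rps.client_lock.
Locking, timeouts and the idle-detection path are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; these are not the i915 structures. */
struct client {
	/* Stands in for "rps->link is on the rps.clients list". */
	bool credit_used;
};

struct gpu {
	uint32_t last_submitted_seqno;
	unsigned int boosts;	/* how many times the clocks were boosted */
};

/* Boost at most once per credit; returns true if a boost was applied. */
static bool wait_boost(struct gpu *gpu, struct client *c)
{
	if (c->credit_used)
		return false;

	c->credit_used = true;
	gpu->boosts++;
	return true;
}

/* Model of a wait upon request 'seqno' that completed successfully. */
static bool wait_request(struct gpu *gpu, struct client *c, uint32_t seqno)
{
	bool boosted = wait_boost(gpu, c);

	/*
	 * The idea of the patch: if we just waited upon the last
	 * submitted request, the GPU is now idle and this client has
	 * stalled, so hand back its credit for the next wait instead
	 * of waiting for idle detection to do so.
	 */
	if (seqno == gpu->last_submitted_seqno)
		c->credit_used = false;

	return boosted;
}
```

In this model a client that repeatedly waits on the most recent request
(the synchronous case) is boosted on every wait, while a client that
waits on an older request while newer work is still queued (the
throttling case) is boosted only once.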

Still wishing we had a good way to benchmark these types of changes
across a range of workloads.  Eero, have you guys looked at turbo stuff
at all yet?

Also, is the boost logic only documented in misc commit messages?  Or do
we have a nice block of text somewhere describing the intent (which may
not match our implementation!) and how we try to achieve it?

Those are both new requests though, so no need to block this patch:
Reviewed-by: Jesse Barnes <jbarnes at virtuousgeek.org>

Thanks,
Jesse