[Intel-gfx] [PATCH v3] drm/i915: Optimistically spin for the request completion
Daniel Vetter
daniel at ffwll.ch
Fri Mar 20 07:54:01 PDT 2015
On Thu, Mar 19, 2015 at 03:16:15PM +0000, Chris Wilson wrote:
> On Thu, Mar 12, 2015 at 11:11:17AM +0000, Chris Wilson wrote:
> > This provides a nice boost to mesa in swap bound scenarios (as mesa
> > throttles itself to the previous frame and given the scenario that will
> > complete shortly). It will also provide a good boost to systems running
> > with semaphores disabled and so frequently waiting on the GPU as it
> > switches rings. In the most favourable of microbenchmarks, this can
> > increase performance by around 15% - though in practice improvements
> > will be marginal and rarely noticeable.
> >
> > v2: Account for user timeouts
> > v3: Limit the spinning to a single jiffie (~1us) at most. On an
> > otherwise idle system, there is no scheduler contention and so without a
> > limit we would spin until the GPU is ready.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
>
> Just recording ideas for the future. Replace the busy-spin with
> monitor/mwait. This requires Pentium4+, a cooperating GPU with working
> cacheline snooping and that we use HWS seqno.
Just for the record: Did it help with powersaving or was it all in the
noise?
-Daniel
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 85e71e0e2340..454a38d4caa3 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -37,6 +37,7 @@
> #include <linux/swap.h>
> #include <linux/pci.h>
> #include <linux/dma-buf.h>
> +#include <asm/mwait.h>
>
> #define RQ_BUG_ON(expr)
>
> @@ -1187,18 +1188,42 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
> unsigned long timeout;
> int ret = -EBUSY;
>
> + if (ring->irq_refcount) /* IRQ is already active, keep using it */
> + return ret;
> +
> intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> timeout = jiffies + 1;
> - while (!need_resched()) {
> - if (i915_gem_request_completed(req, true)) {
> - ret = 0;
> - goto out;
> - }
> + if (this_cpu_has(X86_FEATURE_MWAIT)) {
> + do {
> + unsigned long ecx = 1; /* break on interrupt */
> + unsigned long eax = 0; /* cstate */
>
> - if (time_after_eq(jiffies, timeout))
> - break;
> + __monitor((void *)&ring->status_page.page_addr[I915_GEM_HWS_INDEX], 0, 0);
> + if (need_resched())
> + break;
> +
> + if (i915_gem_request_completed(req, true)) {
> + ret = 0;
> + goto out;
> + }
>
> - cpu_relax_lowlatency();
> + if (time_after_eq(jiffies, timeout))
> + break;
> +
> + __mwait(eax, ecx);
> + } while (1);
> + } else {
> + while (!need_resched()) {
> + if (i915_gem_request_completed(req, true)) {
> + ret = 0;
> + goto out;
> + }
> +
> + if (time_after_eq(jiffies, timeout))
> + break;
> +
> + cpu_relax_lowlatency();
> + }
> }
> if (i915_gem_request_completed(req, false))
> ret = 0;
>
>
> --
> Chris Wilson, Intel Open Source Technology Centre
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the Intel-gfx
mailing list