[Intel-gfx] [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request
Sagar Arun Kamble
sagar.a.kamble at intel.com
Mon Nov 27 07:25:05 UTC 2017
On 11/27/2017 12:50 PM, Sagar Arun Kamble wrote:
>
>
> On 11/26/2017 5:50 PM, Chris Wilson wrote:
>> An interesting discussion regarding "hybrid interrupt polling" for NVMe
>> came to the conclusion that the ideal busyspin before sleeping
>
> I think hybrid approach suggests sleep (1/2 duration) and then
> busyspin (1/2 duration)
> (for small I/O size), so this should be "busyspin after sleeping"
> although we are not doing exactly same.
>
>> was half
>> of the expected request latency (and better if it was already halfway
>> through that request). This suggested that we too should look again at
>> our tradeoff between spinning and waiting. Currently, our spin simply
>> tries to hide the cost of enabling the interrupt, which is good to avoid
>> penalising nop requests (i.e. test throughput) and not much else.
>> Studying real world workloads suggests that a spin of upto 500us can
>> dramatically boost performance, but the suggestion is that this is not
>> from avoiding interrupt latency per-se, but from secondary effects of
>> sleeping such as allowing the CPU reduce cstate and context switch away.
>>
>> v2: Expose the spin setting via Kconfig options for easier adjustment
>> and testing.
>> v3: Don't get caught sneaking in a change to the busyspin parameters.
>>
>> Suggested-by: Sagar Kamble <sagar.a.kamble at intel.com>
>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Sagar Kamble <sagar.a.kamble at intel.com>
>> Cc: Eero Tamminen <eero.t.tamminen at intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Cc: Ben Widawsky <ben at bwidawsk.net>
>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
>> Cc: Michał Winiarski <michal.winiarski at intel.com>
>> ---
>> drivers/gpu/drm/i915/Kconfig | 6 ++++++
>> drivers/gpu/drm/i915/Kconfig.profile | 23 +++++++++++++++++++++++
>> drivers/gpu/drm/i915/i915_gem_request.c | 28
>> ++++++++++++++++++++++++----
>> 3 files changed, 53 insertions(+), 4 deletions(-)
>> create mode 100644 drivers/gpu/drm/i915/Kconfig.profile
>>
>> diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
>> index dfd95889f4b7..eae90783f8f9 100644
>> --- a/drivers/gpu/drm/i915/Kconfig
>> +++ b/drivers/gpu/drm/i915/Kconfig
>> @@ -131,3 +131,9 @@ depends on DRM_I915
>> depends on EXPERT
>> source drivers/gpu/drm/i915/Kconfig.debug
>> endmenu
>> +
>> +menu "drm/i915 Profile Guided Optimisation"
>> + visible if EXPERT
>> + depends on DRM_I915
>> + source drivers/gpu/drm/i915/Kconfig.profile
>> +endmenu
>> diff --git a/drivers/gpu/drm/i915/Kconfig.profile
>> b/drivers/gpu/drm/i915/Kconfig.profile
>> new file mode 100644
>> index 000000000000..a1aed0e2aad5
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/Kconfig.profile
>> @@ -0,0 +1,23 @@
>> +config DRM_I915_SPIN_REQUEST_IRQ
>> + int
>> + default 5 # microseconds
>> + help
>> + Before sleeping waiting for a request (GPU operation) to
>> complete,
>> + we may spend some time polling for its completion. As the IRQ may
>> + take a non-negligible time to setup, we do a short spin first to
>> + check if the request will complete quickly.
>> +
>> + May be 0 to disable the initial spin.
>> +
>> +config DRM_I915_SPIN_REQUEST_CS
>> + int
>> + default 2 # microseconds
>> + help
>> + After sleeping for a request (GPU operation) to complete, we will
>> + be woken up on the completion of every request prior to the one
>> + being waited on. For very short requests, going back to sleep and
>> + be woken up again may add considerably to the wakeup latency. To
>> + avoid incurring extra latency from the scheduler, we may
>> choose to
>> + spin prior to sleeping again.
>> +
>> + May be 0 to disable spinning after being woken.
>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c
>> b/drivers/gpu/drm/i915/i915_gem_request.c
>> index a90bdd26571f..7ac72a0a949c 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_request.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
>> @@ -1198,8 +1198,21 @@ long i915_wait_request(struct
>> drm_i915_gem_request *req,
>> GEM_BUG_ON(!intel_wait_has_seqno(&wait));
>> GEM_BUG_ON(!i915_sw_fence_signaled(&req->submit));
>> - /* Optimistic short spin before touching IRQs */
>> - if (__i915_spin_request(req, wait.seqno, state, 5))
>> + /* Optimistic spin before touching IRQs.
>> + *
>> + * We may use a rather large value here to offset the penalty of
>> + * switching away from the active task. Frequently, the client will
>> + * wait upon an old swapbuffer to throttle itself to remain
>> within a
>> + * frame of the gpu. If the client is running in lockstep with
>> the gpu,
>> + * then it should not be waiting long at all, and a sleep now
>> will incur
>> + * extra scheduler latency in producing the next frame. So we sleep
>> + * for longer to try and keep the client running.
>> + *
>> + * We need ~5us to enable the irq, ~20us to hide a context switch.
>
> This comment fits more with next patch.
>
>> + */
>> + if (CONFIG_DRM_I915_SPIN_REQUEST_IRQ &&
>
> If this is set to 0 we still want to sleep and not go to complete.
Hit wicket :) ... Ignore this comment.
With above comments above commit message and comment update this change
looks fine to me.
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>
>> + __i915_spin_request(req, wait.seqno, state,
>> + CONFIG_DRM_I915_SPIN_REQUEST_IRQ))
>> goto complete;
>> set_current_state(state);
>> @@ -1255,8 +1268,15 @@ long i915_wait_request(struct
>> drm_i915_gem_request *req,
>> __i915_wait_request_check_and_reset(req))
>> continue;
>> - /* Only spin if we know the GPU is processing this request */
>> - if (__i915_spin_request(req, wait.seqno, state, 2))
>> + /*
>> + * A quick spin now we are on the CPU to offset the cost of
>> + * context switching away (and so spin for roughly the same as
>> + * the scheduler latency). We only spin if we know the GPU is
>> + * processing this request, and so likely to finish shortly.
>> + */
>> + if (CONFIG_DRM_I915_SPIN_REQUEST_CS &&
>
> Same here.
>
>> + __i915_spin_request(req, wait.seqno, state,
>> + CONFIG_DRM_I915_SPIN_REQUEST_CS))
>> break;
>> if (!intel_wait_check_request(&wait, req)) {
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
More information about the Intel-gfx
mailing list