[Intel-gfx] [PATCH v1] drm/i915/gen11: Preempt-to-idle support in execlists.
Chris Wilson
chris at chris-wilson.co.uk
Wed Mar 28 22:28:11 UTC 2018
Quoting Lis, Tomasz (2018-03-28 17:06:58)
>
>
> On 2018-03-28 01:27, Chris Wilson wrote:
> > Quoting Tomasz Lis (2018-03-27 16:17:59)
> >> The patch adds support of preempt-to-idle requesting by setting a proper
> >> bit within Execlist Control Register, and receiving preemption result from
> >> Context Status Buffer.
> >>
> >> Preemption in previous gens required a special batch buffer to be executed,
> >> so the Command Streamer never preempted to idle directly. In Icelake it is
> >> possible, as there is a hardware mechanism to inform the kernel about
> >> status of the preemption request.
> >>
> >> This patch does not cover using the new preemption mechanism when GuC is
> >> active.
> >>
> >> Bspec: 18922
> >> Signed-off-by: Tomasz Lis <tomasz.lis at intel.com>
> >> ---
> >> drivers/gpu/drm/i915/i915_drv.h | 2 ++
> >> drivers/gpu/drm/i915/i915_pci.c | 3 ++-
> >> drivers/gpu/drm/i915/intel_device_info.h | 1 +
> >> drivers/gpu/drm/i915/intel_lrc.c | 45 +++++++++++++++++++++++++++-----
> >> drivers/gpu/drm/i915/intel_lrc.h | 1 +
> >> 5 files changed, 45 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >> index 800230b..c32580b 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -2514,6 +2514,8 @@ intel_info(const struct drm_i915_private *dev_priv)
> >> ((dev_priv)->info.has_logical_ring_elsq)
> >> #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
> >> ((dev_priv)->info.has_logical_ring_preemption)
> >> +#define HAS_HW_PREEMPT_TO_IDLE(dev_priv) \
> >> + ((dev_priv)->info.has_hw_preempt_to_idle)
> >>
> >> #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> >> index 4364922..66b6700 100644
> >> --- a/drivers/gpu/drm/i915/i915_pci.c
> >> +++ b/drivers/gpu/drm/i915/i915_pci.c
> >> @@ -595,7 +595,8 @@ static const struct intel_device_info intel_cannonlake_info = {
> >> GEN(11), \
> >> .ddb_size = 2048, \
> >> .has_csr = 0, \
> >> - .has_logical_ring_elsq = 1
> >> + .has_logical_ring_elsq = 1, \
> >> + .has_hw_preempt_to_idle = 1
> >>
> >> static const struct intel_device_info intel_icelake_11_info = {
> >> GEN11_FEATURES,
> >> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> >> index 933e316..4eb97b5 100644
> >> --- a/drivers/gpu/drm/i915/intel_device_info.h
> >> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> >> @@ -98,6 +98,7 @@ enum intel_platform {
> >> func(has_logical_ring_contexts); \
> >> func(has_logical_ring_elsq); \
> >> func(has_logical_ring_preemption); \
> >> + func(has_hw_preempt_to_idle); \
> >> func(has_overlay); \
> >> func(has_pooled_eu); \
> >> func(has_psr); \
> >> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >> index ba7f783..1a22de4 100644
> >> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >> @@ -153,6 +153,7 @@
> >> #define GEN8_CTX_STATUS_ACTIVE_IDLE (1 << 3)
> >> #define GEN8_CTX_STATUS_COMPLETE (1 << 4)
> >> #define GEN8_CTX_STATUS_LITE_RESTORE (1 << 15)
> >> +#define GEN11_CTX_STATUS_PREEMPT_IDLE (1 << 29)
> >>
> >> #define GEN8_CTX_STATUS_COMPLETED_MASK \
> >> (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED)
> >> @@ -183,7 +184,9 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
> >> const struct i915_request *last,
> >> int prio)
> >> {
> >> - return engine->i915->preempt_context && prio > max(rq_prio(last), 0);
> >> + return (engine->i915->preempt_context ||
> >> + HAS_HW_PREEMPT_TO_IDLE(engine->i915)) &&
> > Well, you haven't actually disabled allocating the preempt_context so...
> Yes.. I had mixed feelings about changing needs_preempt_context() now,
> as that would mean adding a temporary condition on GuC until the GuC
> preemption is merged.
> I will add the conditions and disable the allocation in v2 of the patch.
> > But at any rate, making this an engine->flag would eliminate one pointer
> > dance.
> Could be an interesting idea for a separate patch.
To land first ;)
> >> + prio > max(rq_prio(last), 0);
> >> }
> >>
> >> /**
> >> @@ -535,6 +538,25 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
> >> execlists_set_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT);
> >> }
> >>
> >> +static void gen11_preempt_to_idle(struct intel_engine_cs *engine)
> >> +{
> >> + struct intel_engine_execlists *execlists = &engine->execlists;
> >> +
> >> + GEM_TRACE("%s\n", engine->name);
> >> +
> >> + /*
> >> + * hardware which HAS_HW_PREEMPT_TO_IDLE(), always also
> >> + * HAS_LOGICAL_RING_ELSQ(), so we can assume ctrl_reg is set
> >> + */
> >> + GEM_BUG_ON(execlists->ctrl_reg != NULL);
> >> +
> >> + /* trigger preemption to idle */
> >> + writel(EL_CTRL_PREEMPT_TO_IDLE, execlists->ctrl_reg);
> > Future plans? Because just inserting the branch into the setter of
> > inject_preempt_context() resolves a lot of conflicts with other work.
> My arguments for separate function are:
> - better code readability
> - keeping the symmetry between execlist and GuC flow - GuC preemption
> patches will introduce separate function as well
> - only 4 lines of the function would be common
> - the name inject_preempt_context() wouldn't match the new purpose, so
> renaming would be needed
> - reduced self-documenting code due to two separate preempt methods not
> having distinct names
>
> That's all, I don't have any future plans for it. If you want me to
> merge the two, let me know.
The problem that I am worrying about is that we will duplicate bunch of
other code, the actual ELS[PQ] write is the smaller portion. Plus we
already have the branch on something much more pleasant.
> >> @@ -962,10 +987,13 @@ static void execlists_submission_tasklet(unsigned long data)
> >> status, buf[2*head + 1],
> >> execlists->active);
> >>
> >> - if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE |
> >> - GEN8_CTX_STATUS_PREEMPTED))
> >> + /* Check if switched to active or preempted to active */
> >> + if ((status & (GEN8_CTX_STATUS_IDLE_ACTIVE |
> >> + GEN8_CTX_STATUS_PREEMPTED)) &&
> >> + !(status & GEN11_CTX_STATUS_PREEMPT_IDLE))
> > Setting HWACK here is harmless as it gets cleared again. Unless, there
> > is some oddity in the code flow.
> I will check if lack of the change affects test results.
> Personally, I would keep this change, even if only for allowing simple
> definition of what HWACK flag means.
The simple definition is the opposite one, imo. We set the flag after we
get the corresponding response from HW; any preemption or activate event
must follow the most recent ELSP write. So that will include the
preemption event following the preempt-idle write.
Then on deciding that the HW is idle, we apply the complication such
that execlists->active == 0. (That rule is what breaks the pattern.)
-Chris
More information about the Intel-gfx
mailing list