[Intel-gfx] [PATCH v1] drm/i915/gen11: Preempt-to-idle support in execlists.

Fri Mar 30 15:42:02 UTC 2018


On 2018-03-29 00:28, Chris Wilson wrote:
> Quoting Lis, Tomasz (2018-03-28 17:06:58)
>>
>> On 2018-03-28 01:27, Chris Wilson wrote:
>>> Quoting Tomasz Lis (2018-03-27 16:17:59)
>>>> The patch adds support of preempt-to-idle requesting by setting a proper
>>>> bit within Execlist Control Register, and receiving preemption result from
>>>> Context Status Buffer.
>>>>
>>>> Preemption in previous gens required a special batch buffer to be executed,
>>>> so the Command Streamer never preempted to idle directly. In Icelake it is
>>>> possible, as there is a hardware mechanism to inform the kernel about
>>>> status of the preemption request.
>>>>
>>>> This patch does not cover using the new preemption mechanism when GuC is
>>>> active.
>>>>
>>>> Bspec: 18922
>>>> Signed-off-by: Tomasz Lis <tomasz.lis at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_drv.h          |  2 ++
>>>>    drivers/gpu/drm/i915/i915_pci.c          |  3 ++-
>>>>    drivers/gpu/drm/i915/intel_device_info.h |  1 +
>>>>    drivers/gpu/drm/i915/intel_lrc.c         | 45 +++++++++++++++++++++++++++-----
>>>>    drivers/gpu/drm/i915/intel_lrc.h         |  1 +
>>>>    5 files changed, 45 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>> index 800230b..c32580b 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -2514,6 +2514,8 @@ intel_info(const struct drm_i915_private *dev_priv)
>>>>                   ((dev_priv)->info.has_logical_ring_elsq)
>>>>    #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
>>>>                   ((dev_priv)->info.has_logical_ring_preemption)
>>>> +#define HAS_HW_PREEMPT_TO_IDLE(dev_priv) \
>>>> +               ((dev_priv)->info.has_hw_preempt_to_idle)
>>>>    
>>>>    #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
>>>>    
>>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
>>>> index 4364922..66b6700 100644
>>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>>> @@ -595,7 +595,8 @@ static const struct intel_device_info intel_cannonlake_info = {
>>>>           GEN(11), \
>>>>           .ddb_size = 2048, \
>>>>           .has_csr = 0, \
>>>> -       .has_logical_ring_elsq = 1
>>>> +       .has_logical_ring_elsq = 1, \
>>>> +       .has_hw_preempt_to_idle = 1
>>>>    
>>>>    static const struct intel_device_info intel_icelake_11_info = {
>>>>           GEN11_FEATURES,
>>>> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
>>>> index 933e316..4eb97b5 100644
>>>> --- a/drivers/gpu/drm/i915/intel_device_info.h
>>>> +++ b/drivers/gpu/drm/i915/intel_device_info.h
>>>> @@ -98,6 +98,7 @@ enum intel_platform {
>>>>           func(has_logical_ring_contexts); \
>>>>           func(has_logical_ring_elsq); \
>>>>           func(has_logical_ring_preemption); \
>>>> +       func(has_hw_preempt_to_idle); \
>>>>           func(has_overlay); \
>>>>           func(has_pooled_eu); \
>>>>           func(has_psr); \
>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>>> index ba7f783..1a22de4 100644
>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>> @@ -153,6 +153,7 @@
>>>>    #define GEN8_CTX_STATUS_ACTIVE_IDLE    (1 << 3)
>>>>    #define GEN8_CTX_STATUS_COMPLETE       (1 << 4)
>>>>    #define GEN8_CTX_STATUS_LITE_RESTORE   (1 << 15)
>>>> +#define GEN11_CTX_STATUS_PREEMPT_IDLE  (1 << 29)
>>>>    
>>>>    #define GEN8_CTX_STATUS_COMPLETED_MASK \
>>>>            (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED)
>>>> @@ -183,7 +184,9 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
>>>>                                   const struct i915_request *last,
>>>>                                   int prio)
>>>>    {
>>>> -       return engine->i915->preempt_context && prio > max(rq_prio(last), 0);
>>>> +       return (engine->i915->preempt_context ||
>>>> +               HAS_HW_PREEMPT_TO_IDLE(engine->i915)) &&
>>> Well, you haven't actually disabled allocating the preempt_context so...
>> Yes.. I had mixed feelings about changing needs_preempt_context() now,
>> as that would mean adding a temporary condition on GuC until the GuC
>> preemption is merged.
>> I will add the conditions and disable the allocation in v2 of the patch.
>>> But at any rate, making this an engine->flag would eliminate one pointer
>>> dance.
>> Could be an interesting idea for a separate patch.
> To land first ;)
:)
Sure, I can do that.
>>>> +                prio > max(rq_prio(last), 0);
>>>>    }
>>>>    
>>>>    /**
>>>> @@ -535,6 +538,25 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
>>>>           execlists_set_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT);
>>>>    }
>>>>    
>>>> +static void gen11_preempt_to_idle(struct intel_engine_cs *engine)
>>>> +{
>>>> +       struct intel_engine_execlists *execlists = &engine->execlists;
>>>> +
>>>> +       GEM_TRACE("%s\n", engine->name);
>>>> +
>>>> +       /*
>>>> +        * hardware which HAS_HW_PREEMPT_TO_IDLE(), always also
>>>> +        * HAS_LOGICAL_RING_ELSQ(), so we can assume ctrl_reg is set
>>>> +        */
>>>> +       GEM_BUG_ON(execlists->ctrl_reg != NULL);
>>>> +
>>>> +       /* trigger preemption to idle */
>>>> +       writel(EL_CTRL_PREEMPT_TO_IDLE, execlists->ctrl_reg);
>>> Future plans? Because just inserting the branch into the setter of
>>> inject_preempt_context() resolves a lot of conflicts with other work.
>> My arguments for separate function are:
>> - better code readability
>> - keeping the symmetry between execlist and GuC flow - GuC preemption
>> patches will introduce separate function as well
>> - only 4 lines of the function would be common
>> - the name inject_preempt_context() wouldn't match the new purpose, so
>> renaming would be needed
>> - reduced self-documenting code due to two separate preempt methods not
>> having distinct names
>>
>> That's all, I don't have any future plans for it. If you want me to
>> merge the two, let me know.
> The problem that I am worrying about is that we will duplicate bunch of
> other code, the actual ELS[PQ] write is the smaller portion. Plus we
> already have the branch on something much more pleasant.
I see. I don't know any details there, so I'm not able to weigh that.
Just let me know whether this possible duplication outweights the 
arguments I provided, and I will merge these functions.
I'm not overly attached to my solution.
>
>>>> @@ -962,10 +987,13 @@ static void execlists_submission_tasklet(unsigned long data)
>>>>                                     status, buf[2*head + 1],
>>>>                                     execlists->active);
>>>>    
>>>> -                       if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE |
>>>> -                                     GEN8_CTX_STATUS_PREEMPTED))
>>>> +                       /* Check if switched to active or preempted to active */
>>>> +                       if ((status & (GEN8_CTX_STATUS_IDLE_ACTIVE |
>>>> +                                       GEN8_CTX_STATUS_PREEMPTED)) &&
>>>> +                           !(status & GEN11_CTX_STATUS_PREEMPT_IDLE))
>>> Setting HWACK here is harmless as it gets cleared again. Unless, there
>>> is some oddity in the code flow.
>> I will check if lack of the change affects test results.
>> Personally, I would keep this change, even if only for allowing simple
>> definition of what HWACK flag means.
> The simple definition is the opposite one, imo. We set the flag after we
> get the corresponding response from HW; any preemption or activate event
> must follow the most recent ELSP write. So that will include the
> preemption event following the preempt-idle write.
>
> Then on deciding that the HW is idle, we apply the complication such
> that execlists->active == 0. (That rule is what breaks the pattern.)
> -Chris
Ok, I will remove this unnecessary condition.
I tested this and lack of it doesn't seem to affect the results.
(I'll be out next week; expect v2 when I'm back)
-Tomasz