[Intel-gfx] [PATCH v1] drm/i915/gen11: Preempt-to-idle support in execlists.

Fri Mar 30 19:45:43 UTC 2018

On 30/03/18 08:42, Lis, Tomasz wrote:
> 
> 
> On 2018-03-29 00:28, Chris Wilson wrote:
>> Quoting Lis, Tomasz (2018-03-28 17:06:58)
>>>
>>> On 2018-03-28 01:27, Chris Wilson wrote:
>>>> Quoting Tomasz Lis (2018-03-27 16:17:59)
>>>>> The patch adds support of preempt-to-idle requesting by setting a 
>>>>> proper
>>>>> bit within Execlist Control Register, and receiving preemption 
>>>>> result from
>>>>> Context Status Buffer.
>>>>>
>>>>> Preemption in previous gens required a special batch buffer to be 
>>>>> executed,
>>>>> so the Command Streamer never preempted to idle directly. In 
>>>>> Icelake it is
>>>>> possible, as there is a hardware mechanism to inform the kernel about
>>>>> status of the preemption request.
>>>>>
>>>>> This patch does not cover using the new preemption mechanism when 
>>>>> GuC is
>>>>> active.
>>>>>
>>>>> Bspec: 18922
>>>>> Signed-off-by: Tomasz Lis <tomasz.lis at intel.com>
>>>>> ---
>>>>>    drivers/gpu/drm/i915/i915_drv.h          |  2 ++
>>>>>    drivers/gpu/drm/i915/i915_pci.c          |  3 ++-
>>>>>    drivers/gpu/drm/i915/intel_device_info.h |  1 +
>>>>>    drivers/gpu/drm/i915/intel_lrc.c         | 45 
>>>>> +++++++++++++++++++++++++++-----
>>>>>    drivers/gpu/drm/i915/intel_lrc.h         |  1 +
>>>>>    5 files changed, 45 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index 800230b..c32580b 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -2514,6 +2514,8 @@ intel_info(const struct drm_i915_private 
>>>>> *dev_priv)
>>>>>                   ((dev_priv)->info.has_logical_ring_elsq)
>>>>>    #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
>>>>>                   ((dev_priv)->info.has_logical_ring_preemption)
>>>>> +#define HAS_HW_PREEMPT_TO_IDLE(dev_priv) \
>>>>> +               ((dev_priv)->info.has_hw_preempt_to_idle)
>>>>>    #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
>>>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c 
>>>>> b/drivers/gpu/drm/i915/i915_pci.c
>>>>> index 4364922..66b6700 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>>>> @@ -595,7 +595,8 @@ static const struct intel_device_info 
>>>>> intel_cannonlake_info = {
>>>>>           GEN(11), \
>>>>>           .ddb_size = 2048, \
>>>>>           .has_csr = 0, \
>>>>> -       .has_logical_ring_elsq = 1
>>>>> +       .has_logical_ring_elsq = 1, \
>>>>> +       .has_hw_preempt_to_idle = 1
>>>>>    static const struct intel_device_info intel_icelake_11_info = {
>>>>>           GEN11_FEATURES,
>>>>> diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
>>>>> b/drivers/gpu/drm/i915/intel_device_info.h
>>>>> index 933e316..4eb97b5 100644
>>>>> --- a/drivers/gpu/drm/i915/intel_device_info.h
>>>>> +++ b/drivers/gpu/drm/i915/intel_device_info.h
>>>>> @@ -98,6 +98,7 @@ enum intel_platform {
>>>>>           func(has_logical_ring_contexts); \
>>>>>           func(has_logical_ring_elsq); \
>>>>>           func(has_logical_ring_preemption); \
>>>>> +       func(has_hw_preempt_to_idle); \
>>>>>           func(has_overlay); \
>>>>>           func(has_pooled_eu); \
>>>>>           func(has_psr); \
>>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>>>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>>>> index ba7f783..1a22de4 100644
>>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>>> @@ -153,6 +153,7 @@
>>>>>    #define GEN8_CTX_STATUS_ACTIVE_IDLE    (1 << 3)
>>>>>    #define GEN8_CTX_STATUS_COMPLETE       (1 << 4)
>>>>>    #define GEN8_CTX_STATUS_LITE_RESTORE   (1 << 15)
>>>>> +#define GEN11_CTX_STATUS_PREEMPT_IDLE  (1 << 29)
>>>>>    #define GEN8_CTX_STATUS_COMPLETED_MASK \
>>>>>            (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED)
>>>>> @@ -183,7 +184,9 @@ static inline bool need_preempt(const struct 
>>>>> intel_engine_cs *engine,
>>>>>                                   const struct i915_request *last,
>>>>>                                   int prio)
>>>>>    {
>>>>> -       return engine->i915->preempt_context && prio > 
>>>>> max(rq_prio(last), 0);
>>>>> +       return (engine->i915->preempt_context ||
>>>>> +               HAS_HW_PREEMPT_TO_IDLE(engine->i915)) &&
>>>> Well, you haven't actually disabled allocating the preempt_context 
>>>> so...
>>> Yes.. I had mixed feelings about changing needs_preempt_context() now,
>>> as that would mean adding a temporary condition on GuC until the GuC
>>> preemption is merged.
>>> I will add the conditions and disable the allocation in v2 of the patch.
>>>> But at any rate, making this an engine->flag would eliminate one 
>>>> pointer
>>>> dance.
>>> Could be an interesting idea for a separate patch.
>> To land first ;)
> :)
> Sure, I can do that.
>>>>> +                prio > max(rq_prio(last), 0);
>>>>>    }
>>>>>    /**
>>>>> @@ -535,6 +538,25 @@ static void inject_preempt_context(struct 
>>>>> intel_engine_cs *engine)
>>>>>           execlists_set_active(&engine->execlists, 
>>>>> EXECLISTS_ACTIVE_PREEMPT);
>>>>>    }
>>>>> +static void gen11_preempt_to_idle(struct intel_engine_cs *engine)
>>>>> +{
>>>>> +       struct intel_engine_execlists *execlists = &engine->execlists;
>>>>> +
>>>>> +       GEM_TRACE("%s\n", engine->name);
>>>>> +
>>>>> +       /*
>>>>> +        * hardware which HAS_HW_PREEMPT_TO_IDLE(), always also
>>>>> +        * HAS_LOGICAL_RING_ELSQ(), so we can assume ctrl_reg is set
>>>>> +        */
>>>>> +       GEM_BUG_ON(execlists->ctrl_reg != NULL);
>>>>> +
>>>>> +       /* trigger preemption to idle */
>>>>> +       writel(EL_CTRL_PREEMPT_TO_IDLE, execlists->ctrl_reg);
>>>> Future plans? Because just inserting the branch into the setter of
>>>> inject_preempt_context() resolves a lot of conflicts with other work.
>>> My arguments for separate function are:
>>> - better code readability
>>> - keeping the symmetry between execlist and GuC flow - GuC preemption
>>> patches will introduce separate function as well
>>> - only 4 lines of the function would be common
>>> - the name inject_preempt_context() wouldn't match the new purpose, so
>>> renaming would be needed
>>> - reduced self-documenting code due to two separate preempt methods not
>>> having distinct names
>>>
>>> That's all, I don't have any future plans for it. If you want me to
>>> merge the two, let me know.
>> The problem that I am worrying about is that we will duplicate bunch of
>> other code, the actual ELS[PQ] write is the smaller portion. Plus we
>> already have the branch on something much more pleasant.
> I see. I don't know any details there, so I'm not able to weigh that.
> Just let me know whether this possible duplication outweights the 
> arguments I provided, and I will merge these functions.
> I'm not overly attached to my solution.
>>
>>>>> @@ -962,10 +987,13 @@ static void 
>>>>> execlists_submission_tasklet(unsigned long data)
>>>>>                                     status, buf[2*head + 1],
>>>>>                                     execlists->active);
>>>>> -                       if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE |
>>>>> -                                     GEN8_CTX_STATUS_PREEMPTED))
>>>>> +                       /* Check if switched to active or preempted 
>>>>> to active */
>>>>> +                       if ((status & (GEN8_CTX_STATUS_IDLE_ACTIVE |
>>>>> +                                       GEN8_CTX_STATUS_PREEMPTED)) &&
>>>>> +                           !(status & GEN11_CTX_STATUS_PREEMPT_IDLE))
>>>> Setting HWACK here is harmless as it gets cleared again. Unless, there
>>>> is some oddity in the code flow.
>>> I will check if lack of the change affects test results.
>>> Personally, I would keep this change, even if only for allowing simple
>>> definition of what HWACK flag means.
>> The simple definition is the opposite one, imo. We set the flag after we
>> get the corresponding response from HW; any preemption or activate event
>> must follow the most recent ELSP write. So that will include the
>> preemption event following the preempt-idle write.
>>
>> Then on deciding that the HW is idle, we apply the complication such
>> that execlists->active == 0. (That rule is what breaks the pattern.)
>> -Chris
> Ok, I will remove this unnecessary condition.
> I tested this and lack of it doesn't seem to affect the results.
> (I'll be out next week; expect v2 when I'm back)
> -Tomasz
> 

Do we have any test to cover a preempt to idle on already idle HW (which 
is the case we cover with this flag here)? If not maybe we cold add a 
selftest for that.

Daniele

> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx