[PATCH] drm/xe/guc: Add more GuC CT states
Daniele Ceraolo Spurio
daniele.ceraolospurio at intel.com
Wed Dec 27 22:25:21 UTC 2023
On 12/27/2023 2:20 PM, Daniele Ceraolo Spurio wrote:
>
>
> On 12/27/2023 1:55 PM, Matthew Brost wrote:
>> On Fri, Dec 22, 2023 at 11:36:28AM -0800, Daniele Ceraolo Spurio wrote:
>>>
>>> On 12/21/2023 9:47 PM, Matthew Brost wrote:
>>>> On Thu, Dec 21, 2023 at 01:56:33PM -0800, Daniele Ceraolo Spurio
>>>> wrote:
>>>>> On 12/19/2023 9:28 AM, Matthew Brost wrote:
>>>>>> The GuC CT has more than just enabled / disabled states; it has
>>>>>> four: not initialized, disabled, drop messages, and enabled.
>>>>>> Change the code to reflect this. These states will enable proper
>>>>>> return codes from functions and therefore proper error messages.
>>>>> Can you explain a bit more in which situation we expect to drop
>>>>> messages and
>>>>> handle it? AFAICS not all callers waiting for a G2H reply can cope
>>>>> with the
>>>> Anything that requires a G2H reply must be able to cope with it
>>>> getting dropped, as the GuC can hang at any moment. Certainly all of
>>>> submission is designed this way, as are TLB invalidations. More on
>>>> that below. With everything able to cope with lost G2H, there is no
>>>> point in continuing to process G2H once a reset has started (or in
>>>> sending H2G either).
>>>>
>>>>> reply not coming; e.g. it looks like xe_gt_tlb_invalidation_wait()
>>>>> will
>>>> During a GT reset xe_gt_tlb_invalidation_reset() is called which will
>>>> signal all waiters for invalidations avoiding timeouts.
>>>>
>>>> So the flow roughly is:
>>>>
>>>> Set CT channel to drop messages
>>>> Stop all submissions
>>>> Do reset
>>>> Signal TLB invalidation waiters.
>>> Thanks for clarifying
>>>
>>>>> timeout and throw an error (which IMO is already an issue, because
>>>>> the reply
>>>>> might be lost due to reset). I know that currently in all cases in
>>>>> which we
>>>>> stop communication we do a reset, so the situation ends up ok, but
>>>>> there is
>>>>> a pending series to remove the reset in the runtime suspend/resume
>>>>> scenario
>>>>> (https://patchwork.freedesktop.org/series/122772/) in which case
>>>>> IMO we
>>>> In this path we would want to put the GuC communication into a state
>>>> where any message send / recv triggers an error (-ENODEV). We don't
>>>> expect to suspend the device and then send / recv messages. That is
>>>> the point of this patch - it is fine to drop messages during a reset,
>>>> but not during suspend or before the CT has been initialized.
>>> AFAIU one of the reasons behind this patch (internal report 53093)
>>> is an
>>> issue around the suspend path, so we do already receive messages
>>> after we
>>> started suspending. If I understand this patch correctly, we would
>>> put the
>>> CT in DROP_MESSAGES state on suspend, via the following chain:
>>>
>>> gt_suspend
>>> uc_suspend
>>> uc_stop
>>> guc_stop
>>> guc_ct_drop_messages
>>>
>>> Are you saying this is fine for now, because we always do a reset on
>>> resume,
>>> and that we'll need a new state when we stop doing such a reset? (not a
>>> complaint, just making sure I understood your reply).
>>>
>> I missed this path.... This is slightly different, as here we do not
>> call xe_gt_tlb_invalidation_reset(), but I think this does indeed
>> currently work. That may change based on some work Rodrigo is doing
>> related to finer-grained PM.
>>
>> This is turning into a larger discussion as it relates to reset /
>> suspend flows. I see 3 different flows, lets talk these through.
>>
>> 1. Reset flow (discussed above in my reply)
>>
>> Rough flow should be:
>> - Set CT channel to drop messages
>> - Stop all submissions
>> - Signal TLB invalidation waiters (this would be a change in location)
>> - Do reset
>> - Do restart (entails cleaning up any lost submission G2H, enable
>> CTs, starting submission)
>>
>> In this case we drop all pending H2G / G2H and the reset flow ensures we
>> recover properly. No flushing needed.
>>
>> 2. Suspend flow (discussed above in Daniele's reply)
>>
>> Rough flow should be for suspend:
>> - Set CT channel to drop messages
>> - Stop all submissions
>> - Flush G2H handler (just ensure worker is not running to race with
>> next step, this step is missing)
>> - Set CT channel to disallow messages
>>
>> Rough flow should be for resume:
>> - Do reset
>> - Do restart (entails cleaning up any lost submission G2H, enable
>> CTs, starting submission)
>>
>> In this case, between setting the CT channel to drop messages and
>> stopping submissions, teardowns of exec queues could be happening, as
>> this action likely doesn't take a PM ref that prevents turning off the
>> device (e.g. user destroying an exec queue). It is safe to just drop
>> these H2G / G2H as the GuC will be reloaded. We don't expect submissions
>> here though, as those should have a PM ref. Not sure we have a way to
>> enforce this yet but could add something if needed. After submissions
>> are stopped and the G2H handler flushed, we toggle the channel to
>> disallow any further messages.
>>
>> Also notice we don't signal TLB waiters here. I am thinking if a TLB
>> invalidation is in flight we likely don't power down the device. We
>> would ensure this via PM ref counting somehow. That being said, if the
>> device is powered off and we try to issue a new TLB invalidation, we
>> likely shouldn't issue it (this is an optimization that is not
>> required).
>
> Preamble: PM is not something I am too knowledgeable about, so there
> might be errors in what I'm saying below. Please correct me if I'm wrong.
>
> The PCI subsystem might not power down the device if we have an rpm
> ref, but we might still go through the pm_suspend call (as AFAIU that
> is not tied to the rpm refcount) and thus disable the CT
> communication. In i915 we have waiters in the S3 suspend flow to make
> sure everything is properly flushed out without relying on the rpm
> ref. Also note that we don't know if the device actually lost power,
> only that we got ready for it (not sure if there is a way to tell on
> the resume side), so if the device stayed awake we might end up having
> to actually do the TLB invalidation (and not just signal the waiters)
> given that, unlike i915, in Xe we don't do a full GT reset in the
> resume path. IMO just easier to make sure all the pending TLB invals
> have gone through before we complete the suspend flow.
Or we can add a GT reset in the resume flow, which will guarantee that
the TLBs (and other HW units) are always clean so we can safely drop
messages in the suspend path.
Daniele
>
>>
>> 3. Runtime suspend (relates to
>> https://patchwork.freedesktop.org/series/122772/)
>>
>> Rough flow should be for suspend:
>> - Stop all submissions
>> - Wait for all pending G2H to complete naturally
>> - Set CT channel to disallow messages
>>
>> Rough flow should be for resume:
>> - Enable CTs
>> - Start all submissions
>>
>> Rather than flushing G2H, we have to wait for any pending G2H to
>> complete before disallowing messages; as the GuC will not be reloaded,
>> we need to ensure it is in a known state.
>>
>> Here we'd have to wake the device on any new TLB invalidation too.
>
> This looks good to me.
>
> Daniele
>
>>
>> Does this seem correct and make sense?
>>
>> Matt
>>
>>>> Proper error messages will be added based on these new states.
>>>>
>>>>> don't want to drop messages but do a flush instead.
>>>>>
>>>> See above. Also unsure what you mean by flush here? Do you mean the
>>>> G2H
>>>> worker? I think that creates some dma-fencing (or lockdep)
>>>> nightmares if
>>>> we do that.
>>> I meant the G2H, yes. We've had a ton of problems on the i915 side
>>> with worker threads running parallel to the suspend code and trying
>>> to talk to the GuC (latest of which is
>>> https://patchwork.freedesktop.org/series/121916/), so I am kind of
>>> worried something similar could happen here.
>>>
>>> Daniele
>>>
>>>> Matt
>>>>
>>>>> Daniele
>>>>>
>>>>>> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>>>> Cc: Tejas Upadhyay <tejas.upadhyay at intel.com>
>>>>>> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>>>>>> ---
>>>>>> drivers/gpu/drm/xe/xe_guc.c | 4 +-
>>>>>> drivers/gpu/drm/xe/xe_guc_ct.c | 55
>>>>>> ++++++++++++++++++++--------
>>>>>> drivers/gpu/drm/xe/xe_guc_ct.h | 8 +++-
>>>>>> drivers/gpu/drm/xe/xe_guc_ct_types.h | 18 ++++++++-
>>>>>> 4 files changed, 64 insertions(+), 21 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc.c
>>>>>> b/drivers/gpu/drm/xe/xe_guc.c
>>>>>> index 482cb0df9f15..9b0fa8b1eb48 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_guc.c
>>>>>> +++ b/drivers/gpu/drm/xe/xe_guc.c
>>>>>> @@ -645,7 +645,7 @@ int xe_guc_mmio_send_recv(struct xe_guc *guc,
>>>>>> const u32 *request,
>>>>>> BUILD_BUG_ON(VF_SW_FLAG_COUNT != MED_VF_SW_FLAG_COUNT);
>>>>>> - xe_assert(xe, !guc->ct.enabled);
>>>>>> + xe_assert(xe, !xe_guc_ct_enabled(&guc->ct));
>>>>>> xe_assert(xe, len);
>>>>>> xe_assert(xe, len <= VF_SW_FLAG_COUNT);
>>>>>> xe_assert(xe, len <= MED_VF_SW_FLAG_COUNT);
>>>>>> @@ -827,7 +827,7 @@ int xe_guc_stop(struct xe_guc *guc)
>>>>>> {
>>>>>> int ret;
>>>>>> - xe_guc_ct_disable(&guc->ct);
>>>>>> + xe_guc_ct_drop_messages(&guc->ct);
>>>>>> ret = xe_guc_submit_stop(guc);
>>>>>> if (ret)
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c
>>>>>> b/drivers/gpu/drm/xe/xe_guc_ct.c
>>>>>> index 24a33fa36496..22d655a8bf9a 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
>>>>>> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
>>>>>> @@ -278,12 +278,25 @@ static int guc_ct_control_toggle(struct
>>>>>> xe_guc_ct *ct, bool enable)
>>>>>> return ret > 0 ? -EPROTO : ret;
>>>>>> }
>>>>>> +static void xe_guc_ct_set_state(struct xe_guc_ct *ct,
>>>>>> + enum xe_guc_ct_state state)
>>>>>> +{
>>>>>> + mutex_lock(&ct->lock); /* Serialise dequeue_one_g2h() */
>>>>>> + spin_lock_irq(&ct->fast_lock); /* Serialise CT fast-path */
>>>>>> +
>>>>>> + ct->g2h_outstanding = 0;
>>>>>> + ct->state = state;
>>>>>> +
>>>>>> + spin_unlock_irq(&ct->fast_lock);
>>>>>> + mutex_unlock(&ct->lock);
>>>>>> +}
>>>>>> +
>>>>>> int xe_guc_ct_enable(struct xe_guc_ct *ct)
>>>>>> {
>>>>>> struct xe_device *xe = ct_to_xe(ct);
>>>>>> int err;
>>>>>> - xe_assert(xe, !ct->enabled);
>>>>>> + xe_assert(xe, !xe_guc_ct_enabled(ct));
>>>>>> guc_ct_ctb_h2g_init(xe, &ct->ctbs.h2g, &ct->bo->vmap);
>>>>>> guc_ct_ctb_g2h_init(xe, &ct->ctbs.g2h, &ct->bo->vmap);
>>>>>> @@ -300,12 +313,7 @@ int xe_guc_ct_enable(struct xe_guc_ct *ct)
>>>>>> if (err)
>>>>>> goto err_out;
>>>>>> - mutex_lock(&ct->lock);
>>>>>> - spin_lock_irq(&ct->fast_lock);
>>>>>> - ct->g2h_outstanding = 0;
>>>>>> - ct->enabled = true;
>>>>>> - spin_unlock_irq(&ct->fast_lock);
>>>>>> - mutex_unlock(&ct->lock);
>>>>>> + xe_guc_ct_set_state(ct, XE_GUC_CT_STATE_ENABLED);
>>>>>> smp_mb();
>>>>>> wake_up_all(&ct->wq);
>>>>>> @@ -321,12 +329,12 @@ int xe_guc_ct_enable(struct xe_guc_ct *ct)
>>>>>> void xe_guc_ct_disable(struct xe_guc_ct *ct)
>>>>>> {
>>>>>> - mutex_lock(&ct->lock); /* Serialise dequeue_one_g2h() */
>>>>>> - spin_lock_irq(&ct->fast_lock); /* Serialise CT fast-path */
>>>>>> - ct->enabled = false; /* Finally disable CT communication */
>>>>>> - spin_unlock_irq(&ct->fast_lock);
>>>>>> - mutex_unlock(&ct->lock);
>>>>>> + xe_guc_ct_set_state(ct, XE_GUC_CT_STATE_DISABLED);
>>>>>> +}
>>>>>> +void xe_guc_ct_drop_messages(struct xe_guc_ct *ct)
>>>>>> +{
>>>>>> + xe_guc_ct_set_state(ct, XE_GUC_CT_STATE_DROP_MESSAGES);
>>>>>> xa_destroy(&ct->fence_lookup);
>>>>>> }
>>>>>> @@ -493,11 +501,19 @@ static int __guc_ct_send_locked(struct
>>>>>> xe_guc_ct *ct, const u32 *action,
>>>>>> goto out;
>>>>>> }
>>>>>> - if (unlikely(!ct->enabled)) {
>>>>>> + if (ct->state == XE_GUC_CT_STATE_NOT_INITIALIZED ||
>>>>>> + ct->state == XE_GUC_CT_STATE_DISABLED) {
>>>>>> ret = -ENODEV;
>>>>>> goto out;
>>>>>> }
>>>>>> + if (ct->state == XE_GUC_CT_STATE_DROP_MESSAGES) {
>>>>>> + ret = -ECANCELED;
>>>>>> + goto out;
>>>>>> + }
>>>>>> +
>>>>>> + xe_assert(xe, xe_guc_ct_enabled(ct));
>>>>>> +
>>>>>> if (g2h_fence) {
>>>>>> g2h_len = GUC_CTB_HXG_MSG_MAX_LEN;
>>>>>> num_g2h = 1;
>>>>>> @@ -682,7 +698,8 @@ static bool retry_failure(struct xe_guc_ct
>>>>>> *ct, int ret)
>>>>>> return false;
>>>>>> #define ct_alive(ct) \
>>>>>> - (ct->enabled && !ct->ctbs.h2g.info.broken &&
>>>>>> !ct->ctbs.g2h.info.broken)
>>>>>> + (xe_guc_ct_enabled(ct) && !ct->ctbs.h2g.info.broken && \
>>>>>> + !ct->ctbs.g2h.info.broken)
>>>>>> if (!wait_event_interruptible_timeout(ct->wq,
>>>>>> ct_alive(ct), HZ * 5))
>>>>>> return false;
>>>>>> #undef ct_alive
>>>>>> @@ -941,12 +958,18 @@ static int g2h_read(struct xe_guc_ct *ct,
>>>>>> u32 *msg, bool fast_path)
>>>>>> lockdep_assert_held(&ct->fast_lock);
>>>>>> - if (!ct->enabled)
>>>>>> + if (ct->state == XE_GUC_CT_STATE_NOT_INITIALIZED ||
>>>>>> + ct->state == XE_GUC_CT_STATE_DISABLED)
>>>>>> return -ENODEV;
>>>>>> + if (ct->state == XE_GUC_CT_STATE_DROP_MESSAGES)
>>>>>> + return -ECANCELED;
>>>>>> +
>>>>>> if (g2h->info.broken)
>>>>>> return -EPIPE;
>>>>>> + xe_assert(xe, xe_guc_ct_enabled(ct));
>>>>>> +
>>>>>> /* Calculate DW available to read */
>>>>>> tail = desc_read(xe, g2h, tail);
>>>>>> avail = tail - g2h->info.head;
>>>>>> @@ -1245,7 +1268,7 @@ struct xe_guc_ct_snapshot
>>>>>> *xe_guc_ct_snapshot_capture(struct xe_guc_ct *ct,
>>>>>> return NULL;
>>>>>> }
>>>>>> - if (ct->enabled) {
>>>>>> + if (xe_guc_ct_enabled(ct)) {
>>>>>> snapshot->ct_enabled = true;
>>>>>> snapshot->g2h_outstanding =
>>>>>> READ_ONCE(ct->g2h_outstanding);
>>>>>> guc_ctb_snapshot_capture(xe, &ct->ctbs.h2g,
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.h
>>>>>> b/drivers/gpu/drm/xe/xe_guc_ct.h
>>>>>> index f15f8a4857e0..214a6a357519 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_guc_ct.h
>>>>>> +++ b/drivers/gpu/drm/xe/xe_guc_ct.h
>>>>>> @@ -13,6 +13,7 @@ struct drm_printer;
>>>>>> int xe_guc_ct_init(struct xe_guc_ct *ct);
>>>>>> int xe_guc_ct_enable(struct xe_guc_ct *ct);
>>>>>> void xe_guc_ct_disable(struct xe_guc_ct *ct);
>>>>>> +void xe_guc_ct_drop_messages(struct xe_guc_ct *ct);
>>>>>> void xe_guc_ct_fast_path(struct xe_guc_ct *ct);
>>>>>> struct xe_guc_ct_snapshot *
>>>>>> @@ -22,10 +23,15 @@ void xe_guc_ct_snapshot_print(struct
>>>>>> xe_guc_ct_snapshot *snapshot,
>>>>>> void xe_guc_ct_snapshot_free(struct xe_guc_ct_snapshot
>>>>>> *snapshot);
>>>>>> void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer
>>>>>> *p, bool atomic);
>>>>>> +static inline bool xe_guc_ct_enabled(struct xe_guc_ct *ct)
>>>>>> +{
>>>>>> + return ct->state == XE_GUC_CT_STATE_ENABLED;
>>>>>> +}
>>>>>> +
>>>>>> static inline void xe_guc_ct_irq_handler(struct xe_guc_ct *ct)
>>>>>> {
>>>>>> wake_up_all(&ct->wq);
>>>>>> - if (ct->enabled)
>>>>>> + if (xe_guc_ct_enabled(ct))
>>>>>> queue_work(system_unbound_wq, &ct->g2h_worker);
>>>>>> xe_guc_ct_fast_path(ct);
>>>>>> }
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_ct_types.h
>>>>>> b/drivers/gpu/drm/xe/xe_guc_ct_types.h
>>>>>> index d814d4ee3fc6..e36c7029dffe 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_guc_ct_types.h
>>>>>> +++ b/drivers/gpu/drm/xe/xe_guc_ct_types.h
>>>>>> @@ -72,6 +72,20 @@ struct xe_guc_ct_snapshot {
>>>>>> struct guc_ctb_snapshot h2g;
>>>>>> };
>>>>>> +/**
>>>>>> + * enum xe_guc_ct_state - CT state
>>>>>> + * @XE_GUC_CT_STATE_NOT_INITIALIZED: CT not initialized, messages
>>>>>> not expected in this state
>>>>>> + * @XE_GUC_CT_STATE_DISABLED: CT disabled, messages not expected
>>>>>> in this state
>>>>>> + * @XE_GUC_CT_STATE_DROP_MESSAGES: CT drops messages without errors
>>>>>> + * @XE_GUC_CT_STATE_ENABLED: CT enabled, messages sent /
>>>>>> received in this state
>>>>>> + */
>>>>>> +enum xe_guc_ct_state {
>>>>>> + XE_GUC_CT_STATE_NOT_INITIALIZED = 0,
>>>>>> + XE_GUC_CT_STATE_DISABLED,
>>>>>> + XE_GUC_CT_STATE_DROP_MESSAGES,
>>>>>> + XE_GUC_CT_STATE_ENABLED,
>>>>>> +};
>>>>>> +
>>>>>> /**
>>>>>> * struct xe_guc_ct - GuC command transport (CT) layer
>>>>>> *
>>>>>> @@ -96,8 +110,8 @@ struct xe_guc_ct {
>>>>>> u32 g2h_outstanding;
>>>>>> /** @g2h_worker: worker to process G2H messages */
>>>>>> struct work_struct g2h_worker;
>>>>>> - /** @enabled: CT enabled */
>>>>>> - bool enabled;
>>>>>> + /** @state: CT state */
>>>>>> + enum xe_guc_ct_state state;
>>>>>> /** @fence_seqno: G2H fence seqno - 16 bits used by CT */
>>>>>> u32 fence_seqno;
>>>>>> /** @fence_lookup: G2H fence lookup */
>