[PATCH v6 03/25] drm/xe: Error handling in xe_force_wake_get()

Nilawar, Badal badal.nilawar at intel.com
Fri Oct 4 07:40:01 UTC 2024



On 03-10-2024 21:40, Ghimiray, Himal Prasad wrote:
> 
> 
> On 03-10-2024 17:53, Nilawar, Badal wrote:
>>
>>
>> On 30-09-2024 11:01, Himal Prasad Ghimiray wrote:
>>> If an acknowledgment timeout occurs for a forcewake domain awake
>>> request, do not increment the reference count for the domain. This
>>> ensures that subsequent _get calls do not incorrectly assume the domain
>>> is awake. The return value is a mask of domains that got refcounted,
>>> and these domains need to be provided for subsequent xe_force_wake_put
>>> call.
>>>
>>> While at it, add simple kernel-doc for xe_force_wake_get()
>>>
>>> v3
>>> - Use explicit type for mask (Michal/Badal)
>>> - Improve kernel-doc (Michal)
>>> - Use unsigned int instead of abusing enum (Michal)
>>>
>>> v5
>>> - Use unsigned int for return (MattB/Badal/Rodrigo)
>>> - use xe_gt_WARN for domain awake ack failure (Badal/Rodrigo)
>>>
>>> v6
>>> - Change XE_FORCEWAKE_ALL to single bit, this helps accommodate
>>> actually refcounted domains in return. (Michal)
>>> - Modify commit message and warn message (Badal)
>>> - Remove unnecessary information in kernel-doc (Michal)
>>>
>>> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>> Cc: Badal Nilawar <badal.nilawar at intel.com>
>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>> Cc: Nirmoy Das <nirmoy.das at intel.com>
>>> Reviewed-by: Badal Nilawar <badal.nilawar at intel.com>
>>> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
>>> ---
>>>   drivers/gpu/drm/xe/xe_force_wake.c       | 45 ++++++++++++++++++------
>>>   drivers/gpu/drm/xe/xe_force_wake.h       |  4 +--
>>>   drivers/gpu/drm/xe/xe_force_wake_types.h |  2 +-
>>>   3 files changed, 38 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_force_wake.c b/drivers/gpu/drm/xe/ 
>>> xe_force_wake.c
>>> index 5ce9e912818a..7f358e42c5d4 100644
>>> --- a/drivers/gpu/drm/xe/xe_force_wake.c
>>> +++ b/drivers/gpu/drm/xe/xe_force_wake.c
>>> @@ -160,29 +160,54 @@ static int domain_sleep_wait(struct xe_gt *gt,
>>>                        (ffs(tmp__) - 1))) && \
>>>                        domain__->reg_ctl.addr)
>>> -int xe_force_wake_get(struct xe_force_wake *fw,
>>> -              enum xe_force_wake_domains domains)
>>> +/**
>>> + * xe_force_wake_get() : Increase the domain refcount
>>> + * @fw: struct xe_force_wake
>>> + * @domains: forcewake domains to get refcount on
>>> + *
>>> + * This function takes references for the input @domains and wakes 
>>> them if
>>> + * they are asleep.If requested domain is ALL then only
>>> + * applicable/initialized domains will be considered for refcount 
>>> and it is
>>> + * a caller responsibilty to check returned ref if it includes any 
>>> specific
>>> + * domain by using xe_force_wake_ref_has_domain() function. caller 
>>> must call
>>> + * xe_force_wake_put() function to decrease incremented refcounts.
>>> + *
>>> + * Return: opaque reference to woken domains or zero if none of 
>>> requested
>>> + * domains were awake.
>>> + */
>>> +unsigned int xe_force_wake_get(struct xe_force_wake *fw,
>>> +                   enum xe_force_wake_domains domains)
>>>   {
>>>       struct xe_gt *gt = fw->gt;
>>>       struct xe_force_wake_domain *domain;
>>> -    enum xe_force_wake_domains tmp, woken = 0;
>>> +    unsigned int ref_incr = 0, awake_rqst = 0, awake_failed = 0;
>>> +    unsigned int tmp, ref_rqst;
>>>       unsigned long flags;
>>> -    int ret = 0;
>>
>> As we have fw->initialized_domains lets add check if valid domain is 
>> being passed and assert otherwise.
> 
> Sounds good check to have. Will add it.
> 
> 
>>
>> Regards,
>> Badal
>>
>>> +    ref_rqst = (domains == XE_FORCEWAKE_ALL) ? fw- 
>>> >initialized_domains : domains;
>>>       spin_lock_irqsave(&fw->lock, flags);
>>> -    for_each_fw_domain_masked(domain, domains, fw, tmp) {
>>> +    for_each_fw_domain_masked(domain, ref_rqst, fw, tmp) {
>>>           if (!domain->ref++) {
>>> -            woken |= BIT(domain->id);
>>> +            awake_rqst |= BIT(domain->id);
>>>               domain_wake(gt, domain);
>>>           }
>>> +        ref_incr |= BIT(domain->id);
>>>       }
>>> -    for_each_fw_domain_masked(domain, woken, fw, tmp) {
>>> -        ret |= domain_wake_wait(gt, domain);
>>> +    for_each_fw_domain_masked(domain, awake_rqst, fw, tmp) {
>>> +        if (domain_wake_wait(gt, domain) == 0) {
>>> +            fw->awake_domains |= BIT(domain->id);
>>> +        } else {
>>> +            awake_failed |= BIT(domain->id);
>>> +            --domain->ref;
>>> +        }
>>>       }
>>> -    fw->awake_domains |= woken;
>>> +    ref_incr &= ~awake_failed;
>>>       spin_unlock_irqrestore(&fw->lock, flags);
>>> -    return ret;
>>> +    xe_gt_WARN(gt, awake_failed, "Forcewake domain%s %#x failed to 
>>> acknowledge awake request\n",
>>> +           str_plural(hweight_long(awake_failed)), awake_failed);
>>> +
>>> +    return (ref_incr == fw->initialized_domains) ? ref_incr | 
>>> XE_FORCEWAKE_ALL : ref_incr;

How about we simply return ref_incr at this point? Then, in patch 2, 
given that we have a helper function available, we can validate ref_incr 
against fw->initialized_domains, particularly for XE_FORCEWAKE_ALL, 
within that helper function.

Regards,
Badal

>>>   }
>>>   int xe_force_wake_put(struct xe_force_wake *fw,
>>> diff --git a/drivers/gpu/drm/xe/xe_force_wake.h b/drivers/gpu/drm/xe/ 
>>> xe_force_wake.h
>>> index de720881a300..eb638128952d 100644
>>> --- a/drivers/gpu/drm/xe/xe_force_wake.h
>>> +++ b/drivers/gpu/drm/xe/xe_force_wake.h
>>> @@ -15,8 +15,8 @@ void xe_force_wake_init_gt(struct xe_gt *gt,
>>>                  struct xe_force_wake *fw);
>>>   void xe_force_wake_init_engines(struct xe_gt *gt,
>>>                   struct xe_force_wake *fw);
>>> -int xe_force_wake_get(struct xe_force_wake *fw,
>>> -              enum xe_force_wake_domains domains);
>>> +unsigned int xe_force_wake_get(struct xe_force_wake *fw,
>>> +                   enum xe_force_wake_domains domains);
>>>   int xe_force_wake_put(struct xe_force_wake *fw,
>>>                 enum xe_force_wake_domains domains);
>>> diff --git a/drivers/gpu/drm/xe/xe_force_wake_types.h b/drivers/gpu/ 
>>> drm/xe/xe_force_wake_types.h
>>> index fde17dc3d01e..899fbbcb3ea9 100644
>>> --- a/drivers/gpu/drm/xe/xe_force_wake_types.h
>>> +++ b/drivers/gpu/drm/xe/xe_force_wake_types.h
>>> @@ -48,7 +48,7 @@ enum xe_force_wake_domains {
>>>       XE_FW_MEDIA_VEBOX2    = BIT(XE_FW_DOMAIN_ID_MEDIA_VEBOX2),
>>>       XE_FW_MEDIA_VEBOX3    = BIT(XE_FW_DOMAIN_ID_MEDIA_VEBOX3),
>>>       XE_FW_GSC        = BIT(XE_FW_DOMAIN_ID_GSC),
>>> -    XE_FORCEWAKE_ALL    = BIT(XE_FW_DOMAIN_ID_COUNT) - 1
>>> +    XE_FORCEWAKE_ALL    = BIT(XE_FW_DOMAIN_ID_COUNT)
>>>   };
>>>   /**
>>
> 



More information about the Intel-xe mailing list