[PATCH v6 03/25] drm/xe: Error handling in xe_force_wake_get()

Ghimiray, Himal Prasad himal.prasad.ghimiray at intel.com
Mon Oct 7 03:14:08 UTC 2024



On 04-10-2024 13:10, Nilawar, Badal wrote:
> 
> 
> On 03-10-2024 21:40, Ghimiray, Himal Prasad wrote:
>>
>>
>> On 03-10-2024 17:53, Nilawar, Badal wrote:
>>>
>>>
>>> On 30-09-2024 11:01, Himal Prasad Ghimiray wrote:
>>>> If an acknowledgment timeout occurs for a forcewake domain awake
>>>> request, do not increment the reference count for the domain. This
>>>> ensures that subsequent _get calls do not incorrectly assume the domain
>>>> is awake. The return value is a mask of domains that got refcounted,
>>>> and these domains need to be provided for subsequent xe_force_wake_put
>>>> call.
>>>>
>>>> While at it, add simple kernel-doc for xe_force_wake_get()
>>>>
>>>> v3
>>>> - Use explicit type for mask (Michal/Badal)
>>>> - Improve kernel-doc (Michal)
>>>> - Use unsigned int instead of abusing enum (Michal)
>>>>
>>>> v5
>>>> - Use unsigned int for return (MattB/Badal/Rodrigo)
>>>> - use xe_gt_WARN for domain awake ack failure (Badal/Rodrigo)
>>>>
>>>> v6
>>>> - Change XE_FORCEWAKE_ALL to single bit, this helps accommodate
>>>> actually refcounted domains in return. (Michal)
>>>> - Modify commit message and warn message (Badal)
>>>> - Remove unnecessary information in kernel-doc (Michal)
>>>>
>>>> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>> Cc: Badal Nilawar <badal.nilawar at intel.com>
>>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>>> Cc: Nirmoy Das <nirmoy.das at intel.com>
>>>> Reviewed-by: Badal Nilawar <badal.nilawar at intel.com>
>>>> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
>>>> ---
>>>>   drivers/gpu/drm/xe/xe_force_wake.c       | 45 +++++++++++++++++ 
>>>> +------
>>>>   drivers/gpu/drm/xe/xe_force_wake.h       |  4 +--
>>>>   drivers/gpu/drm/xe/xe_force_wake_types.h |  2 +-
>>>>   3 files changed, 38 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_force_wake.c b/drivers/gpu/drm/ 
>>>> xe/ xe_force_wake.c
>>>> index 5ce9e912818a..7f358e42c5d4 100644
>>>> --- a/drivers/gpu/drm/xe/xe_force_wake.c
>>>> +++ b/drivers/gpu/drm/xe/xe_force_wake.c
>>>> @@ -160,29 +160,54 @@ static int domain_sleep_wait(struct xe_gt *gt,
>>>>                        (ffs(tmp__) - 1))) && \
>>>>                        domain__->reg_ctl.addr)
>>>> -int xe_force_wake_get(struct xe_force_wake *fw,
>>>> -              enum xe_force_wake_domains domains)
>>>> +/**
>>>> + * xe_force_wake_get() : Increase the domain refcount
>>>> + * @fw: struct xe_force_wake
>>>> + * @domains: forcewake domains to get refcount on
>>>> + *
>>>> + * This function takes references for the input @domains and wakes 
>>>> them if
>>>> + * they are asleep.If requested domain is ALL then only
>>>> + * applicable/initialized domains will be considered for refcount 
>>>> and it is
>>>> + * a caller responsibilty to check returned ref if it includes any 
>>>> specific
>>>> + * domain by using xe_force_wake_ref_has_domain() function. caller 
>>>> must call
>>>> + * xe_force_wake_put() function to decrease incremented refcounts.
>>>> + *
>>>> + * Return: opaque reference to woken domains or zero if none of 
>>>> requested
>>>> + * domains were awake.
>>>> + */
>>>> +unsigned int xe_force_wake_get(struct xe_force_wake *fw,
>>>> +                   enum xe_force_wake_domains domains)
>>>>   {
>>>>       struct xe_gt *gt = fw->gt;
>>>>       struct xe_force_wake_domain *domain;
>>>> -    enum xe_force_wake_domains tmp, woken = 0;
>>>> +    unsigned int ref_incr = 0, awake_rqst = 0, awake_failed = 0;
>>>> +    unsigned int tmp, ref_rqst;
>>>>       unsigned long flags;
>>>> -    int ret = 0;
>>>
>>> As we have fw->initialized_domains lets add check if valid domain is 
>>> being passed and assert otherwise.
>>
>> Sounds good check to have. Will add it.
>>
>>
>>>
>>> Regards,
>>> Badal
>>>
>>>> +    ref_rqst = (domains == XE_FORCEWAKE_ALL) ? fw- 
>>>> >initialized_domains : domains;
>>>>       spin_lock_irqsave(&fw->lock, flags);
>>>> -    for_each_fw_domain_masked(domain, domains, fw, tmp) {
>>>> +    for_each_fw_domain_masked(domain, ref_rqst, fw, tmp) {
>>>>           if (!domain->ref++) {
>>>> -            woken |= BIT(domain->id);
>>>> +            awake_rqst |= BIT(domain->id);
>>>>               domain_wake(gt, domain);
>>>>           }
>>>> +        ref_incr |= BIT(domain->id);
>>>>       }
>>>> -    for_each_fw_domain_masked(domain, woken, fw, tmp) {
>>>> -        ret |= domain_wake_wait(gt, domain);
>>>> +    for_each_fw_domain_masked(domain, awake_rqst, fw, tmp) {
>>>> +        if (domain_wake_wait(gt, domain) == 0) {
>>>> +            fw->awake_domains |= BIT(domain->id);
>>>> +        } else {
>>>> +            awake_failed |= BIT(domain->id);
>>>> +            --domain->ref;
>>>> +        }
>>>>       }
>>>> -    fw->awake_domains |= woken;
>>>> +    ref_incr &= ~awake_failed;
>>>>       spin_unlock_irqrestore(&fw->lock, flags);
>>>> -    return ret;
>>>> +    xe_gt_WARN(gt, awake_failed, "Forcewake domain%s %#x failed to 
>>>> acknowledge awake request\n",
>>>> +           str_plural(hweight_long(awake_failed)), awake_failed);
>>>> +
>>>> +    return (ref_incr == fw->initialized_domains) ? ref_incr | 
>>>> XE_FORCEWAKE_ALL : ref_incr;
> 
> How about we simply return ref_incr at this point? Then, in patch 2, 
> given that we have a helper function available, we can validate ref_incr 
> against fw->initialized_domains, particularly for XE_FORCEWAKE_ALL, 
> within that helper function.


That approach is perfectly fine as well. The reasoning for this method 
was to maintain the independence of the API/helper from the fw.


> 
> Regards,
> Badal
> 
>>>>   }
>>>>   int xe_force_wake_put(struct xe_force_wake *fw,
>>>> diff --git a/drivers/gpu/drm/xe/xe_force_wake.h b/drivers/gpu/drm/ 
>>>> xe/ xe_force_wake.h
>>>> index de720881a300..eb638128952d 100644
>>>> --- a/drivers/gpu/drm/xe/xe_force_wake.h
>>>> +++ b/drivers/gpu/drm/xe/xe_force_wake.h
>>>> @@ -15,8 +15,8 @@ void xe_force_wake_init_gt(struct xe_gt *gt,
>>>>                  struct xe_force_wake *fw);
>>>>   void xe_force_wake_init_engines(struct xe_gt *gt,
>>>>                   struct xe_force_wake *fw);
>>>> -int xe_force_wake_get(struct xe_force_wake *fw,
>>>> -              enum xe_force_wake_domains domains);
>>>> +unsigned int xe_force_wake_get(struct xe_force_wake *fw,
>>>> +                   enum xe_force_wake_domains domains);
>>>>   int xe_force_wake_put(struct xe_force_wake *fw,
>>>>                 enum xe_force_wake_domains domains);
>>>> diff --git a/drivers/gpu/drm/xe/xe_force_wake_types.h b/drivers/gpu/ 
>>>> drm/xe/xe_force_wake_types.h
>>>> index fde17dc3d01e..899fbbcb3ea9 100644
>>>> --- a/drivers/gpu/drm/xe/xe_force_wake_types.h
>>>> +++ b/drivers/gpu/drm/xe/xe_force_wake_types.h
>>>> @@ -48,7 +48,7 @@ enum xe_force_wake_domains {
>>>>       XE_FW_MEDIA_VEBOX2    = BIT(XE_FW_DOMAIN_ID_MEDIA_VEBOX2),
>>>>       XE_FW_MEDIA_VEBOX3    = BIT(XE_FW_DOMAIN_ID_MEDIA_VEBOX3),
>>>>       XE_FW_GSC        = BIT(XE_FW_DOMAIN_ID_GSC),
>>>> -    XE_FORCEWAKE_ALL    = BIT(XE_FW_DOMAIN_ID_COUNT) - 1
>>>> +    XE_FORCEWAKE_ALL    = BIT(XE_FW_DOMAIN_ID_COUNT)
>>>>   };
>>>>   /**
>>>
>>
> 



More information about the Intel-xe mailing list