[PATCH v2 01/23] drm/xe: Error handling in xe_force_wake_get()

Michal Wajdeczko michal.wajdeczko at intel.com
Thu Sep 12 21:31:25 UTC 2024



On 12.09.2024 21:15, Himal Prasad Ghimiray wrote:
> If an acknowledgment timeout occurs for a domain awake request, do not
> increment the reference count for the domain. This ensures that
> subsequent _get calls do not incorrectly assume the domain is awake. The
> return value is a mask of domains whose reference counts were
> incremented, and these domains need to be released using
> xe_force_wake_put.
> 
> The caller needs to compare the return value with the input domains to
> determine the success or failure of the operation and decide whether to
> continue or return accordingly.
> 
> While at it, add simple kernel-doc for xe_force_wake_get()
> 
> Cc: Badal Nilawar <badal.nilawar at intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> Cc: Nirmoy Das <nirmoy.das at intel.com>
> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_force_wake.c | 35 +++++++++++++++++++++++++-----
>  1 file changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_force_wake.c b/drivers/gpu/drm/xe/xe_force_wake.c
> index a64c14757c84..fa42d652d23f 100644
> --- a/drivers/gpu/drm/xe/xe_force_wake.c
> +++ b/drivers/gpu/drm/xe/xe_force_wake.c
> @@ -150,26 +150,49 @@ static int domain_sleep_wait(struct xe_gt *gt,
>  					 (ffs(tmp__) - 1))) && \
>  					 domain__->reg_ctl.addr)
>  
> +/**
> + * xe_force_wake_get : Increase the domain refcount; if it was 0 initially, wake the domain

while the kernel-doc tool likely still recognizes this, it is not the
correct notation for function documentation, which expects the
"function_name() - short description" form [1]

[1] https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation
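
for reference, a minimal sketch of the notation described in [1]: the
function name is followed by "()" and separated from the short
description by " - ". The stand-in declarations and the stub body are
assumptions added only so the fragment compiles on its own; they are not
the driver's definitions, and the description wording is just a suggestion.

```c
/*
 * Stand-in declarations so the fragment compiles on its own
 * (assumptions, not the driver's real definitions).
 */
struct xe_force_wake;
enum xe_force_wake_domains { XE_FW_GT = 1 << 0 };

/**
 * xe_force_wake_get() - Take a reference on the given forcewake domains
 * @fw: struct xe_force_wake
 * @domains: forcewake domains to get refcount on
 *
 * Return: mask of domains whose refcount was incremented.
 */
int xe_force_wake_get(struct xe_force_wake *fw,
		      enum xe_force_wake_domains domains)
{
	(void)fw;		/* stub: the real body lives in xe_force_wake.c */
	return (int)domains;	/* stub: pretend every request was acked */
}
```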

> + * @fw: struct xe_force_wake
> + * @domains: forcewake domains to get refcount on
> + *
> + * Increment refcount for the force-wake domain. If the domain is
> + * asleep, awaken it and wait for acknowledgment within the specified
> + * timeout. If a timeout occurs, decrement the refcount.

not sure the doc should mirror low-level implementation details 1:1

> + * The caller should compare the return value with the @domains to
> + * determine the success or failure of the operation.
> + *
> + * Return: mask of refcount increased domains. 

if we return a 'mask' then maybe it should be of 'unsigned int' type?

> If the return value is
> + * equal to the input parameter @domains, the operation is considered
> + * successful. Otherwise, the operation is considered a failure, and
> + * the caller should handle the failure case, potentially returning
> + * -ETIMEDOUT.

it looks like all the problems with getting a nice API are due to
XE_FORCEWAKE_ALL, which is not a single domain ID and requires extra care

maybe there should be different pairs of functions:

// for single domain where ret=0 is success, ret<0 is error
int xe_force_wake_get(fw, enum xe_force_wake_domain_id id);
void xe_force_wake_put(fw, enum xe_force_wake_domain_id id);

and

// for all domains where ret=0 is success, ret<0 is error
int xe_force_wake_get_all(fw);
void xe_force_wake_put_all(fw);

and

// input: mask of domains, return: mask of domains
unsigned int xe_force_wake_get_mask(fw, mask);
void xe_force_wake_put_mask(fw, mask);

this last one can be just the main implementation (static, or public if
we really want to keep supporting arbitrary sets of enabled domains)
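
a rough user-space sketch of how the three pairs could layer on top of
the mask variant. Everything here is hypothetical: the fw_* names, the
fw_mock struct and mock_wake_and_wait() are stand-ins for the driver's
types and for domain_wake()/domain_wake_wait(), and dropping the already
taken references when fw_get_all() fails is an assumption, not something
decided in this thread. Locking and the two-pass wake/wait loop of the
patch are left out for brevity.

```c
#include <errno.h>

/* Hypothetical stand-ins for the driver's types; not the real xe definitions. */
enum fw_domain_id { FW_ID_GT, FW_ID_RENDER, FW_ID_MEDIA, FW_ID_COUNT };

#define FW_BIT(id)	(1u << (id))
#define FW_ALL_MASK	(FW_BIT(FW_ID_COUNT) - 1)

struct fw_mock {
	unsigned int ref[FW_ID_COUNT];	/* per-domain refcount */
	unsigned int broken;		/* domains whose wake ack "times out" */
};

/* Stand-in for domain_wake() + domain_wake_wait(): 0 on ack, -1 on timeout. */
static int mock_wake_and_wait(struct fw_mock *fw, enum fw_domain_id id)
{
	return (fw->broken & FW_BIT(id)) ? -1 : 0;
}

/* Main implementation: mask in, mask of domains that now hold a reference out. */
static unsigned int fw_get_mask(struct fw_mock *fw, unsigned int mask)
{
	unsigned int got = 0;
	int id;

	for (id = 0; id < FW_ID_COUNT; id++) {
		if (!(mask & FW_BIT(id)))
			continue;
		if (fw->ref[id] == 0 &&
		    mock_wake_and_wait(fw, (enum fw_domain_id)id))
			continue;	/* no ack: leave the refcount untouched */
		fw->ref[id]++;
		got |= FW_BIT(id);
	}
	return got;
}

static void fw_put_mask(struct fw_mock *fw, unsigned int mask)
{
	int id;

	for (id = 0; id < FW_ID_COUNT; id++)
		if ((mask & FW_BIT(id)) && fw->ref[id])
			fw->ref[id]--;
}

/* Single domain: 0 on success, negative errno on failure. */
static int fw_get(struct fw_mock *fw, enum fw_domain_id id)
{
	return fw_get_mask(fw, FW_BIT(id)) ? 0 : -ETIMEDOUT;
}

static void fw_put(struct fw_mock *fw, enum fw_domain_id id)
{
	fw_put_mask(fw, FW_BIT(id));
}

/* All domains: 0 on success; on partial failure drop what was taken. */
static int fw_get_all(struct fw_mock *fw)
{
	unsigned int got = fw_get_mask(fw, FW_ALL_MASK);

	if (got == FW_ALL_MASK)
		return 0;
	fw_put_mask(fw, got);
	return -ETIMEDOUT;
}

static void fw_put_all(struct fw_mock *fw)
{
	fw_put_mask(fw, FW_ALL_MASK);
}
```

with this split, callers of the single-domain and all-domain variants
only check for ret < 0, and only users of the mask variant have to
compare the returned mask against the requested one.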

> + */
>  int xe_force_wake_get(struct xe_force_wake *fw,
>  		      enum xe_force_wake_domains domains)
>  {
>  	struct xe_gt *gt = fw->gt;
>  	struct xe_force_wake_domain *domain;
> -	enum xe_force_wake_domains tmp, woken = 0;
> +	enum xe_force_wake_domains tmp, awake_rqst = 0, awake_ack = 0;

it looks like you're abusing the enum variables even more by treating
them as plain integers

>  	unsigned long flags;
> -	int ret = 0;
> +	int ret = domains;
>  
>  	spin_lock_irqsave(&fw->lock, flags);
>  	for_each_fw_domain_masked(domain, domains, fw, tmp) {
>  		if (!domain->ref++) {
> -			woken |= BIT(domain->id);
> +			awake_rqst |= BIT(domain->id);
>  			domain_wake(gt, domain);
>  		}
>  	}
> -	for_each_fw_domain_masked(domain, woken, fw, tmp) {
> -		ret |= domain_wake_wait(gt, domain);
> +	for_each_fw_domain_masked(domain, awake_rqst, fw, tmp) {
> +		if (domain_wake_wait(gt, domain) == 0) {
> +			awake_ack |= BIT(domain->id);
> +		} else {
> +			ret &= ~BIT(domain->id);
> +			--domain->ref;
> +		}
>  	}
> -	fw->awake_domains |= woken;
> +
> +	fw->awake_domains |= awake_ack;
>  	spin_unlock_irqrestore(&fw->lock, flags);
>  
>  	return ret;

