[PATCH v6 04/25] drm/xe: Modify xe_force_wake_put to handle _get returned mask

Ghimiray, Himal Prasad himal.prasad.ghimiray at intel.com
Tue Oct 1 05:31:03 UTC 2024



On 01-10-2024 03:43, Matt Roper wrote:
> On Mon, Sep 30, 2024 at 11:01:28AM +0530, Himal Prasad Ghimiray wrote:
>> Instead of calling xe_force_wake_put on all domains that were input to
>> xe_force_wake_get, call _put only on the domains whose reference counts
>> were successfully incremented by the _get call. Since the return value
>> of _get can be a mask that does not match any specific value in the enum
>> xe_force_wake_domains, change the input parameter of _put to unsigned int.
>>
>> v3
>> - Move WARN to this patch (Badal)
>> - use xe_gt_WARN instead of XE_WARN (Michal)
>> - Stop using xe_force_wake_domains for non enum values.
>> - Remove kernel-doc from this patch (Badal)
>>
>> -v5
>> - Fix global awake_domain
>>
>> -v6
>> - put all initialized domains in case of FORCEWAKE_ALL.
>> - Modify ret variable name (Michal)
>> - Modify input var name (Michal)
>> - Modify commit message and warn (Badal)
>>
>> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
>> Cc: Badal Nilawar <badal.nilawar at intel.com>
>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>> Cc: Nirmoy Das <nirmoy.das at intel.com>
>> Reviewed-by: Badal Nilawar <badal.nilawar at intel.com>
>> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_force_wake.c | 28 +++++++++++++++++++++-------
>>   drivers/gpu/drm/xe/xe_force_wake.h |  2 +-
>>   2 files changed, 22 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_force_wake.c b/drivers/gpu/drm/xe/xe_force_wake.c
>> index 7f358e42c5d4..372ea43b0d06 100644
>> --- a/drivers/gpu/drm/xe/xe_force_wake.c
>> +++ b/drivers/gpu/drm/xe/xe_force_wake.c
>> @@ -211,26 +211,40 @@ unsigned int xe_force_wake_get(struct xe_force_wake *fw,
>>   }
>>   
>>   int xe_force_wake_put(struct xe_force_wake *fw,
>> -		      enum xe_force_wake_domains domains)
>> +		      unsigned int fw_ref)
>>   {
>>   	struct xe_gt *gt = fw->gt;
>>   	struct xe_force_wake_domain *domain;
>> -	enum xe_force_wake_domains tmp, sleep = 0;
>> +	unsigned int tmp, sleep = 0;
>>   	unsigned long flags;
>> -	int ret = 0;
>> +	int ack_fail = 0;
>> +
>> +	/*
>> +	 * Avoid unnecessary lock and unlock when the function is called
>> +	 * in error path of individual domains.
>> +	 */
>> +	if (!fw_ref)
>> +		return 0;
>> +
>> +	if (fw_ref == XE_FORCEWAKE_ALL)
>> +		fw_ref = fw->initialized_domains;
>>   
>>   	spin_lock_irqsave(&fw->lock, flags);
>> -	for_each_fw_domain_masked(domain, domains, fw, tmp) {
>> +	for_each_fw_domain_masked(domain, fw_ref, fw, tmp) {
>>   		if (!--domain->ref) {
>>   			sleep |= BIT(domain->id);
>>   			domain_sleep(gt, domain);
>>   		}
>>   	}
>>   	for_each_fw_domain_masked(domain, sleep, fw, tmp) {
>> -		ret |= domain_sleep_wait(gt, domain);
>> +		if (domain_sleep_wait(gt, domain) == 0)
> 
> One of the long-standing bugs with Xe's forcewake implementation is that
> we shouldn't be waiting in the 'put' function at all.  The idea is that
> the driver is supposed to just submit a request to sleep and then move
> on; the hardware will actually go to sleep asynchronously and it's
> perfectly expected for that to not happen immediately when we request
> it.  Waiting here adds an unnecessary delay and slows down the whole
> system.  We probably/hopefully don't wake/sleep forcewake often enough
> in the Xe driver for this to cause a major user-noticeable performance
> impact, but the current synchronous sleep is definitely not the intended
> design or what we want going forward.
> 
> What we actually need to do is drop the wait here and add a check+wait
> at the beginning of the 'get' function to ensure that any previously
> submitted sleeps have actually completed before we start the next wake.
> Usually they will have already completed while the driver was doing
> other work, so there will be no extra artificial delays added as we have
> today.

I have no objections to that. The only downside I see is that, without the
wait in _put, we no longer have a definite point at which we know the sleep
request has been acknowledged.

I'd prefer to keep the current changes separate from that rework; we can
address it in future patches.
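
For reference, here is a rough userspace model of the flow Matt describes,
just to make the ordering concrete. It is only a sketch, not driver code:
every name in it (model_fw, model_fw_get, model_fw_put, model_hw_tick,
SLEEP_LATENCY, pending_sleep) is hypothetical, the hardware is simulated
with a counter, and there is no locking. _put only records the sleep
request, and _get flushes any still-pending sleep before waking the domain.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_DOMAINS	4
#define BIT(n)		(1u << (n))
#define SLEEP_LATENCY	3	/* assumed: ticks the fake HW needs to ack a sleep */

/* Hypothetical stand-ins; these are not the Xe structures. */
struct model_domain {
	unsigned int ref;	/* software reference count */
	bool hw_awake;		/* simulated hardware state */
	int sleep_latency;	/* ticks left until a requested sleep is acked */
};

struct model_fw {
	struct model_domain dom[NUM_DOMAINS];
	unsigned int pending_sleep;	/* sleeps requested, not yet confirmed */
	unsigned int polls;		/* ticks spent waiting inside _get */
};

/* Simulated passage of time: pending sleeps make progress. */
static void model_hw_tick(struct model_fw *fw)
{
	for (int i = 0; i < NUM_DOMAINS; i++)
		if ((fw->pending_sleep & BIT(i)) && fw->dom[i].sleep_latency > 0)
			fw->dom[i].sleep_latency--;
}

/* Wait for an earlier sleep request on domain i to complete. */
static void model_flush_sleep(struct model_fw *fw, int i)
{
	while (fw->dom[i].sleep_latency > 0) {
		model_hw_tick(fw);
		fw->polls++;
	}
	fw->dom[i].hw_awake = false;
	fw->pending_sleep &= ~BIT(i);
}

/* Returns the mask of domains whose refcount was incremented. */
static unsigned int model_fw_get(struct model_fw *fw, unsigned int domains)
{
	unsigned int fw_ref = 0;

	for (int i = 0; i < NUM_DOMAINS; i++) {
		if (!(domains & BIT(i)))
			continue;
		/* check+wait moved here from _put */
		if (fw->pending_sleep & BIT(i))
			model_flush_sleep(fw, i);
		if (fw->dom[i].ref++ == 0)
			fw->dom[i].hw_awake = true;
		fw_ref |= BIT(i);
	}
	return fw_ref;
}

/* Takes the mask returned by _get; requests sleep but does not wait. */
static void model_fw_put(struct model_fw *fw, unsigned int fw_ref)
{
	for (int i = 0; i < NUM_DOMAINS; i++) {
		if (!(fw_ref & BIT(i)))
			continue;
		assert(fw->dom[i].ref > 0);	/* catch unbalanced put */
		if (--fw->dom[i].ref == 0) {
			fw->dom[i].sleep_latency = SLEEP_LATENCY;
			fw->pending_sleep |= BIT(i);
		}
	}
}

/* get/put, do "other work" for some ticks, get again; return ticks waited. */
static unsigned int model_demo(int ticks_of_other_work)
{
	struct model_fw fw = { 0 };
	unsigned int fw_ref = model_fw_get(&fw, BIT(0));

	model_fw_put(&fw, fw_ref);
	for (int t = 0; t < ticks_of_other_work; t++)
		model_hw_tick(&fw);
	model_fw_get(&fw, BIT(0));
	return fw.polls;
}
```

In this model a back-to-back put/get pays the full latency inside _get,
while a put followed by enough unrelated work pays nothing. It also shows
the downside mentioned above: nothing reports an unacknowledged sleep until
the next _get on that domain.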

BR
Himal
> 
> 
> Matt
> 
>> +			fw->awake_domains &= ~BIT(domain->id);
>> +		else
>> +			ack_fail |= BIT(domain->id);
>>   	}
>> -	fw->awake_domains &= ~sleep;
>>   	spin_unlock_irqrestore(&fw->lock, flags);
>>   
>> -	return ret;
>> +	xe_gt_WARN(gt, ack_fail, "Forcewake domain%s %#x failed to acknowledge sleep request\n",
>> +		   str_plural(hweight_long(ack_fail)), ack_fail);
>> +	return ack_fail;
>>   }
>> diff --git a/drivers/gpu/drm/xe/xe_force_wake.h b/drivers/gpu/drm/xe/xe_force_wake.h
>> index eb638128952d..b5a75544d86a 100644
>> --- a/drivers/gpu/drm/xe/xe_force_wake.h
>> +++ b/drivers/gpu/drm/xe/xe_force_wake.h
>> @@ -18,7 +18,7 @@ void xe_force_wake_init_engines(struct xe_gt *gt,
>>   unsigned int xe_force_wake_get(struct xe_force_wake *fw,
>>   			       enum xe_force_wake_domains domains);
>>   int xe_force_wake_put(struct xe_force_wake *fw,
>> -		      enum xe_force_wake_domains domains);
>> +		      unsigned int fw_ref);
>>   
>>   static inline int
>>   xe_force_wake_ref(struct xe_force_wake *fw,
>> -- 
>> 2.34.1
>>
> 


