[PATCH v6 04/25] drm/xe: Modify xe_force_wake_put to handle _get returned mask
Ghimiray, Himal Prasad
himal.prasad.ghimiray at intel.com
Tue Oct 1 05:31:03 UTC 2024
On 01-10-2024 03:43, Matt Roper wrote:
> On Mon, Sep 30, 2024 at 11:01:28AM +0530, Himal Prasad Ghimiray wrote:
>> Instead of calling xe_force_wake_put on all domains that were input to
>> xe_force_wake_get, call _put only on the domains whose reference counts
>> were successfully incremented by the _get call. Since the return value
>> of _get can be a mask that does not match any specific value in the enum
>> xe_force_wake_domains, change the input parameter of _put to unsigned int.
>>
>> v3
>> - Move WARN to this patch (Badal)
>> - use xe_gt_WARN instead of XE_WARN (Michal)
>> - Stop using xe_force_wake_domains for non enum values.
>> - Remove kernel-doc from this patch (Badal)
>>
>> -v5
>> - Fix global awake_domain
>>
>> -v6
>> - put all initialized domains in case of FORCEWAKE_ALL.
>> - Modify ret variable name (Michal)
>> - Modify input var name (Michal)
>> - Modify commit message and warn (Badal)
>>
>> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
>> Cc: Badal Nilawar <badal.nilawar at intel.com>
>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>> Cc: Nirmoy Das <nirmoy.das at intel.com>
>> Reviewed-by: Badal Nilawar <badal.nilawar at intel.com>
>> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_force_wake.c | 28 +++++++++++++++++++++-------
>> drivers/gpu/drm/xe/xe_force_wake.h | 2 +-
>> 2 files changed, 22 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_force_wake.c b/drivers/gpu/drm/xe/xe_force_wake.c
>> index 7f358e42c5d4..372ea43b0d06 100644
>> --- a/drivers/gpu/drm/xe/xe_force_wake.c
>> +++ b/drivers/gpu/drm/xe/xe_force_wake.c
>> @@ -211,26 +211,40 @@ unsigned int xe_force_wake_get(struct xe_force_wake *fw,
>> }
>>
>> int xe_force_wake_put(struct xe_force_wake *fw,
>> - enum xe_force_wake_domains domains)
>> + unsigned int fw_ref)
>> {
>> struct xe_gt *gt = fw->gt;
>> struct xe_force_wake_domain *domain;
>> - enum xe_force_wake_domains tmp, sleep = 0;
>> + unsigned int tmp, sleep = 0;
>> unsigned long flags;
>> - int ret = 0;
>> + int ack_fail = 0;
>> +
>> + /*
>> + * Avoid unnecessary lock and unlock when the function is called
>> + * in error path of individual domains.
>> + */
>> + if (!fw_ref)
>> + return 0;
>> +
>> + if (fw_ref == XE_FORCEWAKE_ALL)
>> + fw_ref = fw->initialized_domains;
>>
>> spin_lock_irqsave(&fw->lock, flags);
>> - for_each_fw_domain_masked(domain, domains, fw, tmp) {
>> + for_each_fw_domain_masked(domain, fw_ref, fw, tmp) {
>> if (!--domain->ref) {
>> sleep |= BIT(domain->id);
>> domain_sleep(gt, domain);
>> }
>> }
>> for_each_fw_domain_masked(domain, sleep, fw, tmp) {
>> - ret |= domain_sleep_wait(gt, domain);
>> + if (domain_sleep_wait(gt, domain) == 0)
>
> One of the long-standing bugs with Xe's forcewake implementation is that
> we shouldn't be waiting in the 'put' function at all. The idea is that
> the driver is supposed to just submit a request to sleep and then move
> on; the hardware will actually go to sleep asynchronously and it's
> perfectly expected for that to not happen immediately when we request
> it. Waiting here adds an unnecessary delay and slows down the whole
> system. We probably/hopefully don't wake/sleep forcewake often enough
> in the Xe driver for this to cause a major user-noticeable performance
> impact, but the current synchronous sleep is definitely not the intended
> design or what we want going forward.
>
> What we actually need to do is drop the wait here and add a check+wait
> at the beginning of the 'get' function to ensure that any previously
> submitted sleeps have actually completed before we start the next wake.
> Usually they will have already completed while the driver was doing
> other work, so there will be no extra artificial delays added as we have
> today.
No objections from my side. The only downside I see is that, with this
approach, we no longer know at a definite point in time whether a sleep
request was acknowledged.
I'd prefer to keep that rework out of the current changes; we can
address it in follow-up patches.
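
For illustration only, here is a rough standalone model of the deferred
wait described above. It is a sketch under assumptions, not driver code:
every model_* name, the hw_stuck test knob and the choice to clear the
awake bit at put time are hypothetical, and the hardware ack is replaced
by a stub. It shows put() returning right after requesting sleep, and
get() settling any pending sleep before it takes a reference and reports
that domain in the returned mask.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NUM_DOMAINS 4u	/* hypothetical domain count */

/* Hypothetical userspace stand-in for the forcewake bookkeeping. */
struct model_fw {
	unsigned int initialized;	/* domains that exist */
	unsigned int awake;		/* domains currently held awake */
	unsigned int sleep_pending;	/* sleep requested, ack not yet checked */
	unsigned int hw_stuck;		/* test knob: domains that never ack */
	unsigned int ref[MODEL_NUM_DOMAINS];
};

/* Stub for polling the hardware ack; true means the ack was seen. */
static bool model_hw_ack(const struct model_fw *fw, unsigned int id)
{
	return !(fw->hw_stuck & (1u << id));
}

/* Returns the mask of domains whose reference count was incremented. */
static unsigned int model_fw_get(struct model_fw *fw, unsigned int domains)
{
	unsigned int fw_ref = 0;
	unsigned int id;

	for (id = 0; id < MODEL_NUM_DOMAINS; id++) {
		unsigned int bit = 1u << id;

		if (!(domains & fw->initialized & bit))
			continue;

		/* Deferred check: settle an earlier sleep request first. */
		if (fw->sleep_pending & bit) {
			if (!model_hw_ack(fw, id))
				continue;	/* earlier sleep never completed */
			fw->sleep_pending &= ~bit;
		}

		/* First reference wakes the domain; undo it if the wake fails. */
		if (fw->ref[id]++ == 0) {
			if (!model_hw_ack(fw, id)) {
				fw->ref[id]--;
				continue;
			}
			fw->awake |= bit;
		}
		fw_ref |= bit;
	}
	return fw_ref;
}

/* Drops only the references named in fw_ref; never waits for the ack. */
static void model_fw_put(struct model_fw *fw, unsigned int fw_ref)
{
	unsigned int id;

	if (!fw_ref)
		return;

	for (id = 0; id < MODEL_NUM_DOMAINS; id++) {
		unsigned int bit = 1u << id;

		if (!(fw_ref & bit))
			continue;

		assert(fw->ref[id] > 0);
		if (--fw->ref[id] == 0) {
			fw->awake &= ~bit;
			fw->sleep_pending |= bit;	/* request sleep, move on */
		}
	}
}
```

In this model a sleep ack failure only becomes visible on the next get
of that domain, which is the downside mentioned above.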
BR
Himal
>
>
> Matt
>
>> + fw->awake_domains &= ~BIT(domain->id);
>> + else
>> + ack_fail |= BIT(domain->id);
>> }
>> - fw->awake_domains &= ~sleep;
>> spin_unlock_irqrestore(&fw->lock, flags);
>>
>> - return ret;
>> + xe_gt_WARN(gt, ack_fail, "Forcewake domain%s %#x failed to acknowledge sleep request\n",
>> + str_plural(hweight_long(ack_fail)), ack_fail);
>> + return ack_fail;
>> }
>> diff --git a/drivers/gpu/drm/xe/xe_force_wake.h b/drivers/gpu/drm/xe/xe_force_wake.h
>> index eb638128952d..b5a75544d86a 100644
>> --- a/drivers/gpu/drm/xe/xe_force_wake.h
>> +++ b/drivers/gpu/drm/xe/xe_force_wake.h
>> @@ -18,7 +18,7 @@ void xe_force_wake_init_engines(struct xe_gt *gt,
>> unsigned int xe_force_wake_get(struct xe_force_wake *fw,
>> enum xe_force_wake_domains domains);
>> int xe_force_wake_put(struct xe_force_wake *fw,
>> - enum xe_force_wake_domains domains);
>> + unsigned int fw_ref);
>>
>> static inline int
>> xe_force_wake_ref(struct xe_force_wake *fw,
>> --
>> 2.34.1
>>
>