[PATCH v6 04/25] drm/xe: Modify xe_force_wake_put to handle _get returned mask
Matt Roper
matthew.d.roper at intel.com
Mon Sep 30 22:13:52 UTC 2024
On Mon, Sep 30, 2024 at 11:01:28AM +0530, Himal Prasad Ghimiray wrote:
> Instead of calling xe_force_wake_put on all domains that were input to
> xe_force_wake_get, call _put only on the domains whose reference counts
> were successfully incremented by the _get call. Since the return value
> of _get can be a mask that does not match any specific value in the enum
> xe_force_wake_domains, change the input parameter of _put to unsigned int.
>
> v3
> - Move WARN to this patch (Badal)
> - use xe_gt_WARN instead of XE_WARN (Michal)
> - Stop using xe_force_wake_domains for non enum values.
> - Remove kernel-doc from this patch (Badal)
>
> -v5
> - Fix global awake_domain
>
> -v6
> - put all initialized domains in case of FORCEWAKE_ALL.
> - Modify ret variable name (Michal)
> - Modify input var name (Michal)
> - Modify commit message and warn (Badal)
>
> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Badal Nilawar <badal.nilawar at intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> Cc: Nirmoy Das <nirmoy.das at intel.com>
> Reviewed-by: Badal Nilawar <badal.nilawar at intel.com>
> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> ---
> drivers/gpu/drm/xe/xe_force_wake.c | 28 +++++++++++++++++++++-------
> drivers/gpu/drm/xe/xe_force_wake.h | 2 +-
> 2 files changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_force_wake.c b/drivers/gpu/drm/xe/xe_force_wake.c
> index 7f358e42c5d4..372ea43b0d06 100644
> --- a/drivers/gpu/drm/xe/xe_force_wake.c
> +++ b/drivers/gpu/drm/xe/xe_force_wake.c
> @@ -211,26 +211,40 @@ unsigned int xe_force_wake_get(struct xe_force_wake *fw,
> }
>
>  int xe_force_wake_put(struct xe_force_wake *fw,
> -		      enum xe_force_wake_domains domains)
> +		      unsigned int fw_ref)
>  {
>  	struct xe_gt *gt = fw->gt;
>  	struct xe_force_wake_domain *domain;
> -	enum xe_force_wake_domains tmp, sleep = 0;
> +	unsigned int tmp, sleep = 0;
>  	unsigned long flags;
> -	int ret = 0;
> +	int ack_fail = 0;
> +
> +	/*
> +	 * Avoid unnecessary lock and unlock when the function is called
> +	 * in error path of individual domains.
> +	 */
> +	if (!fw_ref)
> +		return 0;
> +
> +	if (fw_ref == XE_FORCEWAKE_ALL)
> +		fw_ref = fw->initialized_domains;
>
>  	spin_lock_irqsave(&fw->lock, flags);
> -	for_each_fw_domain_masked(domain, domains, fw, tmp) {
> +	for_each_fw_domain_masked(domain, fw_ref, fw, tmp) {
>  		if (!--domain->ref) {
>  			sleep |= BIT(domain->id);
>  			domain_sleep(gt, domain);
>  		}
>  	}
>  	for_each_fw_domain_masked(domain, sleep, fw, tmp) {
> -		ret |= domain_sleep_wait(gt, domain);
> +		if (domain_sleep_wait(gt, domain) == 0)

One of the long-standing bugs with Xe's forcewake implementation is that
we shouldn't be waiting in the 'put' function at all. The idea is that
the driver is supposed to just submit a request to sleep and then move
on; the hardware will actually go to sleep asynchronously and it's
perfectly expected for that to not happen immediately when we request
it. Waiting here adds an unnecessary delay and slows down the whole
system. We probably/hopefully don't wake/sleep forcewake often enough
in the Xe driver for this to cause a major user-noticeable performance
impact, but the current synchronous sleep is definitely not the intended
design or what we want going forward.

What we actually need to do is drop the wait here and add a check+wait
at the beginning of the 'get' function to ensure that any previously
submitted sleeps have actually completed before we start the next wake.
Usually they will have already completed while the driver was doing
other work, so there will be no extra artificial delays added as we have
today.

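To make the ordering concrete, here is a rough userspace sketch of the
"request sleep in put, check+wait in get" flow. It is not driver code:
every mock_* name, the tick-based fake hardware, and the two-tick sleep
latency are assumptions made up for illustration, and locking and ack
timeouts are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_NUM_DOMAINS	3
#define MOCK_BIT(n)		(1u << (n))
#define MOCK_SLEEP_LATENCY	2	/* assumed: ticks the fake HW needs to fall asleep */

/* Hypothetical stand-in for one forcewake domain. */
struct mock_domain {
	unsigned int ref;	/* software reference count */
	bool awake;		/* state the fake hardware reports */
	int ticks_to_sleep;	/* > 0 while a sleep request is in flight */
};

struct mock_fw {
	struct mock_domain domains[MOCK_NUM_DOMAINS];
	unsigned int sleep_pending;	/* sleep requested, completion not yet checked */
	unsigned int wait_polls;	/* polls spent waiting for sleeps to land */
};

/* Advance the fake hardware one step; stands in for time passing. */
static void mock_hw_tick(struct mock_fw *fw)
{
	for (int i = 0; i < MOCK_NUM_DOMAINS; i++) {
		struct mock_domain *d = &fw->domains[i];

		if (d->ticks_to_sleep > 0 && --d->ticks_to_sleep == 0)
			d->awake = false;
	}
}

/* Check+wait: make sure an earlier sleep request has actually landed. */
static void mock_flush_pending_sleep(struct mock_fw *fw, int id)
{
	struct mock_domain *d = &fw->domains[id];

	if (!(fw->sleep_pending & MOCK_BIT(id)))
		return;

	while (d->awake) {
		mock_hw_tick(fw);
		fw->wait_polls++;
	}
	fw->sleep_pending &= ~MOCK_BIT(id);
}

/* 'get': flush any outstanding sleep first, then wake the domain. */
static unsigned int mock_fw_get(struct mock_fw *fw, unsigned int mask)
{
	for (int i = 0; i < MOCK_NUM_DOMAINS; i++) {
		struct mock_domain *d = &fw->domains[i];

		if (!(mask & MOCK_BIT(i)))
			continue;
		if (d->ref++ == 0) {
			mock_flush_pending_sleep(fw, i);
			d->awake = true;
		}
	}
	return mask;
}

/* 'put': only submit the sleep request and return; no waiting here. */
static void mock_fw_put(struct mock_fw *fw, unsigned int fw_ref)
{
	for (int i = 0; i < MOCK_NUM_DOMAINS; i++) {
		struct mock_domain *d = &fw->domains[i];

		if (!(fw_ref & MOCK_BIT(i)))
			continue;
		assert(d->ref > 0);
		if (--d->ref == 0) {
			d->ticks_to_sleep = MOCK_SLEEP_LATENCY;
			fw->sleep_pending |= MOCK_BIT(i);
		}
	}
}
```

In this model, a get that follows a put after the fake hardware has
had time to tick finds the sleep already complete and spends no polls
waiting; only a get issued immediately after the put pays the latency.
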
Matt
> +			fw->awake_domains &= ~BIT(domain->id);
> +		else
> +			ack_fail |= BIT(domain->id);
>  	}
> -	fw->awake_domains &= ~sleep;
>  	spin_unlock_irqrestore(&fw->lock, flags);
>
> -	return ret;
> +	xe_gt_WARN(gt, ack_fail, "Forcewake domain%s %#x failed to acknowledge sleep request\n",
> +		   str_plural(hweight_long(ack_fail)), ack_fail);
> +	return ack_fail;
>  }
> diff --git a/drivers/gpu/drm/xe/xe_force_wake.h b/drivers/gpu/drm/xe/xe_force_wake.h
> index eb638128952d..b5a75544d86a 100644
> --- a/drivers/gpu/drm/xe/xe_force_wake.h
> +++ b/drivers/gpu/drm/xe/xe_force_wake.h
> @@ -18,7 +18,7 @@ void xe_force_wake_init_engines(struct xe_gt *gt,
>  unsigned int xe_force_wake_get(struct xe_force_wake *fw,
>  				enum xe_force_wake_domains domains);
>  int xe_force_wake_put(struct xe_force_wake *fw,
> -		      enum xe_force_wake_domains domains);
> +		      unsigned int fw_ref);
>
> static inline int
> xe_force_wake_ref(struct xe_force_wake *fw,
> --
> 2.34.1
>
--
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation