[Intel-gfx] [PATCH] i915/guc: Run busyness worker only if gt is awake

Daniele Ceraolo Spurio daniele.ceraolospurio at intel.com
Thu Sep 14 17:23:37 UTC 2023


On 9/11/2023 5:52 PM, Umesh Nerlige Ramappa wrote:
> The worker is canceled in the __gt_park path, but we still see it
> running sometimes during suspend.
>
> Only update stats if gt is awake. If not, intel_guc_busyness_park would
> have already updated the stats. Note that we do not requeue the worker
> if gt is not awake since intel_guc_busyness_unpark would do that at some
> point.
>
> If the gt was parked longer than time taken for GT timestamp to roll
> over, we ignore those rollovers since we don't care about tracking the
> exact GT time. We only care about roll overs when the gt is active and
> running workloads.
>
> v2 (Daniele)
> - Edit commit message and code comment
> - Use runtime pm in the worker
> - Put runtime pm after enabling the worker
> - Use Link tag and add Fixes tag
>
> Link: https://gitlab.freedesktop.org/drm/intel/-/issues/7077
> Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
> ---
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 26 ++++++++++++++++---
>   1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index e250bedf90fb..d37b255559a0 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -1461,6 +1461,24 @@ static void guc_timestamp_ping(struct work_struct *wrk)
>   	unsigned long index;
>   	int srcu, ret;
>   
> +	/*
> +	 * The worker is canceled in the __gt_park path, but we still see it
> +	 * running sometimes during suspend.

This sounds very vague and more like there is a bug in __gt_park. We 
actually do know that the issue on the park side is that the worker 
re-queues itself if it's running while we cancel it; that's not really a 
problem as long as we don't wake the device after it's gone to sleep 
(which is what your change below does), so this sentence should be 
reworded or dropped.

> +	 *
> +	 * Only update stats if gt is awake. If not, intel_guc_busyness_park
> +	 * would have already updated the stats. Note that we do not requeue the
> +	 * worker in this case since intel_guc_busyness_unpark would do that at
> +	 * some point.
> +	 *
> +	 * If the gt was parked longer than time taken for GT timestamp to roll
> +	 * over, we ignore those rollovers since we don't care about tracking
> +	 * the exact GT time. We only care about roll overs when the gt is
> +	 * active and running workloads.
> +	 */
> +	wakeref = intel_runtime_pm_get_if_active(&gt->i915->runtime_pm);

The patch title and the comment refer to GT being awake, but here you're 
taking a global rpm ref and not a gt_pm ref. I understand that taking a 
gt_pm ref in here can be complicated because of the canceling of the 
worker in the gt_park flow and that taking an rpm ref does work for what 
we need, but the title/comments need to reflect and explain that.

Daniele

> +	if (!wakeref)
> +		return;
> +
>   	/*
>   	 * Synchronize with gt reset to make sure the worker does not
>   	 * corrupt the engine/guc stats. NB: can't actually block waiting
> @@ -1469,10 +1487,9 @@ static void guc_timestamp_ping(struct work_struct *wrk)
>   	 */
>   	ret = intel_gt_reset_trylock(gt, &srcu);
>   	if (ret)
> -		return;
> +		goto err_trylock;
>   
> -	with_intel_runtime_pm(&gt->i915->runtime_pm, wakeref)
> -		__update_guc_busyness_stats(guc);
> +	__update_guc_busyness_stats(guc);
>   
>   	/* adjust context stats for overflow */
>   	xa_for_each(&guc->context_lookup, index, ce)
> @@ -1481,6 +1498,9 @@ static void guc_timestamp_ping(struct work_struct *wrk)
>   	intel_gt_reset_unlock(gt, srcu);
>   
>   	guc_enable_busyness_worker(guc);
> +
> +err_trylock:
> +	intel_runtime_pm_put(&gt->i915->runtime_pm, wakeref);
>   }
>   
>   static int guc_action_enable_usage_stats(struct intel_guc *guc)



More information about the Intel-gfx mailing list