[Intel-gfx] [PATCH 1/2] drm/i915/gsc: flush the GSC worker before wedging on unload
Teres Alexis, Alan Previn
alan.previn.teres.alexis at intel.com
Tue Mar 7 22:55:29 UTC 2023
On Thu, 2023-02-23 at 09:21 -0800, Ceraolo Spurio, Daniele wrote:
> If we unload the driver and wedge before the GSC worker is complete,
> the worker will hit an error on its submission to the GSC engine and
> then exit. This is hard to hit for a user, but it is reproducible
> with skipping selftests. The error is handled gracefully by the
> worker, so there are no functional issues, but we still end up with
> an error message in dmesg, which is something we want to avoid as
> this is a supported scenario. We could modify the worker to better
> handle a wedging occurring during its execution, but that gets
> complicated for a couple of reasons:
> - We do want the error on runtime wedging, because there are
> implications for subsystems outside of GT (i.e., PXP, HDCP), it's
> only the error on driver unload that we want to silence.
> - The worker is responsible for multiple submissions (GSC FW load,
> HuC auth, SW proxy), so all of those will have to be adapted to
> handle the wedged_on_fini scenario.
> Therefore, it's much simpler to just wait for the worker to be done
> before wedging on driver removal, also considering that the worker
> will likely already be idle in the great majority of non-selftest
> scenarios.
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>
alan:snip
>
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -784,6 +784,29 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
> intel_rps_driver_unregister(>->rps);
> intel_gsc_fini(>->gsc);
>
> + /*
> + * If we unload the driver and wedge before the GSC worker is complete,
alan:snip
> + * scenarios.
> + */
> + intel_gsc_uc_flush_work(>->uc.gsc);
alan: nit: from a code readiblity, having intel_gsc_fini before intel_gsc_uc_flush_work almost reads
as if code should have been inverted eventhough we know that the former is for the old mei-comp based
framework and the latter is based on the new gsc-cs based framework. i cant think of how else to resolve
this other unintrusively other than adding a comment. Other than that, also considering we've had offline
testing already verify this and the next patch too, LGTM:
Reviewed-by: Alan Previn <alan.previn.teres.alexis at intel.com>
More information about the Intel-gfx
mailing list