[Intel-gfx] [PATCH] drm/i915/guc: Fix detection of GuC submission in use
Janusz Krzysztofik
janusz.krzysztofik at linux.intel.com
Thu Sep 5 12:34:59 UTC 2019
Hi Michał,
On Thursday, September 5, 2019 2:08:12 PM CEST Michal Wajdeczko wrote:
> On Thu, 05 Sep 2019 13:16:31 +0200, Janusz Krzysztofik
> <janusz.krzysztofik at linux.intel.com> wrote:
>
> > The driver always assumes active GuC submission mode if it is
> > supported. That's not true if GuC initialization fails for some
> > reason. That may lead to kernel panics, caused e.g. by execlists
> > fallback submission mode incorrectly detecting GuC submission in use.
> >
> > Fix it by also checking for GuC enabled status.
> >
> > Fixes: 356c484822e6 ("drm/i915/uc: Add explicit DISABLED state for
> > firmware")
> > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/gt/uc/intel_uc.h | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.h
> > b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
> > index 527995c21196..b28bab64a280 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
> > @@ -51,7 +51,8 @@ static inline bool
> > intel_uc_supports_guc_submission(struct intel_uc *uc)
> > static inline bool intel_uc_uses_guc_submission(struct intel_uc *uc)
> > {
> > - return intel_guc_is_submission_supported(&uc->guc);
> > + return intel_guc_is_enabled(&uc->guc) &&
> > + intel_guc_is_submission_supported(&uc->guc);
>
> This won't fix your original problem (which, btw, is not possible to
> repro on drm-tip)
I'm not sure how you force GuC initialization to fail; mine simply didn't have
the new firmware available. On module load, the driver started up in
execlists submission mode and a BUG_ON() was raised from process_csb(). I was
running on a simulator, using the current internal tree, based on current drm-tip.
> as after any GuC initialization failure we still
> treat GuC as "enabled":
My bad, I initially used intel_guc_is_running(), but that interfered badly with
module unload, so I switched to intel_guc_is_enabled() and apparently didn't
re-test whether that still fixed the original issue.
> intel_guc_is_supported => H/W support (static)
> intel_guc_is_enabled => aka not disabled by the user (config)
> intel_guc_is_running => no major fw failure (runtime)
>
> Note that even s/intel_guc_is_enabled/intel_guc_is_running/
> won't help, as GuC may be running but we may still fail to correctly
> initialize GuC submission.
>
> A correct fix to the original problem must be aligned with the new GuC
> submission model (coming soon) and it may look like this:
>
> +static inline bool intel_guc_is_submission_active(struct intel_guc *guc)
> +{
> + GEM_BUG_ON(guc->submission_active && !intel_guc_is_running(guc));
> + return guc->submission_active;
> +}
>
> and then
>
> static inline bool intel_uc_uses_guc_submission(struct intel_uc *uc)
> {
> - return intel_guc_is_submission_supported(&uc->guc);
> + return intel_guc_is_submission_active(&uc->guc);
> }
>
> We may need to revisit all uses/supports/ macros to better
> reflect configuration vs runtime differences.
Definitely, or we may get into trouble like what I experienced on module
unload. And that can be done in advance, I believe.
As long as the unload issue is resolved by not using
intel_uc_uses_guc_submission() where it proved inappropriate, using
(intel_guc_is_running() && intel_guc_is_submission_supported()) seems a valid
fix to me, easy to migrate to intel_guc_is_submission_active() as soon as it is
available. I'll go back to intel_guc_is_running(), fix the module unload
issue and resubmit to trybot; maybe it can uncover more issues with that.
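For illustration, the three candidate checks discussed in this thread can be
compared in a small user-space sketch. This is not driver code: struct
guc_model, its flags and the uses_submission_v1/v2/v3 helpers are hypothetical
stand-ins, and assert() only approximates GEM_BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space stand-in for struct intel_guc. The real driver
 * derives these states from firmware status, not from plain flags.
 */
struct guc_model {
	bool enabled;              /* not disabled by the user (config) */
	bool running;              /* no major fw failure (runtime) */
	bool submission_supported; /* GuC submission is supported */
	bool submission_active;    /* GuC submission was actually set up */
};

/* Check as posted in this patch: looks at config state only. */
static inline bool uses_submission_v1(const struct guc_model *guc)
{
	return guc->enabled && guc->submission_supported;
}

/* Interim check: looks at the runtime state of the firmware. */
static inline bool uses_submission_v2(const struct guc_model *guc)
{
	return guc->running && guc->submission_supported;
}

/* Suggested direction: an explicit "submission active" flag. */
static inline bool uses_submission_v3(const struct guc_model *guc)
{
	/* Submission must never be active while the firmware is not running. */
	assert(!(guc->submission_active && !guc->running));
	return guc->submission_active;
}
```

With enabled and submission_supported set but running clear (the missing
firmware case), v1 still reports GuC submission in use while v2 and v3 do not.
With running set but submission_active clear, only v3 reports it as not in use.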
Thanks,
Janusz
>
> Thanks,
> Michal
>