[Intel-gfx] [PATCH 3/5] drm/i915/guc: Simplify intel_guc_load()

Arkadiusz Hiler arkadiusz.hiler at intel.com
Fri Dec 16 11:16:02 UTC 2016


On Thu, Dec 15, 2016 at 02:26:29PM -0800, Daniele Ceraolo Spurio wrote:
> 
> 
> On 15/12/16 07:47, Arkadiusz Hiler wrote:
> > Current version of intel_guc_load() does a lot:
> >  - cares about submission
> >  - loads huc
> >  - implement WA
> > 
> > This change offloads some of the logic to intel_uc_load(), which now
> > cares about the above.
> > 
> > Cc: Anusha Srivatsa <anusha.srivatsa at intel.com>
> > Cc: Jeff McGee <jeff.mcgee at intel.com>
> > Cc: Michal Winiarski <michal.winiarski at intel.com>
> > Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler at intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c         |   2 +-
> >  drivers/gpu/drm/i915/intel_guc_loader.c | 126 +++++---------------------------
> >  drivers/gpu/drm/i915/intel_uc.c         |  83 +++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_uc.h         |   8 ++
> >  4 files changed, 110 insertions(+), 109 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 6af4e85..76b52c6 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -417,7 +417,7 @@ static int guc_ucode_xfer(struct drm_i915_private *dev_priv)
> >  	return ret;
> >  }
> > 
> > -static int guc_hw_reset(struct drm_i915_private *dev_priv)
> > +int guc_hw_reset(struct drm_i915_private *dev_priv)
> If I haven't missed anything, guc_hw_reset is only called in 1 place, so we
> could keep the function static and move it to intel_uc.c.

Okay.

> >  {
> >  	int ret;
> >  	u32 guc_status;
> > @@ -452,75 +452,37 @@ int intel_guc_load(struct drm_i915_private *dev_priv)
> >  {
> >  	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
> >  	const char *fw_path = guc_fw->guc_fw_path;
> > -	int retries, ret, err;
> > +	int ret;
> > 
> >  	DRM_DEBUG_DRIVER("GuC fw status: path %s, fetch %s, load %s\n",
> >  		fw_path,
> >  		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
> >  		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
> > 
> > -	/* Loading forbidden, or no firmware to load? */
> > -	if (!i915.enable_guc_loading) {
> > -		err = 0;
> > -		goto fail;
> > -	} else if (fw_path == NULL) {
> > +	if (fw_path == NULL) {
> >  		/* Device is known to have no uCode (e.g. no GuC) */
> > -		err = -ENXIO;
> > -		goto fail;
> > +		return -ENXIO;
> >  	} else if (*fw_path == '\0') {
> >  		/* Device has a GuC but we don't know what f/w to load? */
> >  		WARN(1, "No GuC firmware known for this platform!\n");
> > -		err = -ENODEV;
> > -		goto fail;
> > +		return -ENODEV;
> >  	}
> > 
> >  	/* Fetch failed, or already fetched but failed to load? */
> >  	if (guc_fw->guc_fw_fetch_status != GUC_FIRMWARE_SUCCESS) {
> > -		err = -EIO;
> > -		goto fail;
> > +		return -EIO;
> >  	} else if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_FAIL) {
> > -		err = -ENOEXEC;
> > -		goto fail;
> > +		return -ENOEXEC;
> >  	}
> > 
> > -	guc_interrupts_release(dev_priv);
> > -	gen9_reset_guc_interrupts(dev_priv);
> > -
> >  	guc_fw->guc_fw_load_status = GUC_FIRMWARE_PENDING;
> > 
> > -	DRM_DEBUG_DRIVER("GuC fw status: fetch %s, load %s\n",
> > -		intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
> > -		intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
> > -
> > -	err = i915_guc_submission_init(dev_priv);
> > -	if (err)
> > -		goto fail;
> > -
> > -	/*
> > -	 * WaEnableuKernelHeaderValidFix:skl,bxt
> > -	 * For BXT, this is only upto B0 but below WA is required for later
> > -	 * steppings also so this is extended as well.
> > -	 */
> 
> This comment is removed, but the WA is applicable to all SKL steppings and
> is also applicable to HuC according to the specs so I suggest to retain the
> comment and move it to intel_uc_load().

I misread the comment. I'll leave this as
WaEnableuKernelHeaderValidFix:skl
since it is fixed for BXT.


> >  	/* WaEnableGuCBootHashCheckNotSet:skl,bxt */
> 
> The implementation of this WA is now outside this function and it is marked
> as such there. I'd personally prefer to remove this comment from here as it
> might cause confusion, but no strong feelings either way.

The implementation is twofold now. First there is the function which
returns -EAGAIN if we failed at the step we know may fail and may want
to retry.

Then you have the function that handles the actual three attempts.

So I prefer to keep the note on the WA in both places.
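
To illustrate the split, here is a rough standalone sketch, not the actual
driver code. LOAD_ATTEMPTS, load_with_retries(), struct flaky and
flaky_load() are made-up names for illustration only; the real loop also
resets the GuC before each attempt, which is left out here.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the attempt count used by the WA. */
#define LOAD_ATTEMPTS 3

/* A load step returns 0 on success, -EAGAIN if a retry may help,
 * or any other negative errno for a hard failure. */
typedef int (*load_fn)(void *ctx);

/* Caller side: owns the retry policy, retries only on -EAGAIN. */
static int load_with_retries(load_fn load, void *ctx)
{
	int ret = -EAGAIN;
	int retries = LOAD_ATTEMPTS;

	while (retries--) {
		ret = load(ctx);
		if (ret != -EAGAIN)
			break;

		fprintf(stderr, "load failed: %d; %d attempt(s) left\n",
			ret, retries);
	}

	return ret;
}

/* Hypothetical load step used only to exercise the loop. */
struct flaky {
	int failures_left;	/* how many times to report -EAGAIN */
	int hard_error;		/* if nonzero, returned immediately */
	int calls;		/* number of times the step was invoked */
};

static int flaky_load(void *ctx)
{
	struct flaky *f = ctx;

	f->calls++;
	if (f->hard_error)
		return f->hard_error;
	if (f->failures_left > 0) {
		f->failures_left--;
		return -EAGAIN;
	}
	return 0;
}
```

The point is that the load step only reports "this may be worth retrying",
while the number of attempts lives in exactly one place, the caller.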

> > -	for (retries = 3; ; ) {
> > -		/*
> > -		 * Always reset the GuC just before (re)loading, so
> > -		 * that the state and timing are fairly predictable
> > -		 */
> > -		err = guc_hw_reset(dev_priv);
> > -		if (err)
> > -			goto fail;
> > +	/* we may want to retry guc ucode transfer */
> > +	ret = guc_ucode_xfer(dev_priv);
> > 
> > -		err = guc_ucode_xfer(dev_priv);
> > -		if (!err)
> > -			break;
> > -
> > -		if (--retries == 0)
> > -			goto fail;
> > -
> > -		DRM_INFO("GuC fw load failed: %d; will reset and "
> > -			 "retry %d more time(s)\n", err, retries);
> > -	}
> > +	if (ret)
> > +		return -EAGAIN;
> > 
> >  	guc_fw->guc_fw_load_status = GUC_FIRMWARE_SUCCESS;
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
> > index 8eec035..4e184edb 100644
> > --- a/drivers/gpu/drm/i915/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/intel_uc.c
> > @@ -35,6 +35,89 @@ void intel_uc_init(struct drm_i915_private *dev_priv)
> >  	intel_guc_init(dev_priv);
> >  }
> > 
> > +int intel_uc_load(struct drm_i915_private *dev_priv)
> > +{
> > +	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
> > +	int ret, retries;
> > +
> > +	/* guc not enabled, nothing to do */
> > +	if (!i915.enable_guc_loading)
> > +		return 0;
> > +
> > +	guc_interrupts_release(dev_priv);
> > +	gen9_reset_guc_interrupts(dev_priv);
> > +
> > +	guc_fw->guc_fw_load_status = GUC_FIRMWARE_PENDING;
> > +
> > +	if (i915.enable_guc_submission) {
> > +		ret = i915_guc_submission_init(dev_priv);
> > +		if (ret)
> > +			goto fail;
> > +	}
> > +
> > +	/* WaEnableGuCBootHashCheckNotSet:skl,bxt */
> > +	retries = GUC_WA_HASH_CHECK_NOT_SET_ATTEPMTS;
> > +	while (retries--) {
> > +		/*
> > +		 * Always reset the GuC just before (re)loading, so
> > +		 * that the state and timing are fairly predictable
> > +		 */
> > +		ret = guc_hw_reset(dev_priv);
> > +		if (ret)
> > +			goto fail;
> > +
> > +		ret = intel_guc_load(dev_priv);
> > +		if (ret == 0 || ret != -EAGAIN)
> > +			break;
> > +
> > +		DRM_INFO("GuC fw load failed: %d; will reset and "
> > +			 "retry %d more time(s)\n", ret, retries);
> > +	}
> > +
> > +	/* did we succeded or run out of retries? */
> > +	if (ret)
> > +		goto fail;
> > +
> > +	if (i915.enable_guc_submission) {
> > +		if (i915.guc_log_level >= 0)
> > +			gen9_enable_guc_interrupts(dev_priv);
> > +
> > +		ret = i915_guc_submission_enable(dev_priv);
> > +		if (ret)
> > +			goto fail;
> > +		guc_interrupts_capture(dev_priv);
> > +	}
> > +
> > +	return 0;
> > +
> > +fail:
> > +	/*
> > +	 * We've failed to load the firmware :(
> > +	 *
> > +	 * Decide whether to disable GuC submission and fall back to
> > +	 * execlist mode, and whether to hide the error by returning
> > +	 * zero or to return -EIO, which the caller will treat as a
> > +	 * nonfatal error (i.e. it doesn't prevent driver load, but
> > +	 * marks the GPU as wedged until reset).
> > +	 */
> > +	if (i915.enable_guc_loading > 1 || i915.enable_guc_submission > 1)
> > +		ret = -EIO;
> > +	else
> > +		ret = 0;
> > +
> > +	if (i915.enable_guc_submission) {
> > +		i915.enable_guc_submission = 0;
> > +		DRM_INFO("GuC submission without firmware not supported\n");
> > +		DRM_NOTE("Falling back from GuC submission to execlist mode\n");
> 
> If i915.enable_guc_submission > 1 we will mark the GPU as wedged so it might
> be worth retaining an error level message here in that scenario.

If we are wedging the GPU you do not really care about the fallback, so
there's no real use in promoting that message. Also, those are the
original levels that were already here.
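
For reference, the decision in the fail path can be sketched roughly as
below. This is a standalone illustration, not driver code: struct
guc_params and guc_load_failed() are hypothetical names, and the fields
only mirror the i915.enable_guc_* module parameters from the quoted hunk.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mirror of the two module parameters involved. */
struct guc_params {
	int enable_guc_loading;
	int enable_guc_submission;
};

/*
 * Called after a failed firmware load. Returns the value to hand back
 * to the caller: -EIO if either parameter is > 1 (the GPU ends up
 * wedged), 0 otherwise (the error is hidden). In both cases GuC
 * submission is switched off; *fell_back reports whether it was on.
 */
static int guc_load_failed(struct guc_params *p, bool *fell_back)
{
	int ret;

	if (p->enable_guc_loading > 1 || p->enable_guc_submission > 1)
		ret = -EIO;
	else
		ret = 0;

	*fell_back = p->enable_guc_submission != 0;
	p->enable_guc_submission = 0;

	return ret;
}
```

So the fallback message is printed whenever submission was enabled, whether
or not we also return -EIO.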

Anyway, it seems like the `enable_guc_* > 1` values are likely to go
away. I discussed that on IRC yesterday and no one seems to really
remember why we have them in the first place.

Anusha posted a similar concern with her HuC series as well.

> Apart from the minor comments above, the code re-org looks sensible (and
> required :)) and the patch lgtm.
> 
> Thanks,
> Daniele
> 
> > +	}
> > +
> > +	guc_interrupts_release(dev_priv);
> > +	i915_guc_submission_disable(dev_priv);
> > +	i915_guc_submission_fini(dev_priv);
> > +
> > +	return ret;
> > +}
> > +
> >  /*
> >   * Read GuC command/status register (SOFT_SCRATCH_0)
> >   * Return true if it contains a response rather than a command

-- 
Cheers,
Arek

