[Intel-gfx] [PATCH 1/2] drm/i915: Add code to accept valid locked WOPCM register values

Michał Winiarski michal.winiarski at intel.com
Fri Mar 23 14:01:37 UTC 2018


On Thu, Mar 22, 2018 at 02:27:59PM -0700, Yaodong Li wrote:
> On 03/22/2018 01:38 PM, Michał Winiarski wrote:
> > On Tue, Mar 20, 2018 at 04:18:46PM -0700, Jackie Li wrote:
> > > In current code, we only compare the locked WOPCM register values with the
> > > calculated values. However, we can continue loading GuC/HuC firmware if the
> > > locked (or partially locked) values were valid for current GuC/HuC firmware
> > > sizes.
> > > 
> > > This patch added a new code path to verify whether the locked register
> > > values can be used for GuC/HuC firmware loading, it will recalculate the
> > > verify the new values if these registers were partially locked, so that we
> > > won't fail the GuC/HuC firmware loading even if the locked register values
> > > are different from the calculated ones.
> > > 
> > > Signed-off-by: Jackie Li <yaodong.li at intel.com>
> > > Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
> > > Cc: Sagar Arun Kamble <sagar.a.kamble at intel.com>
> > > Cc: Michał Winiarski <michal.winiarski at intel.com>
> > > Cc: John Spotswood <john.a.spotswood at intel.com>
> > > ---
> > >   drivers/gpu/drm/i915/intel_wopcm.c | 183 +++++++++++++++++++++++++++++++------
> > >   1 file changed, 157 insertions(+), 26 deletions(-)
> > [snip]
> > 
> > >   /**
> > >    * intel_wopcm_init() - Initialize the WOPCM structure.
> > >    * @wopcm: pointer to intel_wopcm.
> > > @@ -155,10 +202,8 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
> > >   	struct drm_i915_private *i915 = wopcm_to_i915(wopcm);
> > >   	u32 guc_fw_size = intel_uc_fw_get_upload_size(&i915->guc.fw);
> > >   	u32 huc_fw_size = intel_uc_fw_get_upload_size(&i915->huc.fw);
> > > -	u32 ctx_rsvd = context_reserved_size(i915);
> > >   	u32 guc_wopcm_base;
> > >   	u32 guc_wopcm_size;
> > > -	u32 guc_wopcm_rsvd;
> > >   	int err;
> > >   	GEM_BUG_ON(!wopcm->size);
> > > @@ -175,30 +220,17 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
> > >   		return -E2BIG;
> > >   	}
> > > -	guc_wopcm_base = ALIGN(huc_fw_size + WOPCM_RESERVED_SIZE,
> > > -			       GUC_WOPCM_OFFSET_ALIGNMENT);
> > > -	if ((guc_wopcm_base + ctx_rsvd) >= wopcm->size) {
> > > -		DRM_ERROR("GuC WOPCM base (%uKiB) is too big.\n",
> > > -			  guc_wopcm_base / 1024);
> > > -		return -E2BIG;
> > > -	}
> > > -
> > > -	guc_wopcm_size = wopcm->size - guc_wopcm_base - ctx_rsvd;
> > > -	guc_wopcm_size &= GUC_WOPCM_SIZE_MASK;
> > > +	guc_wopcm_base = calculate_min_guc_wopcm_base(huc_fw_size);
> > We already discussed this elsewhere, but just to have this part available
> > for wider audience:
> > 
> > huc_fw_size is unknown (there may be no huc firmware present) - which means
> > we're still not able to handle module reload from guc-without-huc into
> > guc-with-huc
> > 
> > Question remains whether we want to support this scenario, or whether we should
> > tie GuC and HuC together and refuse to operate early on if either one is not
> > present. We could remove a lot of code this way...
> Thanks. As we discussed I think we should describe this problem in this way:
> we shall not make any assumption on either guc_fw_size and huc_fw_size.

We can make assumptions on guc_fw_size - it's known in all the relevant cases
(enable_guc=1,2,3).
(unless we want to support varying guc_fw_size - more on that later).

> On the other handle, we do need calculate the WOPCM layout based on
> both GuC and HuC FW sizes, especially when we want to support modparam
> enable_guc to enable/disable HuC FW loading dynamically. That's why I
> suggest
> to update the FW fetch code to report the FW size no matter we turn the HuC
> FW loading on or not.

We could do that, but we don't really *need* to do that.
If user doesn't want HuC, why are we reading HuC from the filesystem? (well,
because we decided we're going to base our paritioning around it).

> Other aspect of the discussion is whether we should support enable_guc
> switching
> from 1 to 3 + Adding new HuC FW to the filesystem without a system reboot.
> In another word whether we should support FW image updates
> (add/delete/update to a new version) in the filesystem without expecting a
> system
> reboot. My point is it's unlikely we can support this since the FW sizes
> would likely
> change and we need reconfigure the WO register with the latest FW available
> in
> the FS, and I think it worth to do it to 100% make sure we won't run into
> any
> module loading/init errors.

It's never going to be 100%, but we can at least try to work with 99% ;)

The only reason why you need HuC size is because you're building the
partitioning from the bottom (starting from HuC).
We could just as well start from the top (starting from GuC).

The only thing you gain by starting from the bottom, is that you're much more
tolerant for variations in GuC fw size (because you're paritioning HuC region
precisely, leaving the rest for GuC. Well, that, and the HW workaround).
The only thing that changes when you're starting from the top, is that you're
going the other way around (more tolerant for HuC fw size variations).

I just posted a series that (among other things) tries to do partitioning from
the top.
To make it more tolerant for size variations, we could reserve a little extra
(either for GuC or HuC, depending on the approach).

-Michał

> > 
> > > +	err = calculate_max_guc_wopcm_size(wopcm, guc_wopcm_base,
> > > +					   &guc_wopcm_size);
> > > +	if (err)
> > > +		return err;
> > >   	DRM_DEBUG_DRIVER("Calculated GuC WOPCM Region: [%uKiB, %uKiB)\n",
> > >   			 guc_wopcm_base / 1024, guc_wopcm_size / 1024);
> > > -	guc_wopcm_rsvd = GUC_WOPCM_RESERVED + GUC_WOPCM_STACK_RESERVED;
> > > -	if ((guc_fw_size + guc_wopcm_rsvd) > guc_wopcm_size) {
> > > -		DRM_ERROR("Need %uKiB WOPCM for GuC, %uKiB available.\n",
> > > -			  (guc_fw_size + guc_wopcm_rsvd) / 1024,
> > > -			  guc_wopcm_size / 1024);
> > > -		return -E2BIG;
> > > -	}
> > > -
> > > -	err = check_hw_restriction(i915, guc_wopcm_base, guc_wopcm_size,
> > > -				   huc_fw_size);
> > > +	err = verify_calculated_values(i915, guc_fw_size, huc_fw_size,
> > > +				       guc_wopcm_base, guc_wopcm_size);
> > Ok - so we're calculating the values in init...
> Most likely case, we are going to get these values ready without getting
> any forcewake during the init. Then in init_hw we only need to update
> these values to the register without wasting anytime on these calculation.
> However, this patch introduce more code to handle the rare cases which
> these regs were locked with different values from the ones we calculated
> in the init phase. In this case, we will try to recalculate/verify these
> locked
> or partially locked values and try our best to go with the locked values.
> > 
> > >   	if (err)
> > >   		return err;
> > > @@ -208,6 +240,92 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
> > >   	return 0;
> > >   }
> > [snip]
> > 
> > >   static inline int write_and_verify(struct drm_i915_private *dev_priv,
> > >   				   i915_reg_t reg, u32 val, u32 mask,
> > >   				   u32 locked_bit)
> > > @@ -231,7 +349,8 @@ static inline int write_and_verify(struct drm_i915_private *dev_priv,
> > >    * will verify the register values to make sure the registers are locked with
> > >    * correct values.
> > >    *
> > > - * Return: 0 on success. -EIO if registers were locked with incorrect values.
> > > + * Return: 0 on success. Minus error code if registered were locked with
> > > + * incorrect values.-EIO if registers failed to lock with correct values.
> > >    */
> > >   int intel_wopcm_init_hw(struct intel_wopcm *wopcm)
> > >   {
> > > @@ -247,27 +366,39 @@ int intel_wopcm_init_hw(struct intel_wopcm *wopcm)
> > >   	GEM_BUG_ON(!wopcm->guc.size);
> > >   	GEM_BUG_ON(!wopcm->guc.base);
> > > +	err = verify_locked_values_and_update(wopcm);
> > > +	if (err) {
> > > +		DRM_ERROR("WOPCM registers are locked with invalid values.\n");
> > > +		goto err_out;
> > > +	}
> > > +
> > ...Only to throw everything out if the regs are locked in init_hw?
> > 
> > Since we've changed the logic, I think we should refactor a bit by changing
> > the ordering.
> > We could start from checking the regs and partition ourselves if the values are
> > not locked.
> > 
> > We can then have a common codepath for verification.
> We have power domain on at this point so we expect to spend as less
> time as possible here. We are expecting the regs were not locked or
> locked with the same values as we calculated (e.g. module reloading,
> reset, etc) in most case. But for some rare cases (BIOS did upload the
> regs) we will have this new path to handle those cases. But these are
> all abnormal cases and even should not happen.
> > 
> > >   	err = write_and_verify(dev_priv, GUC_WOPCM_SIZE, wopcm->guc.size,
> > >   			       GUC_WOPCM_SIZE_MASK | GUC_WOPCM_SIZE_LOCKED,
> > >   			       GUC_WOPCM_SIZE_LOCKED);
> Actually, I think we shall decide whether updating the reg based on the
> locking status. Will handle it in v2:)
> > > -	if (err)
> > > +	if (err) {
> > > +		DRM_ERROR("Failed to lock GUC_WOPCM_SIZE with %#x.\n",
> > > +			  wopcm->guc.size);
> > >   		goto err_out;
> > > +	}
> > >   	huc_agent = USES_HUC(dev_priv) ? HUC_LOADING_AGENT_GUC : 0;
> > >   	mask = GUC_WOPCM_OFFSET_MASK | GUC_WOPCM_OFFSET_VALID | huc_agent;
> > >   	err = write_and_verify(dev_priv, DMA_GUC_WOPCM_OFFSET,
> > >   			       wopcm->guc.base | huc_agent, mask,
> > >   			       GUC_WOPCM_OFFSET_VALID);
> > > -	if (err)
> > > +	if (err) {
> > > +		DRM_ERROR("Failed to lock DMA_GUC_WOPCM_OFFSET with %#x.\n",
> > > +			  wopcm->guc.base);
> > >   		goto err_out;
> > > +	}
> > >   	return 0;
> > >   err_out:
> > > -	DRM_ERROR("Failed to init WOPCM registers:\n");
> > >   	DRM_ERROR("DMA_GUC_WOPCM_OFFSET=%#x\n",
> > >   		  I915_READ(DMA_GUC_WOPCM_OFFSET));
> > >   	DRM_ERROR("GUC_WOPCM_SIZE=%#x\n", I915_READ(GUC_WOPCM_SIZE));
> > > +	DRM_ERROR("A system reboot is required.\n");
> > Functionally I think it's a step in the right direction, but I think we should
> > try even harder to not ever display this error msg.
> It shall be very very unlikely if we had clear system configure and FW
> update sequence. However, we are still expecting such an error. (e.g.
> Updated
> FW images (with larger sizes) and try to load the new FW without a reboot.
> we cannot 100% guarantee the correctness since the WOPCM layout is locked
> with
> values we calculated based on the old FW status in the FS, thus we will
> report such an error
> message.
> 
> Regards,
> -Jackie


More information about the Intel-gfx mailing list