[Intel-gfx] [PATCH 5/8] drm/i915/guc: Use GuC FW size and HW restrictions to determine WOPCM partition

Michał Winiarski michal.winiarski at intel.com
Mon Mar 26 10:04:34 UTC 2018


On Fri, Mar 23, 2018 at 01:00:35PM -0700, Yaodong Li wrote:
> On 03/23/2018 05:34 AM, Michał Winiarski wrote:
> > We need GuC to load HuC, but it's also possible for GuC to operate on
> > its own. We don't know what the size of HuC FW may be, so, not wanting
> > to make any platform-specific hardcoded guesses, we assume that its size
> > is 0... Which is a very bad approximation.
> > This has a very unfortunate consequence - once we've booted with GuC and
> > no HuC, we'll never be able to load HuC (unless we reboot, since the
> > registers are locked).
> > Rather than using unknown values in our computations - let's partition
> > based on GuC size.
> we can do this based on known GuC and HuC sizes without
> any assumption on FW sizes :)
> please also refer to: https://patchwork.freedesktop.org/series/40541/

You need to have HuC FW present in the filesystem do do that though.
(IIUC we still get 0 if it's not there)

> > We have one HW restriction where we're using HuC size (GuC size needs to
> > be roughly similar to HuC size - which may be unknown at this point),
> > luckily, another HW restriction makes it very unlikely to run in any
> > sort of issues in this case.
> > 
> > Signed-off-by: Michał Winiarski <michal.winiarski at intel.com>
> > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Jackie Li <yaodong.li at intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
> > ---
> >   drivers/gpu/drm/i915/intel_wopcm.c | 60 +++++++++++++++++++++-----------------
> >   1 file changed, 34 insertions(+), 26 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_wopcm.c b/drivers/gpu/drm/i915/intel_wopcm.c
> > index 52841d340002..295d302e97b9 100644
> > --- a/drivers/gpu/drm/i915/intel_wopcm.c
> > +++ b/drivers/gpu/drm/i915/intel_wopcm.c
> > @@ -167,7 +167,22 @@ static int check_ctx_rsvd_fits(struct intel_wopcm *wopcm, u32 ctx_rsvd)
> >   	return 0;
> >   }
> > -static int wopcm_check_hw_restrictions(struct intel_wopcm *wopcm)
> > +static inline void
> > +__guc_region_grow(struct intel_wopcm *wopcm, u32 size)
> > +{
> > +	/*
> > +	 * We're growing guc region in the direction of lower addresses.
> > +	 * We need to use multiples of base alignment, because it has more
> > +	 * strict alignment rules.
> > +	 */
> > +	size = DIV_ROUND_UP(size, 2);
> A bit confused here. Don't we just want to push the base downward for
> *size* bytes?

Starting from the following:

    +--------------+           
    |--------------|           
    ||            ||           
    ||            ||           
    ||  GuC       ||           
    ||  region    ||           
    ||            ||           
    ||            ||           
    ||            ||           
    ||            ||           
    +--------------+           
    |              |           
    |              |           
    |              |           
    |              |           
    |              |           
    |              |           
    |              |           
    |              |           
    |              |           
    +--------------+           

We want to grow the whole region by size bytes. Pushing the base downward gives
us this:

    +--------------+        
    |  Empty       |        
    |  space       |        
    +--------------+        
    ||            ||        
    ||            ||        
    ||  GuC       ||        
    ||  region    ||        
    ||            ||        
    ||            ||        
    ||            ||        
    ||            ||        
    +--------------+        
    |              |        
    |              |        
    |              |        
    |              |        
    |              |        
    |              |        
    |              |        
    +--------------+        

Which leaves less space for HuC (and we're also leaving a bunch of unused space
in our partitioning).

If we modify both base and size to end up with this:

    +--------------+
    |--------------|
    ||            ||
    ||            ||
    ||  GuC       ||
    ||  region    ||
    ||            ||
    ||            ||
    ||            ||
    ||            ||
    ||            ||
    +--------------+
    |              |
    |              |
    |              |
    |              |
    |              |
    |              |
    |              |
    |              |
    +--------------+

We're still satisfying the HW restriciton, but we're not wasting any space from
the top (and also we're leaving more space for HuC).

> > +	size = ALIGN(size, GUC_WOPCM_OFFSET_ALIGNMENT);
> Hmm, I think we only need align it to 4K boundary.

No - because we're modifying both base (16K alignment) and size (4K
alignment). 

> > +
> > +	wopcm->guc.base -= size;
> > +	wopcm->guc.size += size;
> > +}
> > +
> > +static void wopcm_adjust_for_hw_restrictions(struct intel_wopcm *wopcm)
> >   {
> >   	struct drm_i915_private *i915 = wopcm_to_i915(wopcm);
> >   	u32 huc_fw_size = intel_uc_fw_get_upload_size(&i915->huc.fw);
> > @@ -177,22 +192,18 @@ static int wopcm_check_hw_restrictions(struct intel_wopcm *wopcm)
> >   		size = gen9_size_for_dword_gap_restriction(wopcm->guc.base,
> >   							   wopcm->guc.size);
> >   		if (size)
> > -			goto err;
> > +			__guc_region_grow(wopcm, size);
> >   		size = gen9_size_for_huc_restriction(wopcm->guc.size,
> >   						     huc_fw_size);
> Note that huc_fw_size might be zero in enable_guc=1 (GuC only) case.
> Once the registers were locked, enable_guc=3 may still fail since
> huc_fw_size
> may still large enough to break the restriction we want to enforce that:
> hw_fw_size + 16KB > guc_fw_size.

Well - as mentioned in the commit message, since we also have a "dword gap"
restriction, we're left with pretty large GuC size, which makes this problem
dissaperar for real-world HuC FW sizes.

But yeah - I agree, we're not able to guarantee this.

> >   		if (size)
> > -			goto err;
> > -	}
> > -
> > -	return 0;
> > +			__guc_region_grow(wopcm, size);
> > -err:
> > -	DRM_ERROR("GuC WOPCM size %uKiB is too small. %uKiB more needed.\n",
> > -		  wopcm->guc.size / 1024,
> > -		  size / 1024);
> > -
> > -	return -E2BIG;
> > +		GEM_BUG_ON(gen9_size_for_dword_gap_restriction(wopcm->guc.base,
> > +							       wopcm->guc.size));
> > +		GEM_BUG_ON(gen9_size_for_huc_restriction(wopcm->guc.size,
> > +							 huc_fw_size));
> > +	}
> >   }
> >   static bool wopcm_check_components_fit(struct intel_wopcm *wopcm)
> > @@ -218,21 +229,16 @@ static bool wopcm_check_components_fit(struct intel_wopcm *wopcm)
> >   	return 0;
> >   }
> > -static int wopcm_guc_init(struct intel_wopcm *wopcm)
> > +static int wopcm_guc_region_init(struct intel_wopcm *wopcm)
> >   {
> >   	struct drm_i915_private *dev_priv = wopcm_to_i915(wopcm);
> > -	u32 huc_fw_size = intel_uc_fw_get_upload_size(&dev_priv->huc.fw);
> > +	u32 guc_fw_size = intel_uc_fw_get_upload_size(&dev_priv->guc.fw);
> >   	u32 ctx_rsvd = context_reserved_size(dev_priv);
> > -	wopcm->guc.base = ALIGN_DOWN(huc_fw_size_in_wopcm(huc_fw_size),
> > -				     GUC_WOPCM_OFFSET_ALIGNMENT);
> > +	wopcm->guc.size = guc_fw_size_in_wopcm(guc_fw_size);
> > -	wopcm->guc.size = ALIGN(wopcm->size - wopcm->guc.base - ctx_rsvd,
> > -				PAGE_SIZE);
> > -
> > -	DRM_DEBUG_DRIVER("GuC WOPCM Region: [%uKiB, %uKiB)\n",
> > -			 wopcm->guc.base / 1024,
> > -			 (wopcm->guc.base + wopcm->guc.size) / 1024);
> > +	wopcm->guc.base = ALIGN_DOWN(wopcm->size - ctx_rsvd - wopcm->guc.size,
> > +				     GUC_WOPCM_OFFSET_ALIGNMENT);
> guc.size + ctx_rsvd > wopcm->size check?

It's being done in check_ctx_rsvd_fits later on (in wopcm_check_components_fit).

> >   	return 0;
> >   }
> > @@ -255,18 +261,20 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
> >   	GEM_BUG_ON(!wopcm->size);
> > -	err = wopcm_guc_init(wopcm);
> > +	err = wopcm_guc_region_init(wopcm);
> Again, wopcm will contain invalid data on failure. But maybe we can
> go with it now :)

I could go back to passing around a bunch of locals, but since we're only
accessing wopcm here I think passing around struct intel_wopcm makes the code a
bit cleaner. And we're failing the driver load on error anyways.

-Michał

> 
> Regards,
> -Jackie


More information about the Intel-gfx mailing list