[Intel-gfx] [PATCH] drm/i915/guc: ensure CSB FIFOs after GuC reset do not have odd entries

Ceraolo Spurio, Daniele daniele.ceraolospurio at intel.com
Wed Dec 7 02:00:47 UTC 2022



On 12/6/2022 3:49 PM, Andrzej Hajda wrote:
> CSB FIFOs store 64-bit Context Status Buffer entries used by GuC firmware.
> They are accessed through a 32-bit register, so reads must occur in pairs to
> obtain a single 64-bit CSB entry. The second read pops the CSB entry off the
> FIFO. If a GuC reset happens between the two reads, the FIFO must be read
> once more to restore proper behaviour.
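
To make the failure mode in the description concrete, here is a small userspace model of one CSB FIFO. It is only a sketch: the struct, the function names and the FIFO depth are hypothetical, and returning the low dword first is an assumption, since the description only says that the second read pops the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of one CSB FIFO; not the real hardware. */
#define CSB_FIFO_DEPTH 8

struct csb_fifo_model {
	uint64_t entries[CSB_FIFO_DEPTH];
	unsigned int head;	/* index of the oldest entry */
	unsigned int count;	/* number of queued entries */
	bool half_read;		/* set once the first dword of the head was read */
};

static void csb_push(struct csb_fifo_model *f, uint64_t entry)
{
	assert(f->count < CSB_FIFO_DEPTH);
	f->entries[(f->head + f->count) % CSB_FIFO_DEPTH] = entry;
	f->count++;
}

/*
 * One 32-bit register read. The first read returns one half of the head
 * entry (assumed here to be the low dword), the second read returns the
 * other half and pops the entry.
 */
static uint32_t csb_read32(struct csb_fifo_model *f)
{
	uint64_t entry;

	assert(f->count > 0);
	entry = f->entries[f->head];

	if (!f->half_read) {
		f->half_read = true;
		return (uint32_t)entry;
	}

	f->half_read = false;
	f->head = (f->head + 1) % CSB_FIFO_DEPTH;
	f->count--;
	return (uint32_t)(entry >> 32);
}

/* What a reader that believes it starts on an entry boundary does. */
static uint64_t csb_read64(struct csb_fifo_model *f)
{
	uint64_t lo = csb_read32(f);
	uint64_t hi = csb_read32(f);

	return lo | (hi << 32);
}

/* The workaround: one dummy read if the previous reader stopped mid-entry. */
static void csb_recover(struct csb_fifo_model *f)
{
	if (f->half_read)
		(void)csb_read32(f);
}
```

In this model, a reader that restarts after a single `csb_read32()` and goes straight to `csb_read64()` combines halves of two different entries, while calling `csb_recover()` first puts it back on an entry boundary.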

From the description, this seems to be a bug in the GuC firmware. The 
firmware is supposed to make sure all stale CSB entries are discarded 
when it gets reloaded, so it looks like the issue here is that that code 
is not correctly handling the case where there are an odd number of 
stale dwords. All the registers you're reading in this patch are GuC 
registers, so it should be possible to implement this fix within the 
firmware.
That said, since we do need to keep support for current/older GuC 
versions, we'll still need to merge this WA and then disable it when we 
detect that we're loading a GuC version that includes the fix. However, 
I'd prefer it if we could first get confirmation that there is indeed a 
bug in the stale CSB handling inside of GuC (I believe Antonio is 
already looking at that) and that this is the best way to WA the issue, 
because normally we try to avoid touching internal GuC regs from i915 
unless there are no alternatives.
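
Gating the WA on the loaded firmware version could look roughly like the sketch below. The struct and helper names are hypothetical and are not existing i915 API, and the version that will carry the fix is not known yet, so it is passed in as a parameter instead of being hardcoded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical version triple; i915 tracks the GuC version in its own types. */
struct guc_fw_version {
	uint32_t major;
	uint32_t minor;
	uint32_t patch;
};

/* Returns <0, 0 or >0 if a is older than, equal to or newer than b. */
static int fw_version_cmp(const struct guc_fw_version *a,
			  const struct guc_fw_version *b)
{
	if (a->major != b->major)
		return a->major < b->major ? -1 : 1;
	if (a->minor != b->minor)
		return a->minor < b->minor ? -1 : 1;
	if (a->patch != b->patch)
		return a->patch < b->patch ? -1 : 1;
	return 0;
}

/*
 * The WA is only needed while the loaded firmware predates the fix.
 * fixed_in is an assumption supplied by the caller; no fixed version
 * has been identified at this point.
 */
static bool needs_csb_fifo_wa(const struct guc_fw_version *loaded,
			      const struct guc_fw_version *fixed_in)
{
	assert(loaded && fixed_in);
	return fw_version_cmp(loaded, fixed_in) < 0;
}
```

With a check like this, the reset path would call the recovery helper only when `needs_csb_fifo_wa()` returns true for the firmware being loaded.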

> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7351
> Signed-off-by: Andrzej Hajda <andrzej.hajda at intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_reset.c      | 25 ++++++++++++++++++++++
>   drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h | 13 +++++++++++
>   2 files changed, 38 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 24736ebee17c28..8e64b9024e3258 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -721,6 +721,30 @@ bool intel_has_reset_engine(const struct intel_gt *gt)
>   	return INTEL_INFO(gt->i915)->has_reset_engine;
>   }
>   
> +static void recover_csb_fifos(struct intel_gt *gt)
> +{
> +	const struct {
> +		u32 bit;
> +		i915_reg_t csb;
> +	} csb_map[] = {
> +		{ .bit = GUC_CSB_READ_FLAG_RCS, .csb = GUC_CS_CSB },
> +		{ .bit = GUC_CSB_READ_FLAG_VCS, .csb = GUC_VCS_CSB },
> +		{ .bit = GUC_CSB_READ_FLAG_VECS, .csb = GUC_VECS_CSB },
> +		{ .bit = GUC_CSB_READ_FLAG_BCS, .csb = GUC_BCS_CSB },
> +		{ .bit = GUC_CSB_READ_FLAG_CCS, .csb = GUC_CCS_CSB },

For MTL we'd also need the GSC_CSB, but hopefully we can get the updated 
GuC before we remove the force_probe and therefore not have to support 
this WA on MTL.

> +	};
> +	u32 dbg;
> +
> +	if (!intel_uc_uses_guc_submission(&gt->uc))
> +		return;

The GuC still gets the CSB interrupts even if we're not using GuC 
submission, although in that case it just pops the CSB entries out of 
the FIFO without looking at them. Not sure if we still need the WA in 
that case (again need input from the GuC side).

Daniele

> +
> +	dbg = intel_uncore_read(gt->uncore, GUCINT_DEBUG2);
> +	for (int i = 0; i < ARRAY_SIZE(csb_map); ++i) {
> +		if (dbg & csb_map[i].bit)
> +			intel_uncore_read(gt->uncore, csb_map[i].csb);
> +	}
> +}
> +
>   int intel_reset_guc(struct intel_gt *gt)
>   {
>   	u32 guc_domain =
> @@ -731,6 +755,7 @@ int intel_reset_guc(struct intel_gt *gt)
>   
>   	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
>   	ret = gen6_hw_domain_reset(gt, guc_domain);
> +	recover_csb_fifos(gt);
>   	intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_ALL);
>   
>   	return ret;
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
> index 9915de32e894e1..beeb7fbff99453 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
> @@ -154,4 +154,17 @@ struct guc_doorbell_info {
>   #define GUC_INTR_SW_INT_1		BIT(1)
>   #define GUC_INTR_SW_INT_0		BIT(0)
>   
> +#define GUCINT_DEBUG2			_MMIO(0xC5A4)
> +#define   GUC_CSB_READ_FLAG_CCS		BIT(16)
> +#define   GUC_CSB_READ_FLAG_BCS		BIT(3)
> +#define   GUC_CSB_READ_FLAG_VECS	BIT(2)
> +#define   GUC_CSB_READ_FLAG_VCS		BIT(1)
> +#define   GUC_CSB_READ_FLAG_RCS		BIT(0)
> +
> +#define GUC_CS_CSB			_MMIO(0xC5B0)
> +#define GUC_BCS_CSB			_MMIO(0xC5B4)
> +#define GUC_VCS_CSB			_MMIO(0xC5B8)
> +#define GUC_VECS_CSB			_MMIO(0xC5BC)
> +#define GUC_CCS_CSB			_MMIO(0xC5E0)
> +
>   #endif
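
For anyone who wants to poke at the table-driven drain outside the kernel, the loop from the quoted patch can be exercised against a mocked register file. This is only a sketch: the register offsets and flag bits are copied from the patch, while `struct mock_mmio`, `mock_read()` and `recover_csb_fifos_mock()` are hypothetical stand-ins for the uncore accessors. It also assumes, as the patch does, that a set flag in GUCINT_DEBUG2 means the matching FIFO has one dword outstanding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offsets and flag bits as defined in the quoted patch. */
#define GUCINT_DEBUG2			0xC5A4u
#define   GUC_CSB_READ_FLAG_CCS		(1u << 16)
#define   GUC_CSB_READ_FLAG_BCS		(1u << 3)
#define   GUC_CSB_READ_FLAG_VECS	(1u << 2)
#define   GUC_CSB_READ_FLAG_VCS		(1u << 1)
#define   GUC_CSB_READ_FLAG_RCS		(1u << 0)

#define GUC_CS_CSB			0xC5B0u
#define GUC_BCS_CSB			0xC5B4u
#define GUC_VCS_CSB			0xC5B8u
#define GUC_VECS_CSB			0xC5BCu
#define GUC_CCS_CSB			0xC5E0u

/* Hypothetical mock: logs register reads instead of touching hardware. */
#define MOCK_MAX_READS 16

struct mock_mmio {
	uint32_t debug2;		/* value returned for GUCINT_DEBUG2 */
	uint32_t reads[MOCK_MAX_READS];	/* offsets read, in order */
	size_t nreads;
};

static uint32_t mock_read(struct mock_mmio *m, uint32_t reg)
{
	assert(m->nreads < MOCK_MAX_READS);
	m->reads[m->nreads++] = reg;
	return reg == GUCINT_DEBUG2 ? m->debug2 : 0;
}

/* Same structure as recover_csb_fifos() in the patch, minus the uncore. */
static void recover_csb_fifos_mock(struct mock_mmio *m)
{
	static const struct {
		uint32_t bit;
		uint32_t csb;
	} csb_map[] = {
		{ GUC_CSB_READ_FLAG_RCS, GUC_CS_CSB },
		{ GUC_CSB_READ_FLAG_VCS, GUC_VCS_CSB },
		{ GUC_CSB_READ_FLAG_VECS, GUC_VECS_CSB },
		{ GUC_CSB_READ_FLAG_BCS, GUC_BCS_CSB },
		{ GUC_CSB_READ_FLAG_CCS, GUC_CCS_CSB },
	};
	uint32_t dbg = mock_read(m, GUCINT_DEBUG2);
	size_t i;

	/* One dummy read for every FIFO whose flag is set. */
	for (i = 0; i < sizeof(csb_map) / sizeof(csb_map[0]); i++) {
		if (dbg & csb_map[i].bit)
			(void)mock_read(m, csb_map[i].csb);
	}
}
```

Setting `debug2` to a combination of flags and checking the logged offsets shows which FIFOs would receive the extra read.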