[Intel-gfx] [PATCH 1/2] drm/i915: limit double GT reset to pre-MTL
Ceraolo Spurio, Daniele
daniele.ceraolospurio at intel.com
Wed Mar 22 21:03:20 UTC 2023
On 3/22/2023 12:44 PM, John Harrison wrote:
> On 3/20/2023 14:10, Daniele Ceraolo Spurio wrote:
>> Commit 3db9d590557d ("drm/i915/gt: Reset twice") modified the code to
>> always hit the GDRST register twice when doing a reset, with the
>> reported aim to fix invalid post-reset engine state on some platforms
>> (Jasperlake being the only one actually mentioned).
> It still concerns me that there are no actual details about this issue
> from a hardware perspective as to what/why it goes wrong, the comment
> is fully of non-definitive language - 'appears to', 'should', 'is
> still a concern that'. And there is no w/a number associated with it.
> It all feels extremely suspect and warrants a great big FIXME tag as a
> minimum.
I agree that the whole thing is unclear and we could add a FIXME, but
IMO that is outside the scope of this patch as I'm not adding the code
in question. This should be discussed with the original author/reviewers.
Daniele
>
> John.
>
>
>>
>> This is a problem on MTL, due to the fact that we have to apply a time
>> consuming WA (coming in the next patch) every time we hit the GDRST
>> register in a way that can include the GSC engine. Even post MTL, the
>> expectation is that we'll have some work to do before and after hitting
>> the GDRST if the GSC is involved.
>>
>> Since the issue requiring the double reset seems to be limited to older
>> platforms, instead of trying to handle the double-reset on MTL and
>> future platforms it is just easier to turn it off. The default on MTL is
>> also for GuC to own engine reset, with i915 only covering full-GT reset.
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Andi Shyti <andi.shyti at linux.intel.com>
>> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>> Cc: Gwan-gyeong Mun <gwan-gyeong.mun at intel.com>
>> Cc: John Harrison <John.C.Harrison at Intel.com>
>> ---
>> drivers/gpu/drm/i915/gt/intel_reset.c | 35 +++++++++++++++------------
>> 1 file changed, 20 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c
>> b/drivers/gpu/drm/i915/gt/intel_reset.c
>> index 0bb9094fdacd..2c3463f77e5c 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>> @@ -268,9 +268,27 @@ static int ilk_do_reset(struct intel_gt *gt,
>> intel_engine_mask_t engine_mask,
>> static int gen6_hw_domain_reset(struct intel_gt *gt, u32
>> hw_domain_mask)
>> {
>> struct intel_uncore *uncore = gt->uncore;
>> - int loops = 2;
>> + int loops;
>> int err;
>> + /*
>> + * On some platforms, e.g. Jasperlake, we see that the engine
>> register
>> + * state is not cleared until shortly after GDRST reports
>> completion,
>> + * causing a failure as we try to immediately resume while the
>> internal
>> + * state is still in flux. If we immediately repeat the reset, the
>> + * second reset appears to serialise with the first, and since
>> it is a
>> + * no-op, the registers should retain their reset value.
>> However, there
>> + * is still a concern that upon leaving the second reset, the
>> internal
>> + * engine state is still in flux and not ready for resuming.
>> + *
>> + * Starting on MTL, there are some prep steps that we need to do
>> when
>> + * resetting some engines that need to be applied every time we
>> write to
>> + * GEN6_GDRST. As those are time consuming (tens of ms), we
>> don't want
>> + * to perform that twice, so, since the Jasperlake issue hasn't
>> been
>> + * observed on MTL, we avoid repeating the reset on newer
>> platforms.
>> + */
>> + loops = GRAPHICS_VER_FULL(gt->i915) < IP_VER(12, 70) ? 2 : 1;
>> +
>> /*
>> * GEN6_GDRST is not in the gt power well, no need to check
>> * for fifo space for the write or forcewake the chip for
>> @@ -279,20 +297,7 @@ static int gen6_hw_domain_reset(struct intel_gt
>> *gt, u32 hw_domain_mask)
>> do {
>> intel_uncore_write_fw(uncore, GEN6_GDRST, hw_domain_mask);
>> - /*
>> - * Wait for the device to ack the reset requests.
>> - *
>> - * On some platforms, e.g. Jasperlake, we see that the
>> - * engine register state is not cleared until shortly after
>> - * GDRST reports completion, causing a failure as we try
>> - * to immediately resume while the internal state is still
>> - * in flux. If we immediately repeat the reset, the second
>> - * reset appears to serialise with the first, and since
>> - * it is a no-op, the registers should retain their reset
>> - * value. However, there is still a concern that upon
>> - * leaving the second reset, the internal engine state
>> - * is still in flux and not ready for resuming.
>> - */
>> + /* Wait for the device to ack the reset requests. */
>> err = __intel_wait_for_register_fw(uncore, GEN6_GDRST,
>> hw_domain_mask, 0,
>> 2000, 0,
>
More information about the Intel-gfx
mailing list