[Intel-gfx] [PATCH] drm/i915/display: Reset message bus after each read/write operation

Rodrigo Vivi rodrigo.vivi at intel.com
Thu Oct 5 15:13:34 UTC 2023


On Thu, Oct 05, 2023 at 03:05:31AM -0400, Kahola, Mika wrote:
> > -----Original Message-----
> > From: Vivi, Rodrigo <rodrigo.vivi at intel.com>
> > Sent: Wednesday, October 4, 2023 3:56 PM
> > To: Kahola, Mika <mika.kahola at intel.com>
> > Cc: intel-gfx at lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH] drm/i915/display: Reset message bus after each read/write operation
> > 
> > On Wed, Oct 04, 2023 at 01:25:04PM +0300, Mika Kahola wrote:
> > > Every know and then we receive the following error when running for
> > > example IGT test kms_flip.
> > >
> > > [drm] *ERROR* PHY G Read 0d80 failed after 3 retries.
> > > [drm] *ERROR* PHY G Write 0d81 failed after 3 retries.
> > >
> > > Since the error is sporadic in nature, the patch proposes to reset the
> > > message bus after every successful or unsuccessful read or write
> > > operation. However, testing revealed that this alone is not sufficient
> > > method an additiona delay is also introduces anything from 200us to
> > > 300us. This delay is experimental value and has no specification to
> > > back it up.
> > 
> > have you tried the delays without the bus_reset?
> Yes, we have bumped up the delay, first from 0x100 to 0x200 and then as per 
> BSpec change 0xa000 and I have tried 0xf000. Increasing the timeout reduces
> the frequency of this error but doesn't solve this issue.

what is exactly this BSPec's 0xa000? where can I see it? So maybe you can
update the message above removing the 'no specification to back it up'.

Oh, and my english is bad, but it looks to me that 'empirical' might
sound better than 'experimental' for this case, since you really did
a lot of experiments before coming to this final conclusion.

> 
> > have you talked to hw architects about this?
> Yes, HW guys requested traces which I provided but based on these the sequence we use in i915
> is correct.
> 
> > 
> > I wonder if we should add the delay inside the bus_reset itself?
> > although the bit 15 clear check should be enough by itself and it doesn't look like it is a hw/fw reset involved to justify the extra
> > delay.
> That should be enough. To me, it looks like when reading/writing to the bus maybe too fast, the hw cannot handle that and we need
> to reset and let things settle down before trying again.
> 
> > 
> > well, at least some /* FIXME: */ or /* XXX: */ comments is desired along with the messages if we are going with this hack without
> > understanding why...
> True, I will add these the the patch.
> 
> Thanks for review!
> 
> -Mika-
> > 
> > >
> > > Signed-off-by: Mika Kahola <mika.kahola at intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/display/intel_cx0_phy.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy.c
> > > b/drivers/gpu/drm/i915/display/intel_cx0_phy.c
> > > index abd607b564f1..a71b8a29d6b0 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_cx0_phy.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_cx0_phy.c
> > > @@ -220,9 +220,12 @@ static u8 __intel_cx0_read(struct drm_i915_private *i915, enum port port,
> > >  	/* 3 tries is assumed to be enough to read successfully */
> > >  	for (i = 0; i < 3; i++) {
> > >  		status = __intel_cx0_read_once(i915, port, lane, addr);
> > > +		intel_cx0_bus_reset(i915, port, lane);
> > >
> > >  		if (status >= 0)
> > >  			return status;
> > > +
> > > +		usleep_range(200, 300);
> > >  	}
> > >
> > >  	drm_err_once(&i915->drm, "PHY %c Read %04x failed after %d
> > > retries.\n", @@ -299,9 +302,12 @@ static void __intel_cx0_write(struct drm_i915_private *i915, enum port port,
> > >  	/* 3 tries is assumed to be enough to write successfully */
> > >  	for (i = 0; i < 3; i++) {
> > >  		status = __intel_cx0_write_once(i915, port, lane, addr, data,
> > > committed);
> > > +		intel_cx0_bus_reset(i915, port, lane);
> > >
> > >  		if (status == 0)
> > >  			return;
> > > +
> > > +		usleep_range(200, 300);
> > >  	}
> > >
> > >  	drm_err_once(&i915->drm,
> > > --
> > > 2.34.1
> > >


More information about the Intel-gfx mailing list