[PATCH v1] drm/i915/selftest: Log throttle reasons on failure

Rodrigo Vivi rodrigo.vivi at intel.com
Tue Dec 10 22:38:51 UTC 2024


On Tue, Dec 10, 2024 at 10:53:10AM +0200, Raag Jadav wrote:
> On Mon, Dec 09, 2024 at 11:28:39AM -0500, Rodrigo Vivi wrote:
> > On Sat, Dec 07, 2024 at 08:14:42AM +0200, Raag Jadav wrote:
> > > Cc: Chris
> > > 
> > > On Fri, Dec 06, 2024 at 10:45:18AM -0500, Rodrigo Vivi wrote:
> > > > On Thu, Dec 05, 2024 at 01:44:13PM +0530, Raag Jadav wrote:
> > > > > Log throttle reasons on selftest failure which will be useful for
> > > > > debugging.
> > > > > 
> > > > > Signed-off-by: Raag Jadav <raag.jadav at intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/gt/selftest_rps.c | 7 +++++--
> > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c
> > > > > index dcef8d498919..1e0e59bc69b6 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/selftest_rps.c
> > > > > +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c
> > > > > @@ -478,8 +478,11 @@ int live_rps_control(void *arg)
> > > > >  			min, max, ktime_to_ns(min_dt), ktime_to_ns(max_dt));
> > > > >  
> > > > >  		if (limit == rps->min_freq) {
> > > > 
> > > > I was going to merge this, but then I noticed that this prints only
> > > > when the throttle moves the limit to our min_freq...  When PCODE throttles
> > > > the freq, the guaranteed freq can be at any point, not necessarily
> > > > at the minimum, so this print is not very effective at the end of the day
> > > 
> > > Makes me wonder why such a criterion at all?
> > 
> > very good question...
> > Perhaps we need to revamp this selftest entirely or kill it?
> 
> Depends. Do we qualify throttling as a failure?
> If yes, we'll keep hitting this every now and then.
> If no, then just dropping this condition might be enough.

Hmm, that will make CI angry... We can remove the condition and
then tone the message down from error to debug.
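
Roughly what I have in mind, as a user-space sketch only and not i915
code: report_throttle() and the mask value in the usage note are
hypothetical, and in the selftest the message would go through
pr_debug() instead of fprintf(). The point is that any throttle reason
gets logged, wherever the limit ended up, and none of them fails the test.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: log the active throttle reasons at debug level
 * and hand them back to the caller. It deliberately does not compare
 * the frequency limit against min_freq and does not produce an error,
 * since throttling is treated as normal operation here.
 */
static uint32_t report_throttle(const char *engine, uint32_t raw, uint32_t mask)
{
	uint32_t reasons = raw & mask;	/* keep only the reason bits */

	if (reasons)
		fprintf(stderr, "debug: %s: GPU throttled, reasons 0x%08" PRIx32 "\n",
			engine, reasons);

	return reasons;
}
```

With a made-up mask of 0xf, report_throttle("rcs0", 0x5, 0xf) logs and
returns 0x5, while report_throttle("rcs0", 0x10, 0xf) stays silent and
returns 0. Either way the caller leaves err untouched.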

But perhaps the test was written with the assumption in mind that
a throttle to the minimum is a catastrophic error, which I disagree with.

Throttle is normal operation, and it depends on many external
factors and many things that are out of our control and
that change from platform to platform.

> 
> Raag
> 
> > > > > -			pr_err("%s: GPU throttled to minimum!\n",
> > > > > -			       engine->name);
> > > > > +			u32 throttle = intel_uncore_read(gt->uncore,
> > > > > +							 intel_gt_perf_limit_reasons_reg(gt));
> > > > > +
> > > > > +			pr_err("%s: GPU throttled to minimum frequency with reasons 0x%08x\n",
> > > > > +			       engine->name, throttle & GT0_PERF_LIMIT_REASONS_MASK);
> > > > >  			show_pstate_limits(rps);
> > > > >  			err = -ENODEV;
> > > > >  			break;
> > > > > -- 
> > > > > 2.34.1
> > > > > 

