[Intel-gfx] [PATCH 2/2] igt/pm_rps: Add checks for freq = idle (RPn) in specific cases.

Imre Deak imre.deak at intel.com
Fri Dec 4 07:22:11 PST 2015


On to, 2015-12-03 at 16:43 -0800, Bob Paauwe wrote:
> On Tue, 1 Dec 2015 19:43:05 +0200
> Imre Deak <imre.deak at intel.com> wrote:
> 
> > On ti, 2015-12-01 at 09:22 -0800, Bob Paauwe wrote:
> > > On Tue, 1 Dec 2015 15:56:55 +0200
> > > Imre Deak <imre.deak at intel.com> wrote:
> > > 
> > > > On ma, 2015-11-30 at 16:23 -0800, Bob Paauwe wrote:
> > > > > Now that the frequency can drop below the user specified
> > > > > minimum
> > > > > when
> > > > > the gpu is idle, add some checking to verify that it really
> > > > > does
> > > > > drop
> > > > > down to the RPn frequency in specific cases.
> > > > > 
> > > > > To do this, modify the main test flow to take two 'check'
> > > > > routines
> > > > > instead of one. When running the config-min-max-idle subtest
> > > > > make
> > > > > use of the second check routine to verify that the frequency
> > > > > drops
> > > > > to RPn instead of simply <= user min frequency.  For all
> > > > > other
> > > > > subtests, use the original check routines for both checks.
> > > > > 
> > > > > Signed-off-by: Bob Paauwe <bob.j.paauwe at intel.com>
> > > > 
> > > > I don't see the point. The frequency should always drop to the
> > > > idle
> > > > frequency if the GPU is idle, it's not enough that it's below
> > > > MIN-
> > > > freq. 
> > > > So why do we need separate CUR-freq<=MIN-freq and CUR-
> > > > freq==idle-
> > > > freq
> > > > checks?
> > > 
> > > I started from the premise that it should always drop to idle but
> > > that's just not the case.
> > 
> > It is the correct premise and if it doesn't hold then there is a
> > real
> > bug either in the testcase or the kernel which needs to be
> > addressed
> > differently. I haven't found anything concrete but one suspect is
> > the
> > logic that waits for the GPU idle condition, maybe the timeout
> > value idle_check() or the hard-coded duration in do_load_gpu() is
> > incorrect. In any case let's not paper over this issue, the very
> > reason
> > we have test cases is to uncover such bugs.
> > 
> > > The min_max_config() function is used for
> > > both the idle and loaded subtests.  If you try to check for
> > > freq=idle
> > > when doing the loaded subtest it will fail since it never goes
> > > idle.
> > > Even in the idle subtest there are cases where it doesn't drop
> > > down
> > > to
> > > idle. 
> > 
> > The only place we should check for freq==idle is in idle_check()
> > and
> > that one is not called during the loaded subtest, it wouldn't even
> > make
> > sense to do so. So I'm not sure how this subtest fails for you. 
> > 
> > > I suppose I could duplicate min_max_config and have a
> > > min_max_config_idle() and min_max_conifg_not_idle() for use by
> > > the
> > > different subtests. 
> > 
> > The point of the "check" argument of min_max_config() is to
> > distinguish
> > between the idle and loaded cases. The check functions passed in
> > know
> > already if they can expect the frequency to reach the idle
> > frequency
> > or not.
> > 
> > > The real problem is that the test was not designed to handle the
> > > case
> > > where the freq could drop below min and probably needs to be
> > > re-designed.  I've been trying to find a quick fix vs. a complete
> > > overhaul as this isn't really a priority for me.
> > 
> > I think we only need your first patch and if there is any failure
> > after
> > that we have to root cause that and fix it properly.
> > 
> > --Imre
> 
> You're right.  I'm working with BXT and it seems like it's got some
> unique issues with RPS.  I just sent a new patch based on the first
> one
> but with the changes you suggested to check for ==idle instead of
> <=min.
> 
> Maybe you have some insights into what I'm seeing with BXT?  The first
> problem I had was that I would see threshold up interrupts but not any
> threshold down interrupts.  The missing down interrupts was related to
> the BIOS setting.  I had runtime PM disabled so it seems strange that I
> was getting the up interrupts. 

Yes, I saw this too. I wonder if we could check this somehow and not
enable RPS if BIOS hasn't set things up properly. Sagar has a patch
that checks the RC6 setup [1]; that's not exactly RPS, but since they
are setup at the same place I think we could use that for now for RPS
too.

> With the BIOS setting corrected, the driver started disabling the down
> interrupts once the freq dropped to min or just below because of the min
> vs. idle difference.  I have a patch for this that I'll send shortly.

Hm, that's not necessarily a problem. To reach the idle frequency we
don't depend on the up/down interrupts. The idle frequency gets set
explicitly from intel_mark_idle(), so if you don't reach the idle freq
then this function doesn't get called for some reason. Or as I said
earlier we just don't wait enough in do_load_gpu() or idle_check() in
the test, so we read out cur-freq before intel_mark_idle() would get
called.

> The remaining issue seems to be some type of timing issue. I've had
> to
> add a 35000us sleep between updating the interrupt enable register
> (0xA168) and the posting read of freq. register (0xA008), otherwise
> it
> acts like the change to the interrupt enable register never happened.
> None of the documentation I have indicates that this is needed.

What happens exactly? The frequency gets stuck above min-freq and you
never get a down interrupt? Not sure how this could happen, but there
is an interesting comment in intel_rps_limits() about this scenario.

--Imre

[1] http://lists.freedesktop.org/archives/intel-gfx/2015-November/07938
6.html


> 
> Bob
> 
> 
> > 
> > > 
> > > > 
> > > > > ---
> > > > >  tests/pm_rps.c | 47 ++++++++++++++++++++++++++++++++++----
> > > > > ----
> > > > > -----
> > > > >  1 file changed, 34 insertions(+), 13 deletions(-)
> > > > > 
> > > > > diff --git a/tests/pm_rps.c b/tests/pm_rps.c
> > > > > index f919625..96225ce 100644
> > > > > --- a/tests/pm_rps.c
> > > > > +++ b/tests/pm_rps.c
> > > > > @@ -364,7 +364,7 @@ static int get_hw_rounded_freq(int
> > > > > target)
> > > > >  	return ret;
> > > > >  }
> > > > >  
> > > > > -static void min_max_config(void (*check)(void), bool
> > > > > load_gpu)
> > > > > +static void min_max_config(void (*check)(void), void
> > > > > (*check2)(void), bool load_gpu)
> > > > >  {
> > > > >  	int fmid = (origfreqs[RPn] + origfreqs[RP0]) / 2;
> > > > >  
> > > > > @@ -384,7 +384,7 @@ static void min_max_config(void
> > > > > (*check)(void),
> > > > > bool load_gpu)
> > > > >  	writeval(stuff[MAX].filp, origfreqs[RP0]);
> > > > >  	if (load_gpu)
> > > > >  		do_load_gpu();
> > > > > -	check();
> > > > > +	check2();
> > > > >  
> > > > >  	igt_debug("\nIncrease min to midpoint...\n");
> > > > >  	writeval(stuff[MIN].filp, fmid);
> > > > > @@ -412,7 +412,7 @@ static void min_max_config(void
> > > > > (*check)(void),
> > > > > bool load_gpu)
> > > > >  	writeval(stuff[MIN].filp, origfreqs[RPn]);
> > > > >  	if (load_gpu)
> > > > >  		do_load_gpu();
> > > > > -	check();
> > > > > +	check2();
> > > > >  
> > > > >  	igt_debug("\nDecrease min below RPn
> > > > > (invalid)...\n");
> > > > >  	writeval_inval(stuff[MIN].filp, 0);
> > > > > @@ -420,11 +420,11 @@ static void min_max_config(void
> > > > > (*check)(void),
> > > > > bool load_gpu)
> > > > >  
> > > > >  	igt_debug("\nDecrease max to midpoint...\n");
> > > > >  	writeval(stuff[MAX].filp, fmid);
> > > > > -	check();
> > > > > +	check2();
> > > > >  
> > > > >  	igt_debug("\nDecrease max to RPn...\n");
> > > > >  	writeval(stuff[MAX].filp, origfreqs[RPn]);
> > > > > -	check();
> > > > > +	check2();
> > > > >  
> > > > >  	igt_debug("\nDecrease max below RPn
> > > > > (invalid)...\n");
> > > > >  	writeval_inval(stuff[MAX].filp, 0);
> > > > > @@ -436,11 +436,11 @@ static void min_max_config(void
> > > > > (*check)(void),
> > > > > bool load_gpu)
> > > > >  
> > > > >  	igt_debug("\nIncrease max to midpoint...\n");
> > > > >  	writeval(stuff[MAX].filp, fmid);
> > > > > -	check();
> > > > > +	check2();
> > > > >  
> > > > >  	igt_debug("\nIncrease max to RP0...\n");
> > > > >  	writeval(stuff[MAX].filp, origfreqs[RP0]);
> > > > > -	check();
> > > > > +	check2();
> > > > >  
> > > > >  	igt_debug("\nIncrease max above RP0
> > > > > (invalid)...\n");
> > > > >  	writeval_inval(stuff[MAX].filp, origfreqs[RP0] +
> > > > > 1000);
> > > > > @@ -461,7 +461,7 @@ static void basic_check(void)
> > > > >  
> > > > >  #define IDLE_WAIT_TIMESTEP_MSEC 100
> > > > >  #define IDLE_WAIT_TIMEOUT_MSEC 10000
> > > > > -static void idle_check(void)
> > > > > +static void idle_check_min(void)
> > > > >  {
> > > > >  	int freqs[NUMFREQ];
> > > > >  	int wait = 0;
> > > > > @@ -482,6 +482,27 @@ static void idle_check(void)
> > > > >  	igt_debug("Required %d msec to reach cur<=min\n",
> > > > > wait);
> > > > >  }
> > > > >  
> > > > > +static void idle_check_idle(void)
> > > > > +{
> > > > > +	int freqs[NUMFREQ];
> > > > > +	int wait = 0;
> > > > > +
> > > > > +	/* Monitor frequencies until cur settles down to
> > > > > min,
> > > > > which
> > > > > should
> > > > > +	 * happen within the allotted time */
> > > > > +	do {
> > > > > +		read_freqs(freqs);
> > > > > +		dump(freqs);
> > > > > +		checkit(freqs);
> > > > > +		if (freqs[CUR] == freqs[RPn])
> > > > > +			break;
> > > > > +		usleep(1000 * IDLE_WAIT_TIMESTEP_MSEC);
> > > > > +		wait += IDLE_WAIT_TIMESTEP_MSEC;
> > > > > +	} while (wait < IDLE_WAIT_TIMEOUT_MSEC);
> > > > > +
> > > > > +	igt_assert_eq(freqs[CUR], freqs[RPn]);
> > > > > +	igt_debug("Required %d msec to reach cur=idle\n",
> > > > > wait);
> > > > > +}
> > > > > +
> > > > >  #define LOADED_WAIT_TIMESTEP_MSEC 100
> > > > >  #define LOADED_WAIT_TIMEOUT_MSEC 3000
> > > > >  static void loaded_check(void)
> > > > > @@ -577,7 +598,7 @@ static void reset(void)
> > > > >  
> > > > >  	igt_debug("Removing load...\n");
> > > > >  	load_helper_stop();
> > > > > -	idle_check();
> > > > > +	idle_check_min();
> > > > >  }
> > > > >  
> > > > >  /*
> > > > > @@ -635,7 +656,7 @@ static void blocking(void)
> > > > >  	matchit(pre_freqs, post_freqs);
> > > > >  
> > > > >  	igt_debug("Removing load...\n");
> > > > > -	idle_check();
> > > > > +	idle_check_min();
> > > > >  }
> > > > >  
> > > > >  static void pm_rps_exit_handler(int sig)
> > > > > @@ -686,14 +707,14 @@ igt_main
> > > > >  	}
> > > > >  
> > > > >  	igt_subtest("basic-api")
> > > > > -		min_max_config(basic_check, false);
> > > > > +		min_max_config(basic_check, basic_check,
> > > > > false);
> > > > >  
> > > > >  	igt_subtest("min-max-config-idle")
> > > > > -		min_max_config(idle_check, true);
> > > > > +		min_max_config(idle_check_min,
> > > > > idle_check_idle,
> > > > > true);
> > > > >  
> > > > >  	igt_subtest("min-max-config-loaded") {
> > > > >  		load_helper_run(HIGH);
> > > > > -		min_max_config(loaded_check, false);
> > > > > +		min_max_config(loaded_check, loaded_check,
> > > > > false);
> > > > >  		load_helper_stop();
> > > > >  	}
> > > > >  
> > > 
> > > 
> > > 
> 
> 
> 


More information about the Intel-gfx mailing list