[Intel-gfx] [PATCH 4/4] drm/i915: Use czclk_freq in vlv c0 residency calculations

Ville Syrjälä ville.syrjala at linux.intel.com
Tue Sep 29 05:29:21 PDT 2015


On Mon, Sep 28, 2015 at 11:47:15PM +0300, Imre Deak wrote:
> On Thu, 2015-09-24 at 23:29 +0300, ville.syrjala at linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > 
> > Replace the use of mem_freq/4 with czclk_freq in the vlv c0 residency
> > calculations.
> > 
> > Also deal with VLV_COUNT_RANGE_HIGH which affects all RCx residency
> > counters. We have just enough bits to do this without intermediate
> > divisions.
> > 
> > Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_irq.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 07c87e0..d78ef64 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -998,12 +998,16 @@ static bool vlv_c0_above(struct drm_i915_private *dev_priv,
> >  			 int threshold)
> >  {
> >  	u64 time, c0;
> > +	unsigned int mul = 100;
> >  
> >  	if (old->cz_clock == 0)
> >  		return false;
> >  
> > +	if (I915_READ(VLV_COUNTER_CONTROL) & VLV_COUNT_RANGE_HIGH)
> > +		mul <<= 8;
> 
> Could've been a separate patch.
> 
> > +
> >  	time = now->cz_clock - old->cz_clock;
> > -	time *= threshold * dev_priv->mem_freq;
> > +	time *= threshold * dev_priv->czclk_freq;
> 
> Not introduced in this patch, but the above doesn't look correct to me.
> Time is cycles _divided_ by frequency, so imo the above should be either
> a division, or better we should calculate c0 (10ns) cycles here.

I think it's correct. The division has just been moved over to the other
side. So what we want to check is:

threshold * (czts - czts_old)     mul * (c0 - c0_old)
----------------------------- <= --------------------
     cz_to_milli_sec                  czclk_freq

Or actually, maybe it's better to think of it as

            (czts - czts_old) * czclk_freq
threshold * ------------------------------  <= mul * (c0 - c0_old)
                   cz_to_milli_sec

The fact that the "cz" timestamp is not in cz clock units forces us to
do this silly conversion. I have no idea why Punit wants to give out the
timestamp in some normalized units. If it gave us the raw cz clock
timestamp instead, we could just do
"threshold * (czts - czts_old) <= mul * (c0 - c0_old)"

So yeah, another case of the hardware (well, Punit firmware in this case
I suppose) being "helpful" :(

I think I even tried looking for a raw cz timestamp register so that we
could avoid this mess, but I couldn't find one.
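
To make the cross-multiplied form concrete, here is a minimal standalone
sketch of the comparison. It is not the driver code: the function name,
the parameter list, and the unit of czclk_freq (assumed kHz) are
illustrative assumptions, and the register read is replaced by a plain
range_high flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical standalone version of the check; names and parameters
 * are illustrative, not the driver's.
 *
 * czts_delta:      change in the Punit "cz" timestamp (normalized units)
 * c0_delta:        change in the summed render + media C0 counters
 * threshold_pct:   busyness threshold in percent
 * czclk_freq:      cz clock frequency (assumed to be in kHz here)
 * cz_to_milli_sec: timestamp units per millisecond
 * range_high:      stands in for the VLV_COUNT_RANGE_HIGH bit
 */
static bool c0_above(uint64_t czts_delta, uint64_t c0_delta,
		     unsigned int threshold_pct, unsigned int czclk_freq,
		     unsigned int cz_to_milli_sec, bool range_high)
{
	uint64_t mul = 100;	/* threshold is a percentage */
	uint64_t time, c0;

	assert(threshold_pct <= 100);

	/* Assumption: in high range each count stands for 256 cz clocks. */
	if (range_high)
		mul <<= 8;

	/*
	 * Both sides of the inequality multiplied through, so that no
	 * division is needed:
	 * threshold * czts_delta * czclk_freq <=
	 *	mul * c0_delta * cz_to_milli_sec
	 */
	time = czts_delta * threshold_pct * czclk_freq;
	c0 = c0_delta * mul * cz_to_milli_sec;

	return c0 >= time;
}
```

As a worked case with made-up numbers: for czclk_freq = 200000 and
cz_to_milli_sec = 100000, a czts_delta of 100000 is 1 ms, i.e. 200000 cz
clocks. A c0_delta of 100000 is then 50% residency, so the check passes
for threshold 40 and fails for threshold 60.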
 
> >  
> >  	/* Workload can be split between render + media, e.g. SwapBuffers
> >  	 * being blitted in X after being rendered in mesa. To account for
> > @@ -1011,7 +1015,7 @@ static bool vlv_c0_above(struct drm_i915_private *dev_priv,
> >  	 */
> >  	c0 = now->render_c0 - old->render_c0;
> >  	c0 += now->media_c0 - old->media_c0;
> > -	c0 *= 100 * VLV_CZ_CLOCK_TO_MILLI_SEC * 4 / 1000;
> > +	c0 *= mul * VLV_CZ_CLOCK_TO_MILLI_SEC;
> 
> Based on the above this would need to be fixed too.
> 
> The above can be done as a follow-up if needed; this patch does what it
> says, so:
> Reviewed-by: Imre Deak <imre.deak at intel.com>
> 
> >  
> >  	return c0 >= time;
> >  }
> 

-- 
Ville Syrjälä
Intel OTC

