[Intel-gfx] [PATCH] drm/i915/selftests: Measure the energy consumed while in RC6

Andi Shyti andi at etezian.org
Wed Mar 25 08:58:54 UTC 2020


Hi Chris,

On Wed, Mar 25, 2020 at 08:10:56AM +0000, Chris Wilson wrote:
> Measure and compare the energy consumed, as reported by the RAPL MSR,
> by the GPU while in RC0 and RC6 states. Throw an error if RC6 does not
> at least halve the energy consumption of RC0, as this more than likely
> means we failed to enter RC6 correctly.
> 
> If we can't measure the energy draw with the MSR, then it will report 0
> for both measurements. Since the measurement works on all gen6+, this seems
> worth flagging as an error.
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> Cc: Andi Shyti <andi.shyti at intel.com>

It would be nice to have a revision history, given that I have
received quite a few versions of this patch.

> +static u64 energy_uJ(struct intel_rc6 *rc6)
> +{
> +	unsigned long long power;
> +	u32 units;
> +
> +	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power))
> +		return 0;
> +
> +	units = (power & 0x1f00) >> 8;
> +
> +	if (rdmsrl_safe(MSR_PP1_ENERGY_STATUS, &power))
> +		return 0;
> +
> +	return (1000000 * power) >> units; /* convert to uJ */
> +}

shall we put this in a library?
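
As a rough idea of what a shared helper could look like, here is a
user-space sketch of just the unit conversion, kept separate from the
MSR reads so it can be checked on its own. The name rapl_raw_to_uJ and
the two macros are hypothetical, not existing kernel symbols, and the
sketch assumes the raw counter value fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 12:8 of the power unit MSR hold the energy status unit (ESU),
 * matching the (power & 0x1f00) >> 8 in the patch: one counter tick
 * is 1 / 2^ESU joules. Macro names are hypothetical. */
#define RAPL_ENERGY_UNIT_MASK  0x1f00u
#define RAPL_ENERGY_UNIT_SHIFT 8

/* Hypothetical helper: convert a raw energy counter value to uJ. */
static uint64_t rapl_raw_to_uJ(uint64_t raw_energy, uint64_t power_unit_msr)
{
	uint32_t esu = (power_unit_msr & RAPL_ENERGY_UNIT_MASK) >>
		       RAPL_ENERGY_UNIT_SHIFT;

	/* Assumption: the counter is at most 32 bits wide, so multiplying
	 * by 1000000 (< 2^20) cannot overflow 64 bits. */
	assert(raw_energy <= UINT32_MAX);

	return (1000000ull * raw_energy) >> esu;
}
```

With an example power unit value of 0xA0E03 (ESU = 14), a raw count of
16384 ticks is exactly 1 J, so rapl_raw_to_uJ(16384, 0xA0E03) gives
1000000.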

>  	res[0] = rc6_residency(rc6);
> +	dt = ktime_get();
> +	rc0_power = energy_uJ(rc6);
>  	msleep(250);
> +	rc0_power = energy_uJ(rc6) - rc0_power;
> +	dt = ktime_sub(ktime_get(), dt);
>  	res[1] = rc6_residency(rc6);
>  	if ((res[1] - res[0]) >> 10) {
>  		pr_err("RC6 residency increased by %lldus while disabled for 250ms!\n",
> @@ -63,13 +85,23 @@ int live_rc6_manual(void *arg)
>  		goto out_unlock;
>  	}
>  
> +	rc0_power = div64_u64(NSEC_PER_SEC * rc0_power, ktime_to_ns(dt));
> +	if (!rc0_power) {

is this likely to happen?
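
For reference, a user-space sketch of the same arithmetic, to see when
the quotient can be zero. The function name average_power_uW is
hypothetical, and plain 64-bit division stands in for div64_u64.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/* Hypothetical helper: average power in uW from an energy delta in uJ
 * and the elapsed time in ns. */
static uint64_t average_power_uW(uint64_t energy_uJ, uint64_t elapsed_ns)
{
	assert(elapsed_ns != 0);

	/* uJ * (ns per s) / ns = uJ per s = uW */
	return (NSEC_PER_SEC * energy_uJ) / elapsed_ns;
}
```

For a 250ms window, 50000uJ gives 200000uW and even a single uJ gives
4uW, so with a window under one second the result is zero only when the
energy delta itself is zero, e.g. when both energy_uJ() calls returned
0 because the MSR could not be read.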

>  	res[0] = rc6_residency(rc6);
> +	dt = ktime_get();
> +	rc6_power = energy_uJ(rc6);
>  	msleep(100);
> +	rc6_power = energy_uJ(rc6) - rc6_power;
> +	dt = ktime_sub(ktime_get(), dt);
>  	res[1] = rc6_residency(rc6);
> -
>  	if (res[1] == res[0]) {
>  		pr_err("Did not enter RC6! RC6_STATE=%08x, RC6_CONTROL=%08x, residency=%lld\n",
>  		       intel_uncore_read_fw(gt->uncore, GEN6_RC_STATE),
> @@ -78,6 +110,15 @@ int live_rc6_manual(void *arg)
>  		err = -EINVAL;
>  	}
>  
> +	rc6_power = div64_u64(NSEC_PER_SEC * rc6_power, ktime_to_ns(dt));
> +	pr_info("GPU consumed %llduW in RC0 and %llduW in RC6\n",
> +		rc0_power, rc6_power);
> +	if (2 * rc6_power > rc0_power) {
> +		pr_err("GPU leaked energy while in RC6!\n");
> +		err = -EINVAL;
> +		goto out_unlock;
> +	}

nice,

Reviewed-by: Andi Shyti <andi.shyti at intel.com>

Thanks,
Andi

