[Intel-gfx] [PATCH] x86: Downgrade clock throttling thermal event critical error

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Oct 10 11:59:59 UTC 2018


On 09/10/2018 12:37, Chris Wilson wrote:
> Under CI testing, it is common for the cpus to overheat with the
> continuous workloads and end up being throttled. As the cpus still
> function, it is less of a critical error meriting urgent action, but an
> expected yet significant condition (pr_notice).
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Petri Latvala <petri.latvala at intel.com>
> ---
>   arch/x86/kernel/cpu/mcheck/therm_throt.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> index 2da67b70ba98..bc57b5988589 100644
> --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
> +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> @@ -184,10 +184,10 @@ static void therm_throt_process(bool new_event, int event, int level)
>   	/* if we just entered the thermal event */
>   	if (new_event) {
>   		if (event == THERMAL_THROTTLING_EVENT)
> -			pr_crit("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
> -				this_cpu,
> -				level == CORE_LEVEL ? "Core" : "Package",
> -				state->count);
> +			pr_notice("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
> +				  this_cpu,
> +				  level == CORE_LEVEL ? "Core" : "Package",
> +				  state->count);
>   		return;
>   	}
>   	if (old_event) {
> 

It even sounds like it wouldn't be far-fetched to argue that, these
days, notice is the correct log level for thermal throttling, unless
there are more sources of throttling messages. TBC when I get back to
my Skull Canyon; that one certainly logs something like this shortly
after invoking make -j8.
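
For reference, a rough userspace sketch of what the downgrade means
numerically. The severity values match the kernel's LOGLEVEL_*
definitions in include/linux/kern_levels.h, where a lower number is
more severe. The ci_flags_message() helper and its threshold are
hypothetical, only standing in for whatever filter a CI setup might
apply to dmesg; they are not taken from the patch.

```c
#include <stdbool.h>

/*
 * Severity numbers mirroring the kernel's LOGLEVEL_* values
 * (include/linux/kern_levels.h). Lower means more severe.
 */
enum loglevel {
	LOGLEVEL_EMERG   = 0,
	LOGLEVEL_ALERT   = 1,
	LOGLEVEL_CRIT    = 2,	/* what pr_crit() logs at */
	LOGLEVEL_ERR     = 3,
	LOGLEVEL_WARNING = 4,
	LOGLEVEL_NOTICE  = 5,	/* what pr_notice() logs at */
	LOGLEVEL_INFO    = 6,
	LOGLEVEL_DEBUG   = 7,
};

/*
 * Hypothetical CI filter (an assumption for illustration): flag a
 * message when it is at least as severe as the given threshold.
 */
static bool ci_flags_message(int msg_level, int threshold)
{
	return msg_level <= threshold;
}
```

With a threshold of LOGLEVEL_WARNING, such a filter would flag the
old pr_crit() message but let the pr_notice() one through.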

Regards,

Tvrtko

