[PATCH] drm/amd: Don't wake dGPUs while reading sensors

Mario Limonciello superm1 at kernel.org
Mon Aug 26 00:19:28 UTC 2024



On 8/25/24 15:40, Luna Nova wrote:
> Raised this as an issue a while back on the bug tracker and it got closed as WONTFIX. https://gitlab.freedesktop.org/drm/amd/-/issues/2229
> Been running a patched kernel with a similar patch locally ever since because even figuring out everything on the system that's accidentally waking the GPU was too time consuming.
> 
> I'd love it if this gets accepted.
> I think waking the device to ask how much power it is using, and thereby increasing its power usage, fundamentally makes no sense: by trying to measure it we change it. So if power can't be measured while the device is off, it only makes sense to return an error. The same applies to the other sensors that currently wake the GPU; most of them change the very property they measure by waking it.
> 
> Because this behavior is odd, and because on single-GPU systems it's not obvious that anything is going wrong, app and lib devs are likely to keep making this "mistake" forever.
> 
> Luna

So FWIW I did file a v2 [1] that "undoes" the debugfs changes.

[1] https://lore.kernel.org/amd-gfx/20240823145527.150749-1-mario.limonciello@amd.com/

If there is too much pushback against returning an error code, another
option is to return "0" in this case. That makes "sense" for some sysfs
files, specifically when the device is in D3cold. For D3hot, and for
some sysfs files, however, it isn't fully true.
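
For anyone reading these files from userspace, here is a rough sketch of how
a monitoring tool could cope with either outcome (an error or a "0") without
waking the GPU itself. It is only an illustration, not part of the patch: the
helper names are made up, and it assumes the device exposes the generic
power/runtime_status attribute and that the sensor file holds a plain integer.

```python
from pathlib import Path
from typing import Optional


def runtime_status(device_dir: Path) -> str:
    """Return the device's runtime PM status, or "unknown" if unreadable."""
    try:
        return (device_dir / "power" / "runtime_status").read_text().strip()
    except OSError:
        return "unknown"


def read_sensor_if_awake(device_dir: Path, sensor_file: Path) -> Optional[int]:
    """Read an integer sensor value without touching a sleeping device.

    Hypothetical helper: returns None when the device reports itself as
    suspended (or suspending), or when the read fails, so the caller never
    has to guess whether a "0" means "idle" or "asleep".
    """
    if runtime_status(device_dir) in ("suspended", "suspending"):
        return None  # skip the read entirely; do not wake the GPU
    try:
        return int(sensor_file.read_text().strip())
    except (OSError, ValueError):
        # Covers the "return an error" behaviour as well as odd contents.
        return None
```

A caller might pass something like /sys/class/drm/card0/device and a
power1_average file under its hwmon directory, but those paths are examples
only and differ between systems. There is also an unavoidable race between
the status check and the read, so the check reduces accidental wakeups
rather than ruling them out.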

