gpu_metrics does not provide 'current_gfxclk', 'current_uclk', 'average_cpu_power' & 'temperature_core' on AMD Ryzen 7000 CPU

Quan, Evan Evan.Quan at amd.com
Fri Feb 10 03:30:56 UTC 2023


[AMD Official Use Only - General]

For some members, "0" is a valid value.
Thus "0xffff" is used instead to indicate that the output is invalid/unsupported.
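
As a rough illustration of this convention for anyone consuming the file (a sketch in Python, not the driver or Mangohud code): the helper names read_u16_metric and pick_gfxclk are made up for the example, and it assumes both clock fields are uint16_t members, so that an unpopulated one reads back as 0xFFFF.

```python
from typing import Optional

# Value seen in uint16_t members the driver never populated, because the
# structure is pre-filled with 0xFF bytes before supported fields are set.
UINT16_UNSUPPORTED = 0xFFFF


def read_u16_metric(raw: int) -> Optional[int]:
    """Return the metric, or None if the field is invalid/unsupported.

    Hypothetical helper: 0 is passed through because it can be a valid reading.
    """
    if raw == UINT16_UNSUPPORTED:
        return None
    return raw


def pick_gfxclk(current_gfxclk: int, average_gfxclk_frequency: int) -> Optional[int]:
    """Prefer current_gfxclk; fall back to average_gfxclk_frequency."""
    current = read_u16_metric(current_gfxclk)
    if current is not None:
        return current
    return read_u16_metric(average_gfxclk_frequency)
```

With illustrative inputs, pick_gfxclk(0xFFFF, 400) falls back to the average field, while pick_gfxclk(0, 400) keeps 0 as a real reading.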

BR
Evan
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
> sfrcorne
> Sent: Wednesday, February 8, 2023 7:12 AM
> To: Alex Deucher <alexdeucher at gmail.com>
> Cc: amd-gfx at lists.freedesktop.org
> Subject: Re: gpu_metrics does not provide 'current_gfxclk', 'current_uclk',
> 'average_cpu_power' & 'temperature_core' on AMD Ryzen 7000 CPU
> 
> Dear Alex,
> 
> If current_gfxclk is not supported for my CPU, then using
> average_gfxclk_frequency instead is indeed the best solution in my opinion.
> I will try to get a fix merged for my CPU in Mangohud.
> 
> On a side note: you mentioned that unsupported fields would be 0, but I
> don't think this is correct. In the Linux kernel driver there is a line of code
> that first sets every byte to 0xFF with memset() and then populates the
> supported fields.
> 
> See
> "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c#L999":
> memset(header, 0xFF, structure_size);
> 
> The value of the unsupported uint16_t fields thus should be 0xFFFF (or 65535
> in decimal). This is also what I get when reading the gpu_metrics file. I just
> wanted to mention this in case someone reads this in the Archive.
> 
> Anyway, thank you for your help!
> 
> Kind regards,
> sfrcorne
> 
> ------- Original Message -------
> On Tuesday, February 7th, 2023 at 05:05, Alex Deucher
> <alexdeucher at gmail.com> wrote:
> 
> 
> > On Mon, Feb 6, 2023 at 5:48 PM sfrcorne sfrcorne at protonmail.com wrote:
> >
> > > Dear Alex,
> > >
> > > First of all, thank you for your response. Personally, I use a Ryzen 5 7600X;
> > > however, people with a Ryzen 9 7900X are also reporting this issue. The
> > > relevant bug report in Mangohud can be found here:
> > > "https://github.com/flightlessmango/MangoHud/issues/868".
> > >
> > > I looked around a bit in both the Mangohud source code and the Linux
> > > kernel source code.
> > >
> > > (Mangohud source): From what I understand, Mangohud looks for a file
> > > "/sys/class/drm/card*/device/gpu_metrics". If this file exists (and it does
> > > exist on my machine), it tries to read this file and extract the relevant GPU
> > > data (and in case of an APU also the CPU data) from it (these are the values I
> > > was talking about in my previous mail). When the file
> > > "/sys/class/drm/card*/device/gpu_metrics" exists, it will not use the data
> > > provided by hwmon (/sys/class/hwmon/hwmon*/*).
> > >
> > > (Linux kernel): The gpu_metrics file contains different data, depending
> > > on what version is used. All valid versions can be found in the source code:
> > > "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/include/kgd_pp_interface.h#L725".
> > > For my CPU/APU the 'gpu_metrics_v2_1' structure is used (I tested this by
> > > reading the gpu_metrics file myself). Furthermore, I think that for my case,
> > > this structure is set by the function
> > > "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c#L459",
> > > but I am not completely sure about this.
> >
> >
> > The metrics provided by the SMU firmware vary from asic to asic.
> > For things that are not supported by the metrics table for a
> > particular asic, those fields would be 0. You can see what metrics are
> > supported for your asic in smu_v13_0_5_get_gpu_metrics() as that
> > function populates the supported fields from the firmware to the
> > common structure. current_gfxclk is not supported in your asic, but
> > average_gfxclk_frequency is. So you'd want to use whichever field is
> > available for a particular asic in Mangohud.
> >
> > > Lastly, I am not familiar with umr. I assume that you are referring to
> > > "https://gitlab.freedesktop.org/tomstdenis/umr"? If I find some time this
> > > weekend, then I will look into this some more.
> >
> >
> > Yes, that is the right link. umr uses the same interface as mangohud,
> > so you should see the same data.
> >
> > Alex
> >
> > > Kind regards,
> > > sfrcorne
> > >
> > > ------- Original Message -------
> > > On Monday, February 6th, 2023 at 22:22, Alex Deucher alexdeucher at gmail.com wrote:
> > >
> > > > On Mon, Feb 6, 2023 at 9:22 AM sfrcorne sfrcorne at protonmail.com wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I hope this is the correct place to ask my question. I was not sure if I
> > > > > should have opened a new issue on Gitlab or sent an email here, since I
> > > > > don't know whether this is a bug or intended behaviour.
> > > > >
> > > > > The question is about the new AMD Ryzen 7000 CPUs. These new
> > > > > CPUs have an iGPU and consequently provide a gpu_metrics file for
> > > > > monitoring the GPU/CPU (APU?). This file is used by programs like
> > > > > Mangohud that try to read (among other values) the following 4 values:
> > > > > - current_gfxclk
> > > > > - current_uclk
> > > > > - average_cpu_power
> > > > > - temperature_core
> > > > > However, it appears that on AMD Ryzen 7000 CPUs these 4 values are
> > > > > not provided/updated in the gpu_metrics file. Other values like
> > > > > 'average_core_power', 'temperature_l3' and the other 'current_<x>clk' are
> > > > > also not provided/updated, but these are not used by Mangohud at the
> > > > > moment.
> > > > >
> > > > > Is this intentional or a bug? And will this be fixed and/or will support for
> > > > > these 4 values be added in the future?
> > > >
> > > > What specific CPU/APU is this? I don't recall offhand how
> > > > mangohud queries this stuff, but you can take a look at the hwmon
> > > > interfaces exposed by the driver or, if you want the whole metrics
> > > > table, you can use umr to fetch and decode it via the kernel
> > > > interface. That will allow you to verify that the firmware is producing
> > > > the proper data.
> > > >
> > > > Alex
