Bug Report: [PowerPlay] MCLK can't be set above 1107MHz on Vega 64

Quan, Evan Evan.Quan at amd.com
Tue May 7 06:12:00 UTC 2019


Hi Yanik,

I just sent out several patches (with you on the CC list), and I believe the first patch (raise SOCCLK with MCLK) may fix your issue.

Regards,
Evan
From: Yanik Yiannakis <yanik at yiannakis.de>
Sent: May 6, 2019 18:56
To: Quan, Evan <Evan.Quan at amd.com>; amd-gfx at lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher at amd.com>
Subject: Re: Bug Report: [PowerPlay] MCLK can't be set above 1107MHz on Vega 64

Hello Evan,

Yes, I always used that command to commit my changes. I also have amdgpu.ppfeaturemask=0xffffffff as a boot parameter, and I set power_dpm_force_performance_level to manual. Sorry for omitting that; I assumed it was evident.
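
For readers following the thread, the sequence described here (manual performance level, stage the memory state, then commit with "c") could be sketched roughly as below. The function name apply_mclk_od and the idea of passing the device directory as an argument are illustrative assumptions, not something taken from the driver or from the patches mentioned above; the three sysfs file names are the ones used in this thread.

```shell
#!/bin/sh
# Sketch only: apply one MCLK overdrive state and commit it.
# $1 = GPU sysfs device directory (assumed to be supplied by the caller)
# $2 = memory state index, $3 = clock in MHz, $4 = voltage in mV
apply_mclk_od() {
    dev=$1 state=$2 mhz=$3 mv=$4
    # Switch to the manual performance level, as described in the thread.
    echo "manual" > "$dev/power_dpm_force_performance_level" || return 1
    # Stage the new memory state in the form "m <state> <MHz> <mV>".
    echo "m $state $mhz $mv" > "$dev/pp_od_clk_voltage" || return 1
    # Commit the staged values so they take effect.
    echo "c" > "$dev/pp_od_clk_voltage" || return 1
}
```

It would be called with the full device path quoted elsewhere in this thread as the first argument, for example apply_mclk_od "$DEV" 3 1100 950 with DEV set to that path, and it needs root privileges to write to sysfs.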

I have heard that the MCLK can only be as high as the SOCCLK. That would make sense because the SOCCLK of my Vega 64 is 1107MHz in its highest state. I noticed that on Windows the SOCCLK is raised automatically if the user sets the MCLK high enough through Wattman.

To replicate this on Linux I manually edited the pp_table to change the MCLK to 1175MHz and the SOCCLK to 1180MHz. The new SOCCLK was displayed in pp_dpm_socclk and in Unigine Superposition the FPS increased as expected (compared to an MCLK of 1107MHz). As a final test I edited the pp_table to set the MCLK to 1220MHz (this was unstable on Windows) and the SOCCLK to 1250MHz. This resulted in a crash (just like on Windows) which indicates that the MCLK really was set to 1220MHz.

My understanding of the situation is that powerplay doesn't automatically raise the SOCCLK like Wattman.
It would be cool if the user had the ability to overclock the SOCCLK through powerplay.
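
If the hypothesis above holds (MCLK cannot usefully exceed the highest SOCCLK state), a quick pre-check before writing an overdrive value might look like the following sketch. The helper names top_mhz and mclk_fits_socclk are hypothetical, and the check only encodes the assumption made in this message, not confirmed driver behaviour.

```shell
#!/bin/sh
# Print the highest frequency (in MHz) listed in pp_dpm_*-style input,
# where each line looks like "3: 1100Mhz *".
top_mhz() {
    awk '{
        f = $2
        sub(/[Mm][Hh][Zz]$/, "", f)      # strip the unit suffix
        if (f + 0 > max) max = f + 0     # keep the largest value seen
    } END { print max + 0 }'
}

# Succeed if the requested MCLK (MHz, $1) does not exceed the top
# state listed in the given pp_dpm_socclk file ($2).
mclk_fits_socclk() {
    want=$1 soc_file=$2
    [ "$want" -le "$(top_mhz < "$soc_file")" ]
}
```

For example, mclk_fits_socclk 1200 "$DEV/pp_dpm_socclk" || echo "MCLK above top SOCCLK state", with DEV set to the device directory used in this thread.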

Greetings,
Yanik


On 06.05.19 10:13, Quan, Evan wrote:
+Alex,

Hi Yanik,

Did you run the following command to make your OD settings take effect (before running games)? Without it, the settings are not actually applied.
echo "c" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage

Regards,
Evan
From: Yanik Yiannakis <yanik at yiannakis.de>
Sent: Monday, April 29, 2019 7:44 AM
To: rex.zhu at amd.com; Quan, Evan <Evan.Quan at amd.com>; amd-gfx at lists.freedesktop.org
Subject: Bug Report: [PowerPlay] MCLK can't be set above 1107MHz on Vega 64


Hello,

I am experiencing a bug that prevents me from setting the MCLK of my Vega 64 LC above 1107MHz.

I am using Unigine Superposition 1.1 in "Game"-mode to check the performance by watching the FPS.


Behaviour with a single monitor:

First I set the MCLK to a known stable value below 1108MHz:

$ echo "m 3 1100 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage

In Unigine Superposition the FPS increase as expected.

pp_dpm_mclk also confirms the change.

$ watch cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_dpm_mclk

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1100Mhz *



After that I set the MCLK to a stable value above 1107MHz:

$ echo "m 3 1200 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage

In Unigine Superposition the FPS drop drastically.

pp_dpm_mclk indicates that the MCLK is stuck in state 0 (167MHz):

0: 167Mhz *
1: 500Mhz
2: 800Mhz
3: 1200Mhz
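
To spot this "stuck in state 0" condition without watching the output by eye, the active state (the line marked with "*") can be extracted from the pp_dpm_mclk output. This is a small sketch; the function name active_state is made up for illustration.

```shell
#!/bin/sh
# Print "<index> <frequency>" for the active state, i.e. the line
# whose last field is "*" in pp_dpm_*-style input.
active_state() {
    awk '$NF == "*" { sub(":", "", $1); print $1, $2 }'
}
```

Fed the output shown above (for example via active_state < "$DEV/pp_dpm_mclk", with DEV set to the device directory used in this report), it prints "0 167Mhz".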



Behaviour with multiple monitors that have different refresh rates:

My monitors have different refresh rates. This causes the MCLK to stay in state 3 (945MHz stock) which is the expected behaviour as I understand it.



Now I try to set the MCLK to a value above 1107MHz:

$ echo "m 3 1200 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage

The FPS in Unigine Superposition remain the same as they were with 945MHz.

However, pp_dpm_mclk shows that the value was set:

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1200Mhz *



Then I set the MCLK to a value of 1107MHz or lower:

$ echo "m 3 1100 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage

The FPS in Unigine Superposition increase.

pp_dpm_mclk again confirms the set value:

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1100Mhz *


Finally I increase MCLK to a known unstable value:

$ echo "m 3 1300 950" > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0/pp_od_clk_voltage

The FPS in Unigine Superposition remain the same. I therefore believe the value was not actually applied.

However, pp_dpm_mclk shows that it was:

0: 167Mhz
1: 500Mhz
2: 800Mhz
3: 1300Mhz *



amdgpu_pm_info also claims that the value was set:

$ sudo watch cat /sys/kernel/debug/dri/1/amdgpu_pm_info

GFX Clocks and Power:
        1300 MHz (MCLK)
        27 MHz (SCLK)
        1348 MHz (PSTATE_SCLK)
        800 MHz (PSTATE_MCLK)
        825 mV (VDDGFX)
        4.0 W (average GPU)

Again, I think the displayed MCLK is wrong and the memory is still running at 1100MHz: the performance in Unigine Superposition indicates this, and 1300MHz would cause an immediate crash.

A stable value (e.g. 1200MHz) causes the same behaviour. I just chose 1300MHz to be sure.
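
For logging alongside a benchmark run, the MCLK figure can be pulled out of the amdgpu_pm_info text with a one-line filter such as the sketch below. The function name reported_mclk is hypothetical, and the value it prints is only what the driver reports, which, per the observations above, may differ from what the hardware is actually running.

```shell
#!/bin/sh
# Print the MCLK value (MHz) from amdgpu_pm_info-style input, matching
# only the "(MCLK)" line and not "(PSTATE_MCLK)".
reported_mclk() {
    awk '$3 == "(MCLK)" { print $1 }'
}
```

For example: sudo cat /sys/kernel/debug/dri/1/amdgpu_pm_info | reported_mclk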





Tested on these Kernels:

Arch-Linux 5.0.9 (Arch)

Linux 5.1-rc6 (Ubuntu)

Linux 5.0 with amd-staging-drm-next (Ubuntu) (https://github.com/M-Bab/linux-kernel-amdgpu-binaries)

(Same behaviour on every kernel.)



Tested on this hardware:

CPU: Intel i7-8700k

Motherboard: MSI Z370 Gaming Pro Carbon

GPU: Powercolor Vega 64 Liquid Cooled (Memory stable below 1220MHz, tested on Windows 10 with Wattman and Unigine Superposition)



Unigine Superposition "Game"-Mode settings:

Preset: Custom

Fullscreen: Disabled

Resolution: 3840x2160 (4K UHD)

Shaders Quality: Extreme

Textures Quality: High

Vsync: Off

Depth of Field: On

Motion Blur: On



I hope this helps.

Yanik Yiannakis