<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - Crashes / Resets From AMDGPU / Radeon VII"
href="https://bugs.freedesktop.org/show_bug.cgi?id=110674#c81">Comment # 81</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - Crashes / Resets From AMDGPU / Radeon VII"
href="https://bugs.freedesktop.org/show_bug.cgi?id=110674">bug 110674</a>
from <span class="vcard"><a class="email" href="mailto:tom@r.je" title="Tom B <tom@r.je>"> <span class="fn">Tom B</span></a>
</span></b>
<pre>Created <span class=""><a href="attachment.cgi?id=145038" name="attach_145038" title="5.2.7 dmesg with hard_min_level logged">attachment 145038</a> <a href="attachment.cgi?id=145038&action=edit" title="5.2.7 dmesg with hard_min_level logged">[details]</a></span>
5.2.7 dmesg with hard_min_level logged
As mentioned in the previous post, I started logging the value of
hard_min_level. I hadn't realised that vega20_set_uclk_to_highest_dpm_level
would be called so many times.
Here's what I found: the value of hard_min_level is 1001 in both 5.0.13 and
5.2.7, so the issue is not the value from the dpm table, which is probably
correct. Something prevents smum_send_msg_to_smc_with_parameter from
accepting the value.
However, what is interesting is that it doesn't always fail.
[ 4.082105] amdgpu: [powerplay] hard_min_level: 1001
[ 4.372684] [drm] Initialized amdgpu 3.32.0 20150101 for 0000:44:00.0 on minor 0
[ 4.517204] amdgpu: [powerplay] Failed to send message 0x28, response 0x0
[ 4.517205] amdgpu: [powerplay] [SetUclkToHightestDpmLevel] Set hard min uclk failed!
Each hard_min_level line in the log is from
vega20_set_uclk_to_highest_dpm_level and there are multiple calls to it, which
don't fail, before the card is initialised.
This is from 5.2.7:
[ 3.698907] amdgpu 0000:44:00.0: ring vce2 uses VM inv eng 14 on hub 1
[ 4.082105] amdgpu: [powerplay] hard_min_level: 1001
[ 4.372684] [drm] Initialized amdgpu 3.32.0 20150101 for 0000:44:00.0 on minor 0
[ 4.517204] amdgpu: [powerplay] Failed to send message 0x28, response 0x0
[ 4.517205] amdgpu: [powerplay] [SetUclkToHightestDpmLevel] Set hard min uclk failed!
[ 5.361482] amdgpu: [powerplay] Failed to send message 0x28, response 0x0
And the same from 5.0.13:
[ 3.352380] amdgpu 0000:44:00.0: ring vce2 uses VM inv eng 14 on hub 1
[ 3.722422] amdgpu: [powerplay] hard_min_level: 1001
[ 3.766269] amdgpu: [powerplay] hard_min_level: 1001
[ 4.029679] [drm] Initialized amdgpu 3.27.0 20150101 for 0000:44:00.0 on minor 0
There are a couple of things here:
1. vega20_set_uclk_to_highest_dpm_level is called twice between the "ring vce2"
line and "Initialized" in 5.0.13, but only once in 5.2.7.
2. My patched code looks like this:
        pr_err("hard_min_level: %d\n",
               dpm_table->dpm_state.hard_min_level);
        PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(hwmgr,
                        PPSMC_MSG_SetHardMinByFreq,
                        (PPCLK_UCLK << 16) | dpm_table->dpm_state.hard_min_level)),
                        "[SetUclkToHightestDpmLevel] Set hard min uclk failed!",
                        return ret);
Yet the log shows:
- My debug line
- Initialized amdgpu 3.32.0 20150101 for 0000:44:00.0 on minor 0
- [SetUclkToHightestDpmLevel] Set hard min uclk failed!
So initialisation is happening between (and possibly as a result of) sending
the message and getting the response.</pre>
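<pre>For reference, the parameter passed with PPSMC_MSG_SetHardMinByFreq in the
patched call is built by shifting the clock ID left by 16 bits and ORing in
hard_min_level. Below is a minimal standalone sketch of that packing. It is
not driver code: the helper names are made up for illustration, and the clock
ID used in the note afterwards is a placeholder, not necessarily the real
value of PPCLK_UCLK.

```c
/*
 * Sketch only: mirrors the (clk_id << 16) | level expression from the
 * patched call. Helper names are hypothetical, not part of the amdgpu driver.
 */

/* Combine a clock ID and a hard minimum level into one message parameter. */
static unsigned int pack_hard_min_param(unsigned int clk_id, unsigned int level)
{
    return (clk_id << 16) | level;
}

/* Recover the clock ID from the upper bits of a packed parameter. */
static unsigned int unpack_clk_id(unsigned int param)
{
    return param >> 16;
}

/* Recover the level from the lower 16 bits of a packed parameter. */
static unsigned int unpack_level(unsigned int param)
{
    return param & 0xFFFFu;
}
```

With a placeholder clock ID of 5, pack_hard_min_param(5, 1001) gives 0x503E9,
and the two unpack helpers return 5 and 1001 again. Logging the packed value
next to hard_min_level would show exactly what is handed to
smum_send_msg_to_smc_with_parameter.</pre>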
</div>
</p>
</body>
</html>