[Bug 110674] Crashes / Resets From AMDGPU / Radeon VII
bugzilla-daemon at freedesktop.org
Mon Aug 12 17:40:12 UTC 2019
https://bugs.freedesktop.org/show_bug.cgi?id=110674
--- Comment #90 from Tom B <tom at r.je> ---
I'm not sure this is helpful but I managed to somewhat test the race condition
theory.
If you follow the call stack

vega20_set_fclk_to_highest_dpm_level ->
smum_send_msg_to_smc_with_parameter ->
vega20_send_msg_to_smc_with_parameter ->
vega20_wait_for_response ->
phm_wait_for_register_unequal

you find this code in smu_helper.c:
int phm_wait_on_register(struct pp_hwmgr *hwmgr, uint32_t index,
			 uint32_t value, uint32_t mask)
{
	uint32_t i;
	uint32_t cur_value;

	if (hwmgr == NULL || hwmgr->device == NULL) {
		pr_err("Invalid Hardware Manager!");
		return -EINVAL;
	}

	for (i = 0; i < hwmgr->usec_timeout; i++) {
		cur_value = cgs_read_register(hwmgr->device, index);
		if ((cur_value & mask) == (value & mask))
			break;
		udelay(1);
	}

	/* timeout means wrong logic */
	if (i == hwmgr->usec_timeout)
		return -1;
	return 0;
}
The timeout there is interesting. I increased it tenfold:
	for (i = 0; i < hwmgr->usec_timeout * 10; i++) {
		cur_value = cgs_read_register(hwmgr->device, index);
		if ((cur_value & mask) == (value & mask))
			break;
		udelay(1);
	}
The PC takes significantly longer to boot (10 or so seconds when it's usually
instant) and the error still occurs. So I'm not sure it's just a matter of
waiting.
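
To illustrate why a longer timeout separates "slow response" from "no response", here is a minimal userspace sketch of the same polling shape. It is not the driver code: struct fake_reg, fake_read and poll_register are hypothetical names made up for this example, the register is simulated, and the udelay(1) is left out so that only the number of polls is counted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a hardware register (not an amdgpu type). */
struct fake_reg {
	uint32_t reads;        /* reads performed so far */
	uint32_t ready_after;  /* reads that must happen before the value flips */
	bool     never_ready;  /* true models a response that never arrives */
};

/* Simulated register read: returns 1 once the "response" has arrived. */
static uint32_t fake_read(struct fake_reg *r)
{
	r->reads++;
	if (!r->never_ready && r->reads > r->ready_after)
		return 1;
	return 0;
}

/*
 * Same shape as the quoted loop: poll until (reg & mask) == (value & mask)
 * or until max_polls iterations have passed. Returns 0 on a match and -1
 * on timeout; *polls_used reports how many polls failed before the loop ended.
 */
static int poll_register(struct fake_reg *r, uint32_t value, uint32_t mask,
			 uint32_t max_polls, uint32_t *polls_used)
{
	uint32_t i;

	for (i = 0; i < max_polls; i++) {
		if ((fake_read(r) & mask) == (value & mask))
			break;
	}

	*polls_used = i;
	return (i == max_polls) ? -1 : 0;
}
```

With a register that flips after 150 reads, a limit of 100 polls times out while a limit of 1000 succeeds. With never_ready set, both limits time out and the larger one only makes the caller wait longer, which is consistent with the slower boot and the unchanged error described above.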