Re: Re: [REGRESSION] amdgpu: async system error exception from hdp_v5_0_flush_hdp()
Alexey Klimov
alexey.klimov at linaro.org
Tue Apr 22 02:20:59 UTC 2025
On Thu Apr 17, 2025 at 2:08 PM BST, Alex Deucher wrote:
> On Wed, Apr 16, 2025 at 8:43 PM Fugang Duan <fugang.duan at cixtech.com> wrote:
>>
>> From: Alex Deucher <alexdeucher at gmail.com> Sent: April 16, 2025 22:49
>> >To: Alexey Klimov <alexey.klimov at linaro.org>
>> >On Wed, Apr 16, 2025 at 9:48 AM Alexey Klimov <alexey.klimov at linaro.org> wrote:
>> >>
>> >> On Wed Apr 16, 2025 at 4:12 AM BST, Fugang Duan wrote:
>> >> > From: Alexey Klimov <alexey.klimov at linaro.org> Sent: April 16, 2025 2:28
>> >> >>#regzbot introduced: v6.12..v6.13
>> >>
>> >> [..]
>> >>
>> >> >>The only change related to hdp_v5_0_flush_hdp() was
>> >> >>cf424020e040 drm/amdgpu/hdp5.0: do a posting read when flushing HDP
>> >> >>
>> >> >>Reverting that commit ^^ did help and resolved that problem. Before
>> >> >>sending revert as-is I was interested to know if there supposed to
>> >> >>be a proper fix for this or maybe someone is interested to debug this or
>> >> >>have any suggestions.
>> >> >>
>> >> > Can you revert the change and try again
>> >> > https://gitlab.com/linux-kernel/linux/-/commit/cf424020e040be35df05b682b546b255e74a420f
>> >>
>> >> Please read my email in the first place.
>> >> Let me quote just in case:
>> >>
>> >> >The only change related to hdp_v5_0_flush_hdp() was
>> >> >cf424020e040 drm/amdgpu/hdp5.0: do a posting read when flushing HDP
>> >>
>> >> >Reverting that commit ^^ did help and resolved that problem.
>> >
>> >We can't really revert the change as that will lead to coherency problems. What
>> >is the page size on your system? Does the attached patch fix it?
>> >
>> >Alex
>> >
>> 4K page size. We can try the fix if we got the environment.
>
> OK. that patch won't change anything then. Can you try this patch instead?

The config I am using is basically defconfig with respect to memory parameters,
so yes, I use 4K pages.
I tested that patch, thank you, along with a few other configurations, and
nothing helped: exactly the same behaviour with the same backtrace.
So it seems that it is a firmware problem after all?

Thanks,
Alexey