WARNING in amdgpu_sync_keep_later / dma_fence_is_later should be rate limited
Rafał Miłecki
zajec5 at gmail.com
Thu Sep 21 20:11:22 UTC 2023
On 21.09.2023 21:52, Deucher, Alexander wrote:
>> backporting commit 187916e6ed9d ("drm/amdgpu: install stub fence into
>> potential unused fence pointers") to stable kernels resulted in lots of
>> WARNINGs on some devices. In my case I was getting 3 WARNINGs per
>> second (~150 lines logged every second). Commit ended up being reverted for
>> stable but it exposed a potential problem. My messages log size was reaching
>> gigabytes and was running my /tmp/ out of space.
>>
>> Could someone take a look at amdgpu_sync_keep_later / dma_fence_is_later
>> and make sure its logging is rate limited to avoid such situations in the future,
>> please?
>>
>> Revert in linux-5.15.x:
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.15.y&id=fae2d591f3cb31f722c7f065acf586830eab8c2a
>>
>> openSUSE bug report:
>> https://bugzilla.opensuse.org/show_bug.cgi?id=1215523
>
> These patches were never intended for stable. They were picked up by Sasha's stable autoselect tools and automatically applied to stable kernels.
Are you saying massive WARNINGs in dma_fence_is_later() can't happen
in any other case? I understand the backport was a mistake, but
I thought we could learn from it and still add some rate limiting.
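
To illustrate the kind of rate limiting being asked for, here is a minimal
userspace sketch, not a kernel patch. It assumes the warning fires when two
fences belong to different contexts, and it models an "at most N reports per
time window" policy. The names warn_ratelimit, warn_ratelimit_ok and
fence_is_later_sketch are hypothetical and exist only for this example. In
the kernel itself, existing helpers such as WARN_RATELIMIT() or
WARN_ON_ONCE() would be the natural candidates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model: allow at most `burst` reports per `interval_ms` window. */
struct warn_ratelimit {
	uint64_t interval_ms;   /* length of one window */
	unsigned int burst;     /* reports allowed per window */
	uint64_t window_start;  /* start time of the current window */
	unsigned int printed;   /* reports emitted in the current window */
	unsigned int missed;    /* reports suppressed in the current window */
};

/* Returns true if the caller may log at time now_ms, false if suppressed. */
static bool warn_ratelimit_ok(struct warn_ratelimit *rs, uint64_t now_ms)
{
	assert(rs != NULL);

	/* Window expired: note how much was dropped and start a new window. */
	if (now_ms - rs->window_start >= rs->interval_ms) {
		if (rs->missed)
			fprintf(stderr, "%u warnings suppressed\n", rs->missed);
		rs->window_start = now_ms;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/*
 * Stand-in for the fence comparison (assumption: the warning is triggered
 * by mismatched contexts). The check still fails on every mismatch, but
 * the log line is emitted only when the rate limiter allows it.
 */
static bool fence_is_later_sketch(struct warn_ratelimit *rs, uint64_t now_ms,
				  uint64_t ctx1, uint64_t seq1,
				  uint64_t ctx2, uint64_t seq2)
{
	if (ctx1 != ctx2) {
		if (warn_ratelimit_ok(rs, now_ms))
			fprintf(stderr, "WARNING: fence context mismatch\n");
		return false;
	}
	return seq1 > seq2;
}
```

With a policy like this, a device hitting the condition 3 times per second
would still report the problem, but the log would grow by a bounded number
of lines per window instead of ~150 lines every second.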