[WARNING][AMDGPU] WQ_MEM_RECLAIM with Radeon RX 6600

Tejun Heo tj at kernel.org
Wed Dec 18 18:08:59 UTC 2024


Hello, sorry about the delay.

On Mon, Dec 16, 2024 at 04:34:00PM -0800, Matthew Brost wrote:
> > However, after further discussion, I think the warning is actually a
> > false positive.  See this discussion:
> > https://lists.freedesktop.org/archives/amd-gfx/2024-November/117349.html
> > 
> > From the thread:
> > "Question is - does check_flush_dependency() need to skip the
> > !WQ_MEM_RECLAIM flushing WQ_MEM_RECLAIM warning *if* the work is already
> > running *and* it was called from cancel_delayed_work_sync()?"
> > 
> 
> See my reply just now [1] — I’m going to have to disagree with AMD's
> assessment, but I’m not certain.
> 
> Again, I believe Tejun is the authority here.

I think we can skip the warning if the flush is coming from
cancel*_work_sync(), as the flush takes place iff the work item already has a
worker running, i.e. it can't be blocked by lack of memory. Tvrtko, can
you write up a patch to exclude that condition from check_flush_dependency()?
I think it can just skip check_flush_dependency() when @from_cancel is set.
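
To make the suggestion concrete, here is a rough user-space sketch of the
decision being discussed. It is a toy model, not the actual
kernel/workqueue.c code: struct toy_wq, TOY_WQ_MEM_RECLAIM and
toy_flush_would_warn() are hypothetical names made up for illustration, and
the PF_MEMALLOC and legacy-workqueue cases are deliberately left out. It
assumes the warning of interest fires when a worker of a WQ_MEM_RECLAIM
workqueue flushes work on a workqueue without that flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag value for illustration; not the kernel's definition. */
#define TOY_WQ_MEM_RECLAIM (1u << 0)

/* Hypothetical stand-in for a workqueue: only the flags matter here. */
struct toy_wq {
	const char *name;
	unsigned int flags;
};

/*
 * Returns true if the modelled dependency check would warn.
 *
 * flusher_wq:  workqueue whose worker is doing the flush, or NULL if the
 *              flusher is not a workqueue worker at all.
 * target_wq:   workqueue that owns the work item being flushed.
 * from_cancel: true when the flush originates from a cancel*_work_sync()
 *              style path, where the work item already has a worker
 *              running it.
 */
static bool toy_flush_would_warn(const struct toy_wq *flusher_wq,
				 const struct toy_wq *target_wq,
				 bool from_cancel)
{
	assert(target_wq != NULL);

	/* Proposed change: a cancel-initiated flush never warns. */
	if (from_cancel)
		return false;

	/* A target with a rescuer can always make forward progress. */
	if (target_wq->flags & TOY_WQ_MEM_RECLAIM)
		return false;

	/* Warn only when a reclaim-capable worker waits on a plain target. */
	return flusher_wq != NULL &&
	       (flusher_wq->flags & TOY_WQ_MEM_RECLAIM) != 0;
}
```

In this model, a reclaim-capable worker flushing a plain workqueue reports a
warning when from_cancel is false and stays quiet when it is true, which is
the behaviour the patch would need to produce.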

Taking a step back, if an actual dependency develops in the future (memory
reclaim actually blocking on GPU work items), one way to handle that would be
adding subsystem-wide workqueues so that the rescuer can be shared across
GPU drivers / devices. As long as they don't depend on each other for making
forward progress, which they most likely wouldn't, sharing a rescuer across
them is completely fine.

Thanks.

-- 
tejun

