[PATCH] drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

Guilherme G. Piccoli gpiccoli at igalia.com
Tue Jan 31 18:22:51 UTC 2023


On 31/01/2023 14:52, Alex Deucher wrote:
> [...]
>> (b) We can't use sched.ready, which would make sense...but amdgpu
>> overrides its meaning, the driver manipulates this value for its own
>> purposes of tracking ring init, or something like that.
>>
>> This is the tangential topic: what should we do here? My understanding
>> of Alex's message is that we could have a "ready" field in the ring
>> structure and stop messing with sched.ready - does it make sense Alex?
> 
> Yes, I think so.  The tricky part will be figuring out which current
> sched.ready checks are checking for the scheduler being ready vs.
> whether the ring itself is ready.
> 

Thanks, makes sense!

$ grep -nr "sched.ready" drivers/gpu/drm/amd/ | wc -l
83

Maybe not super tough, I hope heh

>>
>> Guchun / Christian, does it also make sense for you?
>>
>>
>> Regarding (a), I could re-submit having s/sched.name/sched.ops, no
>> biggies, I tested both to be fair, before sending...I just chose name
>> but any field that is proper initialized on drm_sched_init() would work.
> 
> Yeah, I think ops is fine.  You could even use sched.ready after we
> clean up the use of that in the driver.  There are already a bunch of
> places where we check sched.ready to check if the scheduler really is
> ready.

Hmm..unfortunately, doesn't work. This was a case in which sched.ready
was set to true in the ring init routine, but scheduler wasn't properly
initialized. So, a different key for comparison is required..I'll
re-submit with sched.ops.

After a potential rework of the driver to get rid of sched.ready
manipulation, then it could be fixed to properly use this flag...makes
sense to you?

Tnx again for the prompt review!
Cheers,


Guilherme


More information about the dri-devel mailing list