[PATCH] drm/xe: Unlink client during vm close

Lucas De Marchi lucas.demarchi at intel.com
Fri Jul 19 16:29:02 UTC 2024


On Fri, Jul 19, 2024 at 10:14:42AM GMT, Upadhyay, Tejas wrote:
>
>
>> -----Original Message-----
>> From: Brost, Matthew <matthew.brost at intel.com>
>> Sent: Friday, July 19, 2024 12:22 PM
>> To: Upadhyay, Tejas <tejas.upadhyay at intel.com>
>> Cc: intel-xe at lists.freedesktop.org
>> Subject: Re: [PATCH] drm/xe: Unlink client during vm close
>>
>> On Thu, Jul 18, 2024 at 11:08:42PM -0600, Upadhyay, Tejas wrote:
>> >
>> >
>> > > -----Original Message-----
>> > > From: Brost, Matthew <matthew.brost at intel.com>
>> > > Sent: Thursday, July 18, 2024 9:28 PM
>> > > To: Upadhyay, Tejas <tejas.upadhyay at intel.com>
>> > > Cc: intel-xe at lists.freedesktop.org
>> > > Subject: Re: [PATCH] drm/xe: Unlink client during vm close
>> > >
>> > > On Thu, Jul 18, 2024 at 06:47:52PM +0530, Tejas Upadhyay wrote:
>> > > > We have an async call that does not know whether the client has been
>> > > > unlinked from the vm by the time it is accessed. Unlink the client early,
>> > > > during xe_vm_close(), so that the async API does not touch closed client info.
>> > > >
>> > > > Also, debug output related to a job timeout is not useful when it is "no
>> > > > process" or the client is already unlinked.
>> > > >
>> > >
>> > > If a kernel exec queue job times out, the 'Timedout job' message will now
>> > > not be displayed, which is not ideal.
>> > >
>> > > > Fixes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2273
>> > >
>> > > Where exactly is this access coming from?
>> > > BUG: kernel NULL pointer dereference, address: 0000000000000058
>> >
>> > In guc_exec_queue_timedout_job(), accessing "q->vm->xef->drm" after the
>> > client closed the fd causes the crash. My thinking was that we can't take a
>> > ref and keep the client alive until the job times out.
>> >
>>
>> Taking a ref to q->vm->xef is exactly what Umesh's series [1] is doing. I
>> believe this is the correct behavior and, based on your comment above, I also
>> believe it will fix this issue. Please test with this series. Hopefully Umesh gets
>> this in soon.
>>
>> [1] https://patchwork.freedesktop.org/series/135865/
>
>This series also fixes https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2273.

Applied it yesterday, so hopefully drm-xe-next is now clean.

Lucas De Marchi

