[PATCH v2 4/9] drm/sched: Split free_job into own work item

Boris Brezillon boris.brezillon at collabora.com
Tue Sep 12 13:53:59 UTC 2023


On Tue, 12 Sep 2023 15:34:41 +0200
Danilo Krummrich <dakr at redhat.com> wrote:

> On Tue, Sep 12, 2023 at 03:27:05PM +0200, Boris Brezillon wrote:
> > On Fri, 25 Aug 2023 15:45:49 +0200
> > Christian König <christian.koenig at amd.com> wrote:
> >   
> > > >>>> I tried this patch with Nouveau and found a race condition:
> > > >>>>
> > > >>>> In drm_sched_run_job_work() the job is added to the pending_list via
> > > >>>> drm_sched_job_begin(), then the run_job() callback is called and the scheduled
> > > >>>> fence is signaled.
> > > >>>>
> > > >>>> However, in parallel drm_sched_get_cleanup_job() might be called from
> > > >>>> drm_sched_free_job_work(), which picks the first job from the pending_list and
> > > >>>> for the next job on the pending_list sets the scheduled fence' timestamp field.    
> > > >> Well why can this happen in parallel? Either the work items are scheduled to
> > > >> a single threaded work queue or you have protected the pending list with
> > > >> some locks.
> > > >>    
> > > > Xe uses a single-threaded work queue, Nouveau does not (desired
> > > > behavior).
> > > >
> > > > The list of pending jobs is protected by a lock (safe), the race is:
> > > >
> > > > add job to pending list
> > > > run_job
> > > > signal scheduled fence
> > > >
> > > > dequeue from pending list
> > > > free_job
> > > > update timestamp
> > > >
> > > > Once a job is on the pending list its timestamp can be accessed which
> > > > can blow up if scheduled fence isn't signaled or more specifically unless
> > > > DMA_FENCE_FLAG_TIMESTAMP_BIT is set.    
> > 
> > I'm a bit lost. How can this lead to a NULL deref? Timestamp is a
> > ktime_t embedded in dma_fence, and finished/scheduled are both
> > dma_fence objects embedded in drm_sched_fence. So, unless
> > {job,next_job}->s_fence is NULL, or {job,next_job} itself is NULL, I
> > don't really see where the NULL deref is. If s_fence is NULL, that means
> > drm_sched_job_init() wasn't called (unlikely to be detected that late),
> > or ->free_job()/drm_sched_job_cleanup() was called while the job was
> > still in the pending list. I don't really see a situation where job
> > could NULL to be honest.  
> 
> I think the problem here was that a dma_fence' timestamp field is within a union
> together with it's cb_list list_head [1]. If a timestamp is set before the fence
> is actually signalled, dma_fence_signal_timestamp_locked() will access the
> cb_list to run the particular callbacks registered to this dma_fence. However,
> writing the timestap will overwrite this list_head since it's a union, hence
> we'd try to dereference the timestamp while iterating the list.

Ah, right. I didn't notice it was a union, thought it was a struct...

> 
> [1] https://elixir.bootlin.com/linux/latest/source/include/linux/dma-fence.h#L87
> 
> > 
> > While I agree that updating the timestamp before the fence has been
> > flagged as signaled/timestamped is broken (timestamp will be
> > overwritten when dma_fence_signal(scheduled) is called) I don't see a
> > situation where it would cause a NULL/invalid pointer deref. So I
> > suspect there's another race causing jobs to be cleaned up while
> > they're still in the pending_list.
> >   
> > > 
> > > Ah, that problem again. No that is actually quite harmless.
> > > 
> > > You just need to double check if the DMA_FENCE_FLAG_TIMESTAMP_BIT is 
> > > already set and if it's not set don't do anything.  
> >   
> 



More information about the dri-devel mailing list