[PATCH v6] drm/sched: Make sure we wait for all dependencies in kill_jobs_cb()

Boris Brezillon boris.brezillon at collabora.com
Thu Jun 22 09:56:36 UTC 2023


On Wed, 21 Jun 2023 11:03:48 -0400
Luben Tuikov <luben.tuikov at amd.com> wrote:

> On 2023-06-21 10:53, Boris Brezillon wrote:
> > On Wed, 21 Jun 2023 10:41:22 -0400
> > Luben Tuikov <luben.tuikov at amd.com> wrote:
> >   
> >> On 2023-06-21 10:18, Boris Brezillon wrote:  
> >>> Hello Luben,
> >>>
> >>> On Wed, 21 Jun 2023 09:56:40 -0400
> >>> Luben Tuikov <luben.tuikov at amd.com> wrote:
> >>>     
> >>>> On 2023-06-19 03:19, Boris Brezillon wrote:    
> >>>>> drm_sched_entity_kill_jobs_cb() logic is omitting the last fence popped
> >>>>> from the dependency array that was waited upon before
> >>>>> drm_sched_entity_kill() was called (drm_sched_entity::dependency field),
> >>>>> so we're basically waiting for all dependencies except one.
> >>>>>
> >>>>> In theory, this wait shouldn't be needed because resources should have
> >>>>> their users registered to the dma_resv object, thus guaranteeing that
> >>>>> future jobs wanting to access these resources wait on all the previous
> >>>>> users (depending on the access type, of course). But we want to keep
> >>>>> these explicit waits in the kill entity path just in case.
> >>>>>
> >>>>> Let's make sure we keep all dependencies in the array in
> >>>>> drm_sched_job_dependency(), so we can iterate over the array and wait
> >>>>> in drm_sched_entity_kill_jobs_cb().
> >>>>>
> >>>>> We also make sure we wait on drm_sched_fence::finished if we were
> >>>>> originally asked to wait on drm_sched_fence::scheduled. In that case,
> >>>>> we assume the intent was to delegate the wait to the firmware/GPU or
> >>>>> rely on the pipelining done at the entity/scheduler level, but when
> >>>>> killing jobs, we really want to wait for completion not just scheduling.
> >>>>>
> >>>>> v6:
> >>>>> - Back to v4 implementation
> >>>>> - Add Christian's R-b
> >>>>>
> >>>>> v5:
> >>>>> - Flag deps on which we should only wait for the scheduled event
> >>>>>   at insertion time
> >>>>>
> >>>>> v4:
> >>>>> - Fix commit message
> >>>>> - Fix a use-after-free bug
> >>>>>
> >>>>> v3:
> >>>>> - Always wait for drm_sched_fence::finished fences in
> >>>>>   drm_sched_entity_kill_jobs_cb() when we see a sched_fence
> >>>>>
> >>>>> v2:
> >>>>> - Don't evict deps in drm_sched_job_dependency()      
> >>>>
> >>>> Hmm, why is this in reverse chronological order?
> >>>> It's very confusing.    
> >>>
> >>> Dunno, that's how I've always ordered things, and quick look at some
> >>> dri-devel patches [1][2] makes me think I'm not the only one to start
> >>> from the latest submission.
> >>>
> >>> [1]https://lkml.org/lkml/2023/6/19/941
> >>> [2]https://lore.kernel.org/dri-devel/cover.1686729444.git.Sandor.yu@nxp.com/T/#t
> >>>     
> >>>>    
> >>>>>
> >>>>> Signed-off-by: Boris Brezillon <boris.brezillon at collabora.com>
> >>>>> Suggested-by: "Christian König" <christian.koenig at amd.com>
> >>>>> Reviewed-by: "Christian König" <christian.koenig at amd.com>      
> >>>>
> >>>> These three lines would usually come after the CCs.    
> >>>
> >>> Again, I think I've always inserted those tags before the Cc, but I can
> >>> re-order things if you prefer. Let me know if you want me to send a v7
> >>> addressing the Cc+changelog ordering.    
> >>
> >> No, it's not necessary for this patch, but in the future I'd rather follow
> >> chronological ordering for the versions, and in the Cc list. It's similar
> >> to how the patch description follows (narrative text) and to how we reply
> >> back to emails, and prevalently in the kernel log in drm ("git log" should
> >> suffice).
> >>
> >> Reading in chronological progression builds a narrative, a picture, in one's
> >> mind and makes it easy to see justifications for said narrative, or see reasons
> >> to change the narrative.
> >>
> >> That is, one can make a better decision knowing the full history, rather than
> >> only the latest change.
> >>
> >> (And in fact when I read the version revision list, my eyes skip over v[X]
> >> and just read down, so I was wondering why and how Christian R-B the patch
> >> in v2, and it wasn't until I actually saw that they were ordered in reverse
> >> chronological order, which was in fact v6--listed first, which I'd assumed
> >> was listed last.)
> >>
> >> Do you have access or do you know who is pushing this patch to drm-misc-fixes?  
> > 
> > I can push it.
> >   
> 
> Acked-by: Luben Tuikov <luben.tuikov at amd.com>

Queued to drm-misc-fixes after re-ordering things in the commit message
as you suggested.

Regards,

Boris



More information about the dri-devel mailing list