[PATCH] drm/sched: Remove optimization that causes hang when killing dependent jobs
Greg Kroah-Hartman
gregkh at linuxfoundation.org
Wed Jul 16 12:05:13 UTC 2025
On Wed, Jul 16, 2025 at 01:32:42PM +0200, Philipp Stanner wrote:
> On Wed, 2025-07-16 at 13:15 +0200, Greg Kroah-Hartman wrote:
> > On Wed, Jul 16, 2025 at 12:58:28PM +0200, Christian König wrote:
> > > On 16.07.25 12:46, Philipp Stanner wrote:
> > > > +Cc Greg, Sasha
> > > >
> > > > On Wed, 2025-07-16 at 12:40 +0200, Michel Dänzer wrote:
> > > > > On 16.07.25 11:57, Philipp Stanner wrote:
> > > > > > On Wed, 2025-07-16 at 09:43 +0000, cao, lin wrote:
> > > > > > >
> > > > > > > Hi Philipp,
> > > > > > >
> > > > > > >
> > > > > > > > Thank you for the review. I found that this optimization was
> > > > > > > > introduced 9 years ago in commit
> > > > > > > > 777dbd458c89d4ca74a659f85ffb5bc817f29a35 ("drm/amdgpu: drop a
> > > > > > > > dummy wakeup scheduler").
> > > > > > >
> > > > > > >
> > > > > > > > Given that the codebase has undergone significant changes over
> > > > > > > > these 9 years, may I ask if I still need to include the Fixes:
> > > > > > > > tag?
> > > > > >
> > > > > > > Yes. It's a helpful marker to see where the problem comes from,
> > > > > > > and it adds redundancy, helping the stable-kernel maintainers
> > > > > > > figure out which kernels to backport it to.
> > > > > > >
> > > > > > > If stable can't apply a patch to a very old stable kernel because
> > > > > > > the code base changed too much, they'll ping us and we might
> > > > > > > provide a dedicated fix.
> > > > > >
> > > > > > So like that:
> > > > > >
> > > > > > > Cc: stable@vger.kernel.org # v4.6+
> > > > > > > Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")
> > > > >
> > > > > FWIW, Fixes: alone is enough for getting backported to stable
> > > > > branches, Cc: stable is redundant with it.
> > > >
> > > > > Both are used all the time together, though. And the official
> > > > > documentation does not list dropping Cc: stable as a valid option
> > > > > in this regard:
> > > >
> > > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#option-1
> > > >
> > > >
> > > > > As long as the official documentation demands it, I'm not willing
> > > > > to drop it. If the documentation were to be changed, that would be
> > > > > fine by me, too.
> > >
> > > As far as I understand "CC: stable" and "Fixes:" tags are to handle
> > > two distinct use cases.
> >
> > Yes.
> >
> > > "CC: stable..." means please backport, eventually with a kernel
> > > version and/or necessary pre-requisites.
> >
> > Yes.
> >
> > > "Fixes:" only backport if you have this patch in your tree as well.
> > > In other words it is a restriction when to backport something.
> >
> > No.
> >
> > "Fixes:" is only for you to say "this commit fixes this other commit".
> > And when you add a cc: stable, that will get you a FAILED email if the
> > commit does NOT apply that far back.
>
> Does that mean we should NOT add Fixes: if the fixing patch does not
> apply on top of that old commit?

Add Fixes: if you feel it accurately describes the commit that caused
the problem that this commit is fixing.  That is independent of "you
need other commits after that to apply this one"; that issue can be
resolved by reading the stable kernel rules document and following what
it says there to do for that.

> And if so, should we drop the Fixes: tag completely in such cases as
> Lin suggested in this thread?

If you don't want to ever be notified of any failures of stable patches
being applied as far back as they should be applied, sure, don't put a
Fixes: tag.  That means I do a "best effort" and just stop applying when
they don't apply anymore.

It also means that if you do NOT have a Fixes: tag, and the commit ends
up getting assigned a CVE, we have to assume that the bug has been there
since "the beginning of time" and will mark it as such.  Which might
cause you headaches if you are responsible for keeping older kernels
alive for vendors :)

your call.

hope this helps, and really, this should all be documented already,
right?  If not, what is missing (besides the CVE stuff)?

thanks,

greg k-h