[PATCH] drm/sched: Remove optimization that causes hang when killing dependent jobs
Philipp Stanner
phasta at mailbox.org
Wed Jul 16 14:11:27 UTC 2025
On Wed, 2025-07-16 at 14:05 +0200, Greg Kroah-Hartman wrote:
> On Wed, Jul 16, 2025 at 01:32:42PM +0200, Philipp Stanner wrote:
> > On Wed, 2025-07-16 at 13:15 +0200, Greg Kroah-Hartman wrote:
> > > On Wed, Jul 16, 2025 at 12:58:28PM +0200, Christian König wrote:
> > > > On 16.07.25 12:46, Philipp Stanner wrote:
> > > > > +Cc Greg, Sasha
> > > > >
> > > > > On Wed, 2025-07-16 at 12:40 +0200, Michel Dänzer wrote:
> > > > > > On 16.07.25 11:57, Philipp Stanner wrote:
> > > > > > > On Wed, 2025-07-16 at 09:43 +0000, cao, lin wrote:
> > > > > > > >
> > > > > > > > Hi Philipp,
> > > > > > > >
> > > > > > > >
> > > > > > > > Thank you for the review. I found that this optimization
> > > > > > > > was introduced 9 years ago in commit
> > > > > > > > 777dbd458c89d4ca74a659f85ffb5bc817f29a35 ("drm/amdgpu: drop
> > > > > > > > a dummy wakeup scheduler").
> > > > > > > >
> > > > > > > >
> > > > > > > > Given that the codebase has undergone significant changes
> > > > > > > > over these 9 years, may I ask whether I still need to
> > > > > > > > include the Fixes: tag?
> > > > > > >
> > > > > > > Yes. It's a helpful marker to see where the problem comes
> > > > > > > from, and it adds redundancy that helps the stable-kernel
> > > > > > > maintainers figure out which kernels to backport it to.
> > > > > > >
> > > > > > > If stable can't apply a patch to a very old stable kernel
> > > > > > > because the code base changed too much, they'll ping us and
> > > > > > > we might provide a dedicated fix.
> > > > > > >
> > > > > > > So like that:
> > > > > > >
> > > > > > > Cc: stable at vger.kernel.org # v4.6+
> > > > > > > Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")
> > > > > >
> > > > > > FWIW, Fixes: alone is enough for getting backported to stable
> > > > > > branches; Cc: stable is redundant with it.
> > > > >
> > > > > Both are used all the time together, though. And the official
> > > > > documentation does not list dropping Cc: stable as a valid
> > > > > option in this regard:
> > > > >
> > > > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#option-1
> > > > >
> > > > >
> > > > > As long as the official documentation demands it, I'm not
> > > > > willing to drop it. If the documentation were to be changed,
> > > > > that would be fine by me, too.
> > > >
> > > > As far as I understand, the "CC: stable" and "Fixes:" tags handle
> > > > two distinct use cases.
> > >
> > > Yes.
> > >
> > > > "CC: stable..." means please backport, possibly with a kernel
> > > > version and/or necessary pre-requisites.
> > >
> > > Yes.
> > >
> > > > "Fixes:" means only backport if you have this patch in your tree
> > > > as well. In other words, it is a restriction on when to backport
> > > > something.
> > >
> > > No.
> > >
> > > "Fixes:" is only for you to say "this commit fixes this other
> > > commit". And when you add a cc: stable, that will get you a FAILED
> > > email if the commit does NOT apply that far back.
> >
> > Does that mean we should NOT add Fixes: if the fixing patch does not
> > apply on top of that old commit?
>
> Add Fixes: if you feel it accurately describes the commit that caused
> the problem that this commit is fixing. That is independent of "you
> need other commits after that to apply this one"; that issue can be
> resolved by reading the stable kernel rules document and following
> what it says there to do for that.
>
> > And if so, should we drop the Fixes: tag completely in such cases,
> > as Lin suggested in this thread?
>
> If you don't want to ever be notified of any failures of stable patches
> being applied as far back as they should be applied, sure, don't put a
> Fixes: tag. That means I do a "best effort" and just stop applying when
> they don't apply anymore.
>
> It also means that if you do NOT have a Fixes: tag, and the commit ends
> up getting assigned a CVE, we have to assume that the bug has been there
> since "the beginning of time" and will mark it as such. Which might
> cause you headaches if you are responsible for keeping older kernels
> alive for vendors :)
>
> your call.
>
> hope this helps, and really, this should all be documented already,
> right? If not, what is missing (besides the CVE stuff)?
It does help, thank you.

Regarding documentation, I can only tell you that the stable kernel
documentation only sparingly mentions the Fixes: tag, and it certainly
doesn't mention what you detail above.

https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#option-1

I think such questions are an excellent example for an FAQ section:

"
FAQ
1. Do patches for the stable kernel have to include a Fixes: tag?
"

P.

>
> thanks,
>
> greg k-h