[PATCH] drm/sched: Remove optimization that causes hang when killing dependent jobs

Wed Jul 16 14:24:43 UTC 2025

On Wed, Jul 16, 2025 at 04:11:27PM +0200, Philipp Stanner wrote:
> On Wed, 2025-07-16 at 14:05 +0200, Greg Kroah-Hartman wrote:
> > On Wed, Jul 16, 2025 at 01:32:42PM +0200, Philipp Stanner wrote:
> > > On Wed, 2025-07-16 at 13:15 +0200, Greg Kroah-Hartman wrote:
> > > > On Wed, Jul 16, 2025 at 12:58:28PM +0200, Christian König wrote:
> > > > > On 16.07.25 12:46, Philipp Stanner wrote:
> > > > > > +Cc Greg, Sasha
> > > > > > 
> > > > > > On Wed, 2025-07-16 at 12:40 +0200, Michel Dänzer wrote:
> > > > > > > On 16.07.25 11:57, Philipp Stanner wrote:
> > > > > > > > On Wed, 2025-07-16 at 09:43 +0000, cao, lin wrote:
> > > > > > > > > 
> > > > > > > > > Hi Philipp,
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > Thank you for the review. I found that this optimization was
> > > > > > > > > > introduced 9 years ago in commit
> > > > > > > > > > 777dbd458c89d4ca74a659f85ffb5bc817f29a35 ("drm/amdgpu: drop a
> > > > > > > > > > dummy wakeup scheduler").
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > Given that the codebase has undergone significant changes over
> > > > > > > > > > these 9 years, may I ask whether I still need to include the
> > > > > > > > > > Fixes: tag?
> > > > > > > > 
> > > > > > > > > Yes. It's a helpful marker to see where the problem comes from,
> > > > > > > > > and it adds redundancy that helps the stable-kernel maintainers
> > > > > > > > > figure out which kernels to backport the fix to.
> > > > > > > > 
> > > > > > > > > If stable can't apply a patch to a very old stable kernel because
> > > > > > > > > the code base changed too much, they'll ping us and we might
> > > > > > > > > provide a dedicated fix.
> > > > > > > > 
> > > > > > > > So like that:
> > > > > > > > 
> > > > > > > > > Cc: stable@vger.kernel.org # v4.6+
> > > > > > > > > Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")
> > > > > > > 
> > > > > > > > FWIW, Fixes: alone is enough for getting backported to stable
> > > > > > > > branches; Cc: stable is redundant with it.
> > > > > > 
> > > > > > > Both are used together all the time, though. And the official
> > > > > > > documentation does not list dropping Cc: stable as a valid option
> > > > > > > in this regard:
> > > > > > 
> > > > > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#option-1
> > > > > > 
> > > > > > 
> > > > > > > As long as the official documentation demands it, I'm not willing
> > > > > > > to drop it. If the documentation were to be changed, that would be
> > > > > > > fine by me, too.
> > > > > 
> > > > > > As far as I understand, the "CC: stable" and "Fixes:" tags handle
> > > > > > two distinct use cases.
> > > > 
> > > > Yes.
> > > > 
> > > > > "CC: stable..." means please backport, eventually with a kernel
> > > > > version and/or necessary pre-requisites.
> > > > 
> > > > Yes.
> > > > 
> > > > > "Fixes:" only backport if you have this patch in your tree as
> > > > > well.
> > > > > In other words it is a restriction when to backport something.
> > > > 
> > > > No.
> > > > 
> > > > "Fixes:" is only for you to say "this commit fixes this other commit".
> > > > And when you add a cc: stable, that will get you a FAILED email if the
> > > > commit does NOT apply that far back.
> > > 
> > > Does that mean we should NOT add Fixes: if the fixing patch does not
> > > apply on top of that old commit?
> > 
> > Add Fixes: if you feel it accurately describes the commit that caused
> > the problem that this commit is fixing.  That is independent of "you
> > need other commits after that to apply this one"; that issue can be
> > resolved by reading the stable kernel rules document and following what
> > it says there to do for that.
> > 
> > > And if so, should we drop the Fixes: tag completely in such cases, as
> > > Lin suggested in this thread?
> > 
> > If you don't want to ever be notified of any failures of stable patches
> > being applied as far back as they should be applied, sure, don't put a
> > Fixes: tag.  That means I do a "best effort" and just stop applying when
> > they don't apply anymore.
> > 
> > It also means that if you do NOT have a Fixes: tag, and the commit ends
> > up getting assigned a CVE, we have to assume that the bug has been there
> > since "the beginning of time" and will mark it as such.  Which might
> > cause you headaches if you are responsible for keeping older kernels
> > alive for vendors :)
> > 
> > your call.
> > 
> > hope this helps, and really, this should all be documented already,
> > right?  If not, what is missing (besides the CVE stuff)?
> 
> It does help, thank you.
> 
> Regarding documentation, I can only tell you that the stable kernel
> documentation only sparingly mentions the Fixes: tag, and it certainly
> doesn't mention what you detail above.
> 
> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#option-1

Yes, it does not mention Fixes: because that's not the way to get a
patch applied to the stable trees.

It says, in plain words:

	Option 1

	To have a patch you submit for mainline inclusion later
	automatically picked up for stable trees, add this tag in the
	sign-off area:
		Cc: stable@vger.kernel.org

Simple, complete, and correct, what more do you want?  :)

thanks,

greg k-h
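
For readers who want the tag semantics from this thread in one place, the following is a minimal sketch, not the tooling the stable maintainers actually use. The function name `describe_tags` and the keys it returns are hypothetical, and the regular expressions assume the address is written as stable@vger.kernel.org and that the Fixes: line sits on a single line. It only encodes what is stated above: Cc: stable requests the backport, Fixes: names the commit that introduced the problem, and a failure notification is expected only when both are present.

```python
import re

# Hypothetical patterns for the two tags discussed in this thread.
STABLE_RE = re.compile(
    r"^Cc:[ \t]*<?stable@vger\.kernel\.org>?(?:[ \t]*#[ \t]*(?P<range>.+))?$",
    re.IGNORECASE | re.MULTILINE,
)
FIXES_RE = re.compile(
    r'^Fixes:[ \t]*(?P<sha>[0-9a-f]{12,40})[ \t]*\("(?P<subject>.+)"\)$',
    re.MULTILINE,
)


def describe_tags(message: str) -> dict:
    """Summarize the stable-related tags found in a commit message."""
    stable = STABLE_RE.search(message)
    fixes = FIXES_RE.search(message)
    version_hint = None
    if stable and stable.group("range"):
        version_hint = stable.group("range").strip()
    return {
        # Cc: stable is what asks for the patch to be picked up.
        "backport_requested": stable is not None,
        # Optional "# v4.6+" style comment after the address.
        "version_hint": version_hint,
        # Fixes: only identifies the commit that caused the problem.
        "fixed_commit": fixes.group("sha") if fixes else None,
        # Per this thread: without Fixes:, backporting is best effort
        # and no failure mail is sent.
        "failure_mail_expected": stable is not None and fixes is not None,
    }


if __name__ == "__main__":
    example = (
        "drm/sched: Remove optimization that causes hang\n\n"
        "Cc: stable@vger.kernel.org # v4.6+\n"
        'Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")\n'
    )
    print(describe_tags(example))
```

A message carrying only a Fixes: line would report `backport_requested` as false in this sketch, matching the statement above that Fixes: is not the way to get a patch applied to the stable trees.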