<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Exchange Server">
<!-- converted from rtf -->
<style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>
</head>
<body>
<font face="Calibri" size="2"><span style="font-size:10pt;">
<div style="padding-right:5pt;padding-left:5pt;"><font color="blue">[AMD Official Use Only - AMD Internal Distribution Only]<br>
</font></div>
<div style="margin-top:5pt;"><font face="Times New Roman" size="3"><span style="font-size:12pt;"><br>
</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">I agree with the overall approach and support Chris's suggestion. The function amdgpu_job_stop_all_jobs_on_sched is now only applicable to a few older AMD hardware models when they encounter uncorrectable
hardware errors. We have discontinued this method for several generations of newer hardware.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">CC <a href="mailto:christian.koenig@amd.com"><font color="#2B579A"><span style="background-color:#E1DFDD;">@Christian König</span></font></a>/<a href="mailto:Alexander.Deucher@amd.com"><font color="#2B579A"><span style="background-color:#E1DFDD;">@Deucher,
Alexander</span></font></a></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">Regards,<br>
Hawking</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">-----Original Message-----<br>
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Christian König<br>
Sent: Thursday, February 6, 2025 21:47<br>
To: phasta@kernel.org; Tvrtko Ursulin <tvrtko.ursulin@igalia.com>; amd-gfx@lists.freedesktop.org<br>
Cc: kernel-dev@igalia.com; Danilo Krummrich <dakr@kernel.org>; Matthew Brost <matthew.brost@intel.com><br>
Subject: Re: [PATCH 1/4] drm/scheduler: Add drm_sched_cancel_all_jobs helper</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">Am 06.02.25 um 14:35 schrieb Philipp Stanner:</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> On Wed, 2025-02-05 at 15:33 +0000, Tvrtko Ursulin wrote:</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> The helper copies code from the existing </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> amdgpu_job_stop_all_jobs_on_sched with the purpose of reducing the </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> amount of driver code which directly touch scheduler internals.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> If or when amdgpu manages to change the approach for handling the </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> permanently wedged state this helper can be removed.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> Have you checked how many other drivers might need such a helper?</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> I have a bit mixed feelings about this, because, AFAICT, in the past </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> helpers have been added for just 1 driver, such as </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> drm_sched_wqueue_ready(), and then they have stayed for almost a </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> decade.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> AFAIU this is just code move, and only really "decouples" amdgpu in </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> the sense of having an official scheduler function that does what </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> amdgpu used to do.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> So my tendency here would be to continue "allowing" amdgpu to touch </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> the scheduler internals until amdgpu fixes this "permanently wedged </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> state". And if that's too difficult, couldn't the helper reside in a </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> amdgpu/sched_helpers.c or similar?</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> I think that's better than adding 1 helper for just 1 driver and then </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> supposedly removing it again in the future.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">Yeah, agree to that general approach.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">What amdgpu does here is kind of nasty and looks unnecessary, but changing it means we need time from Hawkings and his people involved on RAS for amdgpu.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">When we move the code to the scheduler we make it official scheduler interface to others to replicate and that is exactly what we should try to avoid.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">So my suggestion is to add a /* TODO: This is nasty and should be avoided */ to the amdgpu code instead.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">Regards,</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">Christian.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">> P.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> Signed-off-by: Tvrtko Ursulin <<a href="mailto:tvrtko.ursulin@igalia.com">tvrtko.ursulin@igalia.com</a>></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> Cc: Christian König <<a href="mailto:christian.koenig@amd.com">christian.koenig@amd.com</a>></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> Cc: Danilo Krummrich <<a href="mailto:dakr@kernel.org">dakr@kernel.org</a>></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> Cc: Matthew Brost <<a href="mailto:matthew.brost@intel.com">matthew.brost@intel.com</a>></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> Cc: Philipp Stanner <<a href="mailto:phasta@kernel.org">phasta@kernel.org</a>></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> ---</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> drivers/gpu/drm/scheduler/sched_main.c | 44</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> ++++++++++++++++++++++++++</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> include/drm/gpu_scheduler.h | 1 +</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> 2 files changed, 45 insertions(+)</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>></span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> b/drivers/gpu/drm/scheduler/sched_main.c</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> index a48be16ab84f..0363655db22d 100644</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> --- a/drivers/gpu/drm/scheduler/sched_main.c</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +++ b/drivers/gpu/drm/scheduler/sched_main.c</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> @@ -703,6 +703,50 @@ void drm_sched_start(struct drm_gpu_scheduler </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> *sched, int errno)</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> }</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> EXPORT_SYMBOL(drm_sched_start);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +/**</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + * drm_sched_cancel_all_jobs - Cancel all queued and scheduled jobs</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + *</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + * @sched: scheduler instance</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + * @errno: error value to set on signaled fences</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + *</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + * Signal all queued and scheduled jobs and set them to error state.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + *</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + * Scheduler must be stopped before calling this.</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + */</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +void drm_sched_cancel_all_jobs(struct drm_gpu_scheduler *sched, int</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> errno)</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +{</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + struct drm_sched_entity *entity;</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + struct drm_sched_fence *s_fence;</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + struct drm_sched_job *job;</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + enum drm_sched_priority p;</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + drm_WARN_ON_ONCE(sched, !sched->pause_submit);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + /* Signal all jobs not yet scheduled */</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + for (p = DRM_SCHED_PRIORITY_KERNEL; p < sched->num_rqs; p++)</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> {</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + struct drm_sched_rq *rq = sched->sched_rq[p];</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + spin_lock(&rq->lock);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + list_for_each_entry(entity, &rq->entities, list) {</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + while ((job =</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> to_drm_sched_job(spsc_queue_pop(&entity->job_queue)))) {</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + s_fence = job->s_fence;</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + dma_fence_signal(&s_fence-</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>>> scheduled);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + dma_fence_set_error(&s_fence-</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>>> finished, errno);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + dma_fence_signal(&s_fence-</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>>> finished);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + }</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + }</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + spin_unlock(&rq->lock);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + }</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + /* Signal all jobs already scheduled to HW */</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + list_for_each_entry(job, &sched->pending_list, list) {</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + s_fence = job->s_fence;</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + dma_fence_set_error(&s_fence->finished, errno);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + dma_fence_signal(&s_fence->finished);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> + }</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +}</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +EXPORT_SYMBOL(drm_sched_cancel_all_jobs);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> /**</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> * drm_sched_resubmit_jobs - Deprecated, don't use in new code!</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> *</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> diff --git a/include/drm/gpu_scheduler.h </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> b/include/drm/gpu_scheduler.h index a0ff08123f07..298513f8c327 100644</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> --- a/include/drm/gpu_scheduler.h</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +++ b/include/drm/gpu_scheduler.h</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> @@ -579,6 +579,7 @@ void drm_sched_wqueue_stop(struct </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> drm_gpu_scheduler *sched);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> void drm_sched_stop(struct drm_gpu_scheduler *sched, struct </span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> drm_sched_job *bad);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> void drm_sched_start(struct drm_gpu_scheduler *sched, int errno);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> +void drm_sched_cancel_all_jobs(struct drm_gpu_scheduler *sched, int</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> errno);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> void drm_sched_increase_karma(struct drm_sched_job *bad);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;">>> void drm_sched_reset_karma(struct drm_sched_job *bad);</span></font></div>
<div><font face="Calibri" size="2"><span style="font-size:11pt;"> </span></font></div>
</span></font>
</body>
</html>