[PATCH 1/8] drm/sched: Allow drivers to skip the reset and keep on running
Maíra Canal
mcanal at igalia.com
Wed May 7 12:33:31 UTC 2025
Hi Tvrtko,
Thanks for the review!
On 06/05/25 08:32, Tvrtko Ursulin wrote:
>
> On 03/05/2025 21:59, Maíra Canal wrote:
>> When the DRM scheduler times out, it's possible that the GPU isn't hung;
>> instead, a job may still be running, and there may be no valid reason to
>> reset the hardware. This can occur in two situations:
>>
>> 1. The GPU exposes some mechanism that ensures the GPU is still making
>> progress. By checking this mechanism, we can safely skip the reset,
>> rearm the timeout, and allow the job to continue running until
>> completion. This is the case for v3d and Etnaviv.
>> 2. TDR has fired before the IRQ that signals the fence. Consequently,
>> the job actually finishes, but it triggers a timeout before
>> signaling
>> the completion fence.
>>
>> These two scenarios are problematic because we remove the job from the
>> `sched->pending_list` before calling `sched->ops->timedout_job()`. This
>> means that when the job finally signals completion (e.g. in the IRQ
>> handler), the scheduler won't call `sched->ops->free_job()`. As a result,
>> the job and its resources won't be freed, leading to a memory leak.
>>
>> To resolve this issue, we create a new `drm_gpu_sched_stat` that allows a
>> driver to skip the reset. This new status will indicate that the job
>> should be reinserted into the pending list, and the driver will still
>> signal its completion.
>
[...]
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index
>> 1a7e377d4cbb4fc12ed93c548b236970217945e8..fe9043b6d43141bee831b5fc16b927202a507d51 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -389,11 +389,13 @@ struct drm_sched_job {
>> * @DRM_GPU_SCHED_STAT_NONE: Reserved. Do not use.
>> * @DRM_GPU_SCHED_STAT_NOMINAL: Operation succeeded.
>> * @DRM_GPU_SCHED_STAT_ENODEV: Error: Device is not available anymore.
>> + * @DRM_GPU_SCHED_STAT_RUNNING: GPU is still running, so skip the reset.
>
> s/GPU/job/ ?
>
>> */
>> enum drm_gpu_sched_stat {
>> DRM_GPU_SCHED_STAT_NONE,
>> DRM_GPU_SCHED_STAT_NOMINAL,
>> DRM_GPU_SCHED_STAT_ENODEV,
>> + DRM_GPU_SCHED_STAT_RUNNING,
>
> I am wondering if we could make it more obvious what is the difference
> between "nominal" and "running" and from whose point of view should
> those statuses be considered.
> > So far we have "nominal" which means scheduler/hardware is working
fine
> but the job may or may have not been cancelled. With "running" we kind
> of split it into two sub-statuses and it would be nice for that to be
> intuitively visible from the naming. But I struggle to suggest an
> elegant name while preserving nominal as is.
I was thinking: how about changing DRM_GPU_SCHED_STAT_NOMINAL to
DRM_GPU_SCHED_STAT_RESET (the hardware is fine, but we reset it)?
Then, when we skip the reset, we would have DRM_GPU_SCHED_STAT_NOMINAL
(which means the hardware is fine and we didn't reset it).
I'm open to other suggestions.
>
> Thinking out loud here - perhaps that is pointing towards an alternative
> that instead of a new status, a new helper to re-insert the single job
> (like drm_sched_resubmit_job(sched, job)) would fit better? Although it
> would be more churn.
>
Although your solution might be more elegant, I'm worried that such a
function could be used improperly by new users (e.g. being called in
contexts other than `timedout_job()`).
I'd prefer to have a new status as it'll be use solely for
`timedout_job()` (making it harder for users to use it inappropriately).
With the addition of Matthew's feedback (calling
`drm_sched_run_free_queue()` after adding the job to the pending list),
I think it makes even more sense to keep it inside the timeout function.
I hope others can chime in and give their opinions about your idea.
Best Regards,
- Maíra
> Regards,
>
> Tvrtko
>
>> };
>> /**
>>
>
More information about the Intel-xe
mailing list