<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Aug 10, 2023 at 8:06 PM Danilo Krummrich <<a href="mailto:dakr@redhat.com">dakr@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">If a sched job depends on a dma-fence from a job from the same GPU<br>
scheduler instance, but a different scheduler entity, the GPU scheduler<br>
does only wait for the particular job to be scheduled, rather than for<br>
the job to fully complete. This is due to the GPU scheduler assuming<br>
that there is a scheduler instance per ring. However, the current<br>
implementation, in order to avoid arbitrary amounts of kthreads, has a<br>
single scheduler instance while scheduler entities represent rings.<br>
<br>
As a workaround, set the DRM_SCHED_FENCE_DONT_PIPELINE for all<br>
out-fences in order to force the scheduler to wait for full job<br>
completion for dependent jobs from different entities and same scheduler<br>
instance.<br>
<br>
There is some work in progress [1] to address the issues of firmware<br>
schedulers; once it is in-tree the scheduler topology in Nouveau should<br>
be re-worked accordingly.<br>
<br>
[1] <a href="https://lore.kernel.org/dri-devel/20230801205103.627779-1-matthew.brost@intel.com/" rel="noreferrer" target="_blank">https://lore.kernel.org/dri-devel/20230801205103.627779-1-matthew.brost@intel.com/</a><br>
<br>
Signed-off-by: Danilo Krummrich &lt;<a href="mailto:dakr@redhat.com" target="_blank">dakr@redhat.com</a>&gt;<br>
---<br>
 drivers/gpu/drm/nouveau/nouveau_sched.c | 22 ++++++++++++++++++++++<br>
 1 file changed, 22 insertions(+)<br>
<br>
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c<br>
index 3424a1bf6af3..88217185e0f3 100644<br>
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c<br>
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c<br>
@@ -292,6 +292,28 @@ nouveau_job_submit(struct nouveau_job *job)<br>
        if (job->sync)<br>
                done_fence = dma_fence_get(job->done_fence);<br>
<br>
+       /* If a sched job depends on a dma-fence from a job from the same GPU<br>
+        * scheduler instance, but a different scheduler entity, the GPU<br>
+        * scheduler does only wait for the particular job to be scheduled,<br></blockquote><div><br></div><div>s/does only wait/only waits/</div><div><br></div><div>Reviewed-by: Faith Ekstrand &lt;faith.ekstrand@collabora.com&gt;</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
+        * rather than for the job to fully complete. This is due to the GPU<br>
+        * scheduler assuming that there is a scheduler instance per ring.<br>
+        * However, the current implementation, in order to avoid arbitrary<br>
+        * amounts of kthreads, has a single scheduler instance while scheduler<br>
+        * entities represent rings.<br>
+        *<br>
+        * As a workaround, set the DRM_SCHED_FENCE_DONT_PIPELINE for all<br>
+        * out-fences in order to force the scheduler to wait for full job<br>
+        * completion for dependent jobs from different entities and same<br>
+        * scheduler instance.<br>
+        *<br>
+        * There is some work in progress [1] to address the issues of firmware<br>
+        * schedulers; once it is in-tree the scheduler topology in Nouveau<br>
+        * should be re-worked accordingly.<br>
+        *<br>
+        * [1] <a href="https://lore.kernel.org/dri-devel/20230801205103.627779-1-matthew.brost@intel.com/" rel="noreferrer" target="_blank">https://lore.kernel.org/dri-devel/20230801205103.627779-1-matthew.brost@intel.com/</a><br>
+        */<br>
+       set_bit(DRM_SCHED_FENCE_DONT_PIPELINE, &job->done_fence->flags);<br>
+<br>
        if (job->ops->armed_submit)<br>
                job->ops->armed_submit(job);<br>
<br>
<br>
base-commit: 68132cc6d1bcbc78ade524c6c6c226de42139f0e<br>
-- <br>
2.41.0<br>
<br>
</blockquote></div></div>