<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
On 13.01.25 at 09:43, Philipp Stanner wrote:<br>
<blockquote type="cite" cite="mid:582e10673bb749f18ebf8a18f46ca573df396576.camel@redhat.com">[SNIP]<span style="white-space: pre-wrap">
</span>
<blockquote type="cite">
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">The handling of NULL values is half-baked.
In my opinion, you should define if drm_sched_pick_best() may put a NULL
into rq. If your answer is yes, it might put a NULL there; then, there
should be a BUG_ON(!entity->rq) after the invocation of
drm_sched_entity_select_rq().
If your answer is no, the BUG_ON() should be in drm_sched_pick_best().
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
Yeah good point.
We might not want a BUG_ON(), that is only justified when we prevent
further damage (e.g. random data corruption or similar).
I suggest using a WARN(!sched, "Submission without activated
scheduler!").
This way the system has at least a chance of survival should the
scheduler become ready later on.
On the other hand the BUG_ON() or the NULL pointer deref should only
kill the application thread which is submitting something before the
driver is resumed. So that might help to pinpoint where the actual
issue is.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
As I see it, the BUG_ON() would just be a prettier NULL pointer
deref. If we agree that this is effectively a misuse of the scheduler
API, we probably want to add it to make the failure prettier, though?</pre>
</blockquote>
<br>
The only alternative I can see is for the scheduler API to gracefully
handle submissions to non-ready schedulers, e.g. that
drm_sched_entity_push_job() detects this condition and, instead of
pushing the job, sets an error code and signals the fences.<br>
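<br>
To make that alternative concrete, a minimal userspace sketch of such a
graceful path could look like the following. It is only a model: the
sketch_* types and functions, the SKETCH_ENODEV code and the int return
value are assumptions for illustration and not the real DRM scheduler
API.<br>
<pre>
```c
/* Userspace model only; nothing here is the real DRM scheduler API. */
#define SKETCH_ENODEV 19 /* hypothetical stand-in for an errno-style code */

struct sketch_fence {
	int signaled; /* 1 once the fence has been signaled */
	int error;    /* 0 on success, negative error code otherwise */
};

struct sketch_sched {
	int ready;  /* 1 once the driver has brought the scheduler up */
	int queued; /* number of jobs accepted so far */
};

struct sketch_job {
	struct sketch_fence scheduled;
	struct sketch_fence finished;
};

/* Mark both fences of a job as signaled with the given error. */
static void sketch_job_fail(struct sketch_job *job, int error)
{
	job->scheduled.error = error;
	job->scheduled.signaled = 1;
	job->finished.error = error;
	job->finished.signaled = 1;
}

/*
 * Hypothetical graceful push path: a missing or non-ready scheduler
 * makes the job fail via its fences instead of causing a NULL pointer
 * dereference further down.
 */
static int sketch_entity_push_job(struct sketch_sched *sched,
				  struct sketch_job *job)
{
	if (!sched || !sched->ready) {
		sketch_job_fail(job, -SKETCH_ENODEV);
		return -SKETCH_ENODEV;
	}

	sched->queued++; /* normal path: the job is accepted */
	return 0;
}
```
</pre>
In this model, anyone waiting on the fences is woken up and sees the
error instead of the submitting thread crashing.<br>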
<br>
But that might not be a good idea.<br>
<br>
It just moves the crash from one place to another, and in general I
fully agree that the driver is misusing the scheduler API to do something
which won't work and could potentially crash the whole system.<br>
<br>
<blockquote type="cite" cite="mid:582e10673bb749f18ebf8a18f46ca573df396576.camel@redhat.com">
<pre class="moz-quote-pre" wrap="">@Philipp:
BTW, I only just discovered this thread by coincidence. Please use
get_maintainer. The scheduler currently has 4 maintainers, and none of
them is on CC.</pre>
</blockquote>
<br>
Oh, good point. I was already wondering why nobody else commented and
didn't realize that nobody was on CC.<br>
<br>
Thanks,<br>
Christian.<br>
<br>
<blockquote type="cite" cite="mid:582e10673bb749f18ebf8a18f46ca573df396576.camel@redhat.com">
<pre class="moz-quote-pre" wrap="">
Thanks,
P.
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">
Regards,
Christian.
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">
That helps guys with zero domain knowledge, like me, to figure out how
this is all supposed to work.
best regards,
Philipp
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
</pre>
</blockquote>
<br>
</body>
</html>