<div dir="ltr"><div dir="ltr"><span style="">On Tue, 20 Apr 2021 at 15:58, Christian König &lt;<a href="mailto:ckoenig.leichtzumerken@gmail.com">ckoenig.leichtzumerken@gmail.com</a>&gt; wrote:</span></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>
<div>On 20.04.21 at 16:53, Daniel
Stone wrote:</div><blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, 19 Apr 2021 at
11:48, Marek Olšák &lt;<a href="mailto:maraeo@gmail.com" target="_blank">maraeo@gmail.com</a>&gt; wrote:</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div><span>Deadlock mitigation to recover from
segfaults:</span><br>
</div>
<div>- The kernel knows which process is obliged to signal
which fence. This information is part of the Present
request and supplied by userspace.<br>
</div>
<div>- If the producer crashes, the kernel signals the
submit fence, so that the consumer can make forward
progress.</div>
<div>- If the consumer crashes, the kernel signals the
return fence, so that the producer can reclaim the
buffer.</div>
<div>- A GPU hang signals all fences. Other deadlocks will
be handled like GPU hangs.</div>
</div>
</blockquote>
<div><br>
</div>
<div>Another thought: with completely arbitrary userspace
fencing, none of this is helpful either. If the compositor
can't guarantee that a hostile client has not submitted a fence
which will never be signaled, then it won't be waiting on
it, so it already needs infrastructure to handle something
like this. </div>
</div>
</div>
</blockquote>
<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div>That already handles the crashed-client case, because if
the client crashes, then its connection will be dropped,
which will trigger the compositor to destroy all its
resources anyway, including any pending waits.</div>
</div>
</div>
</blockquote>
<br>
That's exactly the problem. A compositor isn't immediately informed
that the client crashed; instead, it is still referencing the buffer
and trying to use it for compositing.<br></div></blockquote><div><br></div><div>If the compositor no longer has a guarantee that the buffer will be ready for composition in a reasonable amount of time (which dma_fence gives us, and this proposal does not appear to give us), then the compositor isn't trying to use the buffer for compositing at all. Instead, it waits asynchronously for a notification that the fence has signaled before it attempts to use the buffer.</div><div><br></div><div>Marek's initial suggestion is that the kernel signal the fence, which would unblock composition (and presumably show garbage on screen, or at best jump back to old content).</div><div><br></div><div>My position is that the compositor will know the process has crashed anyway, because its socket has been closed, at which point we destroy all the client's resources, including its windows and buffers, regardless. Signaling the fence doesn't give us any value here, _unless_ the compositor is just blindly waiting for the fence to signal ... which it can't do, because there's no guarantee the fence will ever signal.</div><div> </div><div>Cheers,</div><div>Daniel</div></div></div>
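<div>A minimal sketch of the compositor-side behaviour argued for above, in Python purely for illustration. Everything in it is an assumption, not code from any real compositor: the Compositor class and its method names are hypothetical, the read end of a pipe stands in for a fence fd that becomes readable once signaled, and a socket.socketpair() stands in for the client connection. It only shows the shape of the argument: the compositor polls for fence readiness instead of blocking, and a closed client socket tears down the pending waits without the kernel having to signal anything.</div>

```python
import os
import selectors
import socket


class Compositor:
    """Toy event loop that never blocks on a fence; it only polls for readiness."""

    def __init__(self):
        self.sel = selectors.DefaultSelector()
        self.pending = {}  # client socket: list of fence fds still being waited on
        self.log = []

    def add_client(self, sock):
        self.pending[sock] = []
        self.sel.register(sock, selectors.EVENT_READ, ("client", sock))

    def wait_on_fence(self, sock, fence_fd):
        # Asynchronous wait: the buffer is only used once this fd is readable.
        self.pending[sock].append(fence_fd)
        self.sel.register(fence_fd, selectors.EVENT_READ, ("fence", sock))

    def drop_client(self, sock):
        # Destroy all of the client's resources, including pending fence waits.
        for fence_fd in self.pending.pop(sock):
            self.sel.unregister(fence_fd)
            os.close(fence_fd)
        self.sel.unregister(sock)
        sock.close()
        self.log.append("client dropped")

    def dispatch(self, timeout=0.1):
        for key, _events in self.sel.select(timeout):
            kind, sock = key.data
            if sock not in self.pending:
                continue  # client was dropped earlier in this same batch
            if kind == "client":
                if sock.recv(1) == b"":  # EOF: the peer closed its socket
                    self.drop_client(sock)
            elif key.fileobj in self.pending[sock]:
                # Fence signaled: stop waiting, the buffer may now be composited.
                self.pending[sock].remove(key.fileobj)
                self.sel.unregister(key.fileobj)
                os.close(key.fileobj)
                self.log.append("buffer ready")


def demo():
    comp = Compositor()
    server_side, client_side = socket.socketpair()
    comp.add_client(server_side)

    r1, w1 = os.pipe()  # fence 1: will be signaled
    r2, w2 = os.pipe()  # fence 2: never signaled (hostile or crashed client)
    comp.wait_on_fence(server_side, r1)
    comp.wait_on_fence(server_side, r2)

    comp.dispatch(timeout=0)  # nothing is ready; the compositor keeps running
    os.write(w1, b"x")        # "signal" fence 1
    comp.dispatch()
    client_side.close()       # the client crashes: its socket hangs up
    comp.dispatch()

    os.close(w1)
    os.close(w2)
    return comp.log
```

<div>In this sketch, calling demo() logs the first buffer becoming ready and then the client being dropped. The wait on the second fence is simply discarded along with the rest of the client's resources, and nothing ever needs to signal it.</div>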