<!DOCTYPE html> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> </head> <body> <div class="moz-cite-prefix">On 15.01.24 18:54, Michel Dänzer wrote:<br> </div> <blockquote type="cite" cite="mid:7194a09a-afe8-4eae-8288-c72e2ac7d0a6@daenzer.net"> <pre class="moz-quote-pre" wrap="">On 2024-01-15 18:26, Friedrich Vock wrote: [snip] </pre> <blockquote type="cite"><span style="white-space: pre-wrap"> </span> <pre class="moz-quote-pre" wrap="">The fundamental problem here is that not telling applications that something went wrong when you just canceled their work midway is an out-of-spec hack. When there is a report of real-world apps breaking because of that hack, reports of different apps working (even if it's convenient that they work) don't justify keeping the broken code. </pre> </blockquote> <pre class="moz-quote-pre" wrap=""> If the breaking apps hit multiple soft resets in a row, I've laid out a pragmatic solution which covers both cases. </pre> </blockquote> Hitting a soft reset every time is the lucky path. Once GPU work is interrupted out of nowhere, all bets are off, and it might just as well trigger a full system hang next time. No hang recovery should be able to cause that under any circumstances.<br> <blockquote type="cite" cite="mid:7194a09a-afe8-4eae-8288-c72e2ac7d0a6@daenzer.net"> <pre class="moz-quote-pre" wrap=""> </pre> <blockquote type="cite"> <pre class="moz-quote-pre" wrap="">If mutter needs to be robust against faults it caused itself, it should be robust against GPU resets. </pre> </blockquote> <pre class="moz-quote-pre" wrap=""> It's unlikely that the hangs I've seen were caused by mutter itself, more likely Mesa or amdgpu. Anyway, this will happen at some point; the reality is that it hasn't yet, though. </pre> </blockquote> </body> </html>