<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - glmark2-es2-wayland shortly freezes on some frames with egl_dri2 backend (Nouveau/GK20A)"
href="https://bugs.freedesktop.org/show_bug.cgi?id=86690#c4">Comment # 4</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - glmark2-es2-wayland shortly freezes on some frames with egl_dri2 backend (Nouveau/GK20A)"
href="https://bugs.freedesktop.org/show_bug.cgi?id=86690">bug 86690</a>
from <span class="vcard"><a class="email" href="mailto:gnurou@gmail.com" title="Alexandre Courbot &lt;gnurou@gmail.com&gt;"> <span class="fn">Alexandre Courbot</span></a>
</span></b>
<pre>As discussed on IRC, it appears that running "perf top" at the same time as any
Weston EGL client produces the same frozen frames in said client,
without affecting Weston itself or other non-EGL clients. "perf top" reports
high usage (~30% CPU time) of _raw_spin_unlock_irq() *only* when the condition
occurs.
Very strange. I have run strace on the EGL client for both conditions
(non-capped EGL program, and EGL program while "perf top" is running), and
noticed that these freezes correspond to long consecutive series of calls to
sched_yield(), e.g.:
17:42:56.337425 sched_yield() = 0
17:42:56.337830 sched_yield() = 0
...
17:42:57.713679 sched_yield() = 0
17:42:57.714122 sched_yield() = 0
sched_yield() is never called in the case of a well-behaving client.
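
A minimal sketch of how such series could be located in a long strace log,
assuming the log was captured with "strace -tt" (one timestamped syscall per
line, as in the excerpt above). yield_runs is a hypothetical helper written for
illustration only; it ignores midnight wrap-around in the timestamps.

```python
import re
from datetime import datetime

# Matches lines such as "17:42:56.337425 sched_yield() = 0" (strace -tt style).
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6})\s+(\w+)\(")


def yield_runs(lines, syscall="sched_yield"):
    """Return (first_timestamp, call_count, duration_seconds) per consecutive run."""
    runs = []
    current = []  # timestamps of the run currently being accumulated
    for line in list(lines) + [""]:  # the trailing sentinel flushes the last run
        match = LINE_RE.match(line)
        if match and match.group(2) == syscall:
            current.append(datetime.strptime(match.group(1), "%H:%M:%S.%f"))
            continue
        if current:
            duration = (current[-1] - current[0]).total_seconds()
            runs.append((current[0].strftime("%H:%M:%S.%f"), len(current), duration))
            current = []
    return runs


# Sample lines copied from this comment; the elided ("...") part is left out,
# so the call count only reflects the lines quoted here.
sample = [
    "17:42:56.266742 ioctl(6, 0xc0406481, 0xbeeec788) = 0",
    "17:42:56.268952 sched_yield() = 0",
    "17:42:56.269466 sched_yield() = 0",
    "17:42:56.281481 sched_yield() = 0",
    "17:42:56.281848 sched_yield() = 0",
    "17:42:56.282210 ioctl(6, 0xc0406481, 0xbeeec750) = 0",
]
for start, count, seconds in yield_runs(sample):
    print(start, count, seconds)
```

Sorting the returned runs by duration should make the frozen frames stand out
from the short waits.
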
Grepping through Mesa, I found one loop in nouveau_fence_wait() where
sched_yield() is called. Removing this call does not fix the issue, but the
calls to sched_yield() are no longer visible in the trace, which tends to
confirm this is where the misbehavior is happening.
The series of sched_yield() calls always falls between the two following ioctls:
17:42:56.266742 ioctl(6, 0xc0406481, 0xbeeec788) = 0
17:42:56.268952 sched_yield() = 0
17:42:56.269466 sched_yield() = 0
...
17:42:56.281481 sched_yield() = 0
17:42:56.281848 sched_yield() = 0
17:42:56.282210 ioctl(6, 0xc0406481, 0xbeeec750) = 0
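
For reference, the request number 0xc0406481 seen in these ioctls can be
decoded by hand. The sketch below assumes the asm-generic _IOC bit layout
(8-bit nr, 8-bit type, 14-bit size, 2-bit direction); decode_ioctl is a
hypothetical helper, not part of any existing tool.

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number into its _IOC fields.

    Assumes the asm-generic layout; modulo is used as the bit mask.
    """
    nr = request % 0x100                 # bits 0-7: command number
    ioc_type = (request >> 8) % 0x100    # bits 8-15: driver "magic" character
    size = (request >> 16) % 0x4000      # bits 16-29: argument size in bytes
    direction = request >> 30            # bits 30-31: 1 = write, 2 = read, 3 = both
    return {"dir": direction, "type": chr(ioc_type), "size": size, "nr": nr}


fields = decode_ioctl(0xC0406481)
print(fields)  # {'dir': 3, 'type': 'd', 'size': 64, 'nr': 129}
```

Type 'd' is the DRM ioctl magic, and nr 129 (0x81) is 0x41 past
DRM_COMMAND_BASE (0x40). If the nouveau uAPI header is read correctly, that
appears to correspond to DRM_NOUVEAU_GEM_PUSHBUF, i.e. a command submission
with a 64-byte read/write argument.
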
With the calls to sched_yield() removed, the strace log shows large delays
between these ioctls, hinting again that nouveau_fence_wait() is busy-waiting
in that loop:
17:56:00.248145 ioctl(6, 0xc0406481, 0xbecf2768) = 0
17:56:01.155312 ioctl(6, 0xc0406481, 0xbecf2730) = 0
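
The delay between such ioctls can be computed directly from the timestamps.
This is a small sketch under the same "strace -tt" assumption; ioctl_gaps and
stamp are hypothetical helper names, and the input lines are the two quoted
just above.

```python
from datetime import datetime


def stamp(line):
    """Parse the leading HH:MM:SS.microseconds timestamp of a strace -tt line."""
    return datetime.strptime(line.split()[0], "%H:%M:%S.%f")


def ioctl_gaps(lines):
    """Return the delay in seconds between each pair of consecutive ioctl lines."""
    stamps = [stamp(line) for line in lines if " ioctl(" in line]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


trace = [
    "17:56:00.248145 ioctl(6, 0xc0406481, 0xbecf2768) = 0",
    "17:56:01.155312 ioctl(6, 0xc0406481, 0xbecf2730) = 0",
]
gaps = ioctl_gaps(trace)
print(gaps)
```

For these two lines the gap comes to roughly 0.907 s, far longer than a single
frame.
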
What is really remarkable is that this is true for both cases: a non-capped EGL
client, or a (capped or non-capped) EGL client while "perf top" is running. The
same ioctls are involved, in the same pattern.
While this seems to make it clear that in both cases we are waiting for fences,
I still cannot imagine how "perf top" can have such a strong influence on this.
I have attached a full strace of weston-simple-egl in case it is helpful.</pre>
</div>
</p>
</body>
</html>