<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED - [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr"
href="https://bugs.freedesktop.org/show_bug.cgi?id=97102#c12">Comment # 12</a>
on <a class="bz_bug_link
bz_status_REOPENED "
title="REOPENED - [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr"
href="https://bugs.freedesktop.org/show_bug.cgi?id=97102">bug 97102</a>
from <span class="vcard"><a class="email" href="mailto:0xe2.0x9a.0x9b@gmail.com" title="Jan Ziak <0xe2.0x9a.0x9b@gmail.com>"> <span class="fn">Jan Ziak</span></a>
</span></b>
<pre>(In reply to Bruce Cherniak from <a href="show_bug.cgi?id=97102#c11">comment #11</a>)
<span class="quote">> As Tim suggests, pruning empty nodes is probably the best solution for the
> crash.
>
> For performance, however, I'm not sure how many cores to expose in your
> case. cpuinfo shows that there are 4 threads across 2 cores, which we
> detect as 2 cores, with 2 hyperthreads. Due to the way OpenSWR loads the
> processor, we have found that not using the hyperthreads as OpenSWR workers
> yields the best performance. This may or may not be the case with your
> processor.
>
> Something you can try is to set the environment variable
> KNOB_MAX_THREADS_PER_CORE=0. This will allow OpenSWR to use all 4 threads.
>
> Please report back on how this affects performance.</span >
In terms of performance, an AMD dual-core x86 module is close to two separate
x86 cores:
- Kaveri/Steamroller module: 1 instruction fetch unit, 2 instruction decoders,
2 integer cores, 1 AVX core, 1 L1i cache, 2 L1d caches
- Two separate cores: 2 instruction fetch units, 2 instruction decoders,
2 integer cores, 2 AVX cores, 2 L1i caches, 2 L1d caches
In my experience, the claim that an x86 module is close to 2 separate cores is
generally true. Many programs (gcc (make -j4), ...) scale nearly as well on a
module as they do on two separate x86 cores.
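One way to see the "4 threads across 2 cores" layout mentioned in the quoted
comment is to compare the number of logical CPUs with the number of distinct
(physical id, core id) pairs in /proc/cpuinfo. The following is only a minimal
Python sketch, assuming the Linux x86 field names "processor", "physical id"
and "core id". The function name count_cpus and the SAMPLE text are
hypothetical illustrations, not OpenSWR code and not output captured from this
machine.

```python
def count_cpus(cpuinfo_text):
    """Return (logical_cpus, physical_cores) parsed from /proc/cpuinfo text."""
    logical = 0
    cores = set()
    physical_id = core_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            # A new logical CPU entry starts here; forget the previous ids.
            logical += 1
            physical_id = core_id = None
        elif key == "physical id":
            physical_id = value
        elif key == "core id":
            core_id = value
        # Once both ids of the current entry are known, record the core.
        if physical_id is not None and core_id is not None:
            cores.add((physical_id, core_id))
    return logical, len(cores)


# Hypothetical, heavily trimmed sample shaped like the case in this bug:
# 4 logical CPUs that share 2 core ids.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 0

processor : 2
physical id : 0
core id : 1

processor : 3
physical id : 0
core id : 1
"""

print(count_cpus(SAMPLE))
```

On a real system the same function can be given the contents of /proc/cpuinfo
instead of SAMPLE. Two logical CPUs that share a core id may be SMT siblings or
the two integer cores of one module; cpuinfo alone does not tell them apart.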
----
# export LIBGL_ALWAYS_SOFTWARE=1
# export GALLIUM_DRIVER=swr
# glxgears
350.080 FPS
# KNOB_MAX_THREADS_PER_CORE=0 glxgears
615.980 FPS
----
Unigine Sanctuary 1.6.3 1024x768_windowed:
Default: 0.166578 FPS
KNOB_MAX_THREADS_PER_CORE=0: 0.440662 FPS</pre>
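<pre>The relative gain implied by the figures above can be worked out with a few
lines of Python. This is only a sketch of the arithmetic: the FPS values are
copied from the measurements in this comment, while the names RESULTS and
speedup are illustrative.

```python
# FPS figures copied from the measurements in this comment.
# Each entry: (default FPS, FPS with KNOB_MAX_THREADS_PER_CORE=0).
RESULTS = {
    "glxgears": (350.080, 615.980),
    "Unigine Sanctuary 1.6.3": (0.166578, 0.440662),
}


def speedup(default_fps, all_threads_fps):
    """Return how many times faster the all-threads run is."""
    return all_threads_fps / default_fps


if __name__ == "__main__":
    for name, (default_fps, all_threads_fps) in RESULTS.items():
        ratio = speedup(default_fps, all_threads_fps)
        print(f"{name}: {ratio:.2f}x")
```

This gives roughly 1.76x for glxgears and 2.65x for Unigine Sanctuary. Each
figure comes from a single run, so the ratios are indicative only.</pre>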
</div>
</p>
</body>
</html>