[Mesa-dev] Determinism in the results of llvmpipe?

Fri Nov 18 03:19:39 UTC 2016

On 18.11.2016 at 02:11, Ilia Mirkin wrote:
> On Thu, Nov 17, 2016 at 2:37 AM, Andrew A. <andj2223 at gmail.com> wrote:
>> Hello,
>>
>> I'm using Mesa's software renderer for the purposes of regression
>> testing in our graphics software. We render various scenes, save a
>> screencap of the framebuffer for each scene, then compare those
>> framebuffer captures to previously known-good captures.
>>
>> Across runs of these tests on the same hardware, the results seem to
>> be 100% identical. When running the same tests on a different machine,
>> results are *slightly* different. It's very similar within a small
>> tolerance, so this is still usable. However, I was hoping for fully
>> deterministic behavior, even if the hardware is slightly different.
>> Are there some compile time settings or some code that I can change to
>> get Mesa's llvmpipe renderer/rasterizer to be fully deterministic in
>> its output?
> 
> You can force the AVX-capable CPU to run in SSE mode. You can do this
> by setting the environment variable
> 
> LP_NATIVE_VECTOR_WIDTH=128
> 
>> I'm using llvmpipe, and these are the two different CPUs I'm using to
>> run the tests:
>> Intel(R) Xeon(R) CPU E3-1275 v3
>> Intel(R) Xeon(R) CPU X5650
> 
> The former has AVX, while the latter does not. I believe this explains
> the difference.
> 

Yep, forcing 128-bit vectors should do the trick. Note that, in general,
8-wide execution (which we use with AVX) vs. 4-wide should not actually
make a difference (neither should AVX itself, but we disable that too if
you force 128-bit vectors), except:
- we use FMA if available (on Intel CPUs, at least, this even requires
AVX2, but the former CPU is Haswell, so it has it), so mul+add pairs get
turned into FMA, which is of course numerically different. (Setting
128-bit vectors will force this off too.)
- there used to be different attribute interpolation code in llvmpipe
depending on 4-wide vs. 8-wide execution. The alternative code is still
there but is disabled now: we switched to always using the version with
higher precision. You didn't specify the Mesa version, though, so you
may still be affected. This difference was actually pretty annoying, as
some tests are quite sensitive to it; I'd say it was far more
significant than the one due to FMA.
- the texture sampling code might choose a different code path (AoS vs.
SoA filtering; the latter has higher precision) for 4-wide vs. 8-wide
vectors, depending on the exact sampling environment (for instance,
whether per-pixel LOD is needed). Even if AoS filtering is chosen in
both cases, the way texture coordinate wrapping is done within it also
differs based on whether AVX is available. I think this should return
the same results, but I wouldn't quite guarantee it... At least the
filtering itself within the AoS path will be the same.
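
To illustrate the FMA point from the list above, here is a minimal
sketch in Python. It is not llvmpipe code: it uses double precision
rather than the single-precision floats a rasterizer would normally work
with, and the helper names mul_add and fused_mul_add are made up for
this example. The fused result is emulated by computing a*b+c exactly
with rationals and rounding once.

```python
from fractions import Fraction


def mul_add(a: float, b: float, c: float) -> float:
    # Separate multiply and add: the product is rounded to a double,
    # then the sum is rounded a second time.
    return a * b + c


def fused_mul_add(a: float, b: float, c: float) -> float:
    # Emulated FMA: a*b+c is computed exactly as a rational number and
    # rounded to a double only once at the end.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the exact product, 1 - 2**-60, is not
# representable as a double and rounds to 1.0.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

separate = mul_add(a, b, c)
fused = fused_mul_add(a, b, c)

print("mul + add:", separate)
print("fused    :", fused)
print("identical:", separate == fused)
```

With these inputs the separate multiply and add loses the low bits of
the product and yields zero, while the single rounding keeps them. In a
rendered image, differences of this kind would typically show up as
small per-pixel deviations rather than as visibly different output.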

If you used non-x86 CPUs, you would most likely get some more
differences (we switch off denorms on x86 SIMD, for instance, and this
stuff is very much arch-dependent), and there might actually be really
different results in some cases (that is, bugs...). For x86, at least,
if you have SSE4.1, that should be all.
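
For the x86 case, a test harness could set the variable for each render
process instead of relying on the shell environment. The following is
only a sketch: LP_NATIVE_VECTOR_WIDTH is the variable named in this
thread, but the helper llvmpipe_env and the child command are
placeholders invented for the example. The child here just echoes the
variable back; a real harness would launch the rendering test instead.

```python
import os
import subprocess
import sys


def llvmpipe_env(vector_width: int = 128) -> dict:
    # Copy the current environment and force llvmpipe's native vector
    # width, so AVX-capable and SSE-only machines use the same width.
    env = dict(os.environ)
    env["LP_NATIVE_VECTOR_WIDTH"] = str(vector_width)
    return env


# Placeholder child process: prints the variable it received.
child_cmd = [
    sys.executable,
    "-c",
    "import os; print(os.environ['LP_NATIVE_VECTOR_WIDTH'])",
]

result = subprocess.run(
    child_cmd,
    env=llvmpipe_env(128),
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Setting the variable per process keeps the setting next to the test
definition, so every machine that runs the suite picks it up.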

Roland