[Mesa-dev] Low interpolation precision for 8 bit textures using llvmpipe

Fri Apr 12 12:34:44 UTC 2019

Hi Roland!

On 4/11/19 8:18 PM, Roland Scheidegger wrote:
> What version of mesa are you using?
The original results were generated using version 19.0.2 (from the Arch
Linux repositories), but I got the same results using the current git
version (98934e6aa19795072a353dae6020dafadc76a1e3).
> The debug flags were changed a while ago (so that those perf tweaks can
> be disabled on release builds too), it needs to be either:
> GALLIVM_PERF=no_rho_approx,no_brilinear,no_quad_lod
> or easier
> GALLIVM_PERF=no_filter_hacks (which disables these 3 things above together)
> 
> Although all of that only really affects filtering with mipmaps (not
> sure if you do?).
Using GALLIVM_PERF does not make a difference either, but that is to be
expected because I'm not using mipmaps, just "regular" linear
filtering (GL_LINEAR).
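For a CI setup, the flags quoted above could be applied per test run roughly as in the following sketch. GALLIVM_PERF and its values are taken from the quoted message, and LIBGL_ALWAYS_SOFTWARE is Mesa's switch for forcing software rendering. The helper name `llvmpipe_env` and the test binary name in the usage line are hypothetical.

```python
import os


def llvmpipe_env(perf_flags="no_filter_hacks", base=None):
    """Return a copy of the environment with llvmpipe filtering tweaks disabled.

    `perf_flags` is passed through unchanged as GALLIVM_PERF; `base` defaults
    to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"   # ask Mesa for a software rasterizer
    env["GALLIVM_PERF"] = perf_flags     # e.g. no_rho_approx,no_brilinear,no_quad_lod
    return env
```

A test runner could then be started with `subprocess.run(["./regression-tests"], env=llvmpipe_env())`, where `./regression-tests` stands in for the real binary.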
> 
> 
> (more below)
See my responses below as well.
> 
> 
> On 11.04.19 at 18:00, Dominik Drees wrote:
>> Running with the suggested flags in the environment does not change the
>> result for the test case I described below. The results with and without
>> the environment variables set are pixel-wise equal.
>>
>> By the way, if this is of interest: For GL_NEAREST sampling the results
>> from hardware and llvmpipe are equal as well.
>>
>> Best,
>> Dominik
>>
>> On 4/11/19 4:36 PM, Ilia Mirkin wrote:
>>> llvmpipe takes a number of shortcuts in the interest of speed which
>>> cause inaccurate texturing. Try running with
>>>
>>> GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod
>>>
>>> and see if the issue still occurs.
>>>
>>> Cheers,
>>>
>>>     -ilia
>>>
>>>
>>>
>>> On Thu, Apr 11, 2019 at 8:30 AM Dominik Drees <dominik.drees at wwu.de>
>>> wrote:
>>>>
>>>> Hello, everyone!
>>>>
>>>> I have a question regarding the interpolation precision of llvmpipe.
>>>> Feel free to redirect me to somewhere else if this is not the right
>>>> place to ask. Consider the following scenario: In a fragment shader we
>>>> are sampling from a 16x16, 8 bit texture with values between 0 and 3
>>>> using linear interpolation. Then we write white to the screen if the
>>>> sampled value is > 1/255 and black otherwise. The output looks very
>>>> different when rendered with llvmpipe compared to the result produced by
>>>> rendering hardware (for both Intel (Mesa i965) and NVIDIA (proprietary
>>>> driver)).
>>>>
>>>> I've uploaded exemplary output images here
>>>> (https://imgur.com/a/D1udpez)
>>>>
>>>> and the corresponding fragment shader here
>>>> (https://pastebin.com/pa808Req).
> The shader looks iffy to me, how do you use that vec4 in the if clause?
> 
> 
>>>>
>>>> My hypothesis is that llvmpipe (in contrast to hardware) only uses 8 bit
>>>> for the interpolation computation when reading from 8 bit textures and
>>>> thus loses precision in the lower bits. Is that correct? If so, does
>>>> anyone know of a workaround?
> 
> So, in theory it is indeed possible the results are less accurate with
> llvmpipe (I believe all recent hw does rgba8 filtering with more than 8
> bit precision).
> For formats fitting into rgba8, we have a fast path in llvmpipe
> (gallivm) for the lerp, which unpacks the 8bit values into 16bit values,
> does the lerp with that and packs back to 8 bit. The result is
> accurately rounded there (to 8 bit) but only for 1 lerp step - for a 2d
> texture there are 3 of those (one per direction, and a final one
> combining the result). And yes this means the filtered result only has 8
> bits.
Do I understand you correctly in that for the 2D case, the results of 
the first two lerps (done in 16 bit) are converted to 8 bit, then 
converted back to 16 bit for the final (second stage) lerp?

If so, then for 2D (i.e., a 2-stage linear interpolation) we potentially
have an error on the order of one bit in the final 8 bit value due to the
intermediate 16->8->16 conversion. For sampling from a 3D texture (i.e., a
3-stage linear interpolation) the effect would be amplified: The extra
stage could cause an error with a magnitude of two bits in the final 8 bit
result (if I'm doing the math in my head correctly).
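To make the estimate above concrete, here is a minimal sketch that models the staged interpolation. It is not the gallivm code. It assumes float weights, round-to-nearest after every lerp, and integer texel codes, and the names `round8`, `interpolate` and `max_error` are made up for illustration. Under these assumptions each stage can add at most half a code of rounding error, so the bound grows by 0.5 per stage.

```python
from itertools import product
import math


def lerp(a, b, t):
    """Linear interpolation between a and b with weight t in [0, 1]."""
    return a + (b - a) * t


def round8(x):
    """Round a non-negative value to the nearest 8-bit code (half rounds up)."""
    return math.floor(x + 0.5)


def interpolate(corners, weights, quantize):
    """Reduce 2**n corner texels with n lerp stages, one stage per axis.

    `quantize` is applied to the result of every lerp, so passing `round8`
    models the 16->8->16 round trip between stages, while passing the
    identity keeps full precision throughout.
    """
    values = list(corners)
    for t in weights:
        values = [quantize(lerp(values[i], values[i + 1], t))
                  for i in range(0, len(values), 2)]
    return values[0]


def max_error(ndim, texels=(0, 1), steps=4):
    """Largest |staged - exact| in 8-bit code units over a small test grid."""
    grid = [k / steps for k in range(steps + 1)]
    worst = 0.0
    for corners in product(texels, repeat=2 ** ndim):
        for weights in product(grid, repeat=ndim):
            exact = interpolate(corners, weights, lambda v: v)
            staged = interpolate(corners, weights, round8)
            worst = max(worst, abs(staged - exact))
    return worst


if __name__ == "__main__":
    for ndim in (1, 2, 3):
        print(f"{ndim}D: worst-case error {max_error(ndim):.3f} codes")
```

The threshold test from the original report (white if the sample is > 1/255, i.e. above code 1) is sensitive to exactly this: in the model, the texels (1, 2, 1, 1) with weights (0.25, 0.0) give 1.25 at full precision but 1 with per-stage rounding, so the two paths land on opposite sides of the threshold.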

Is there any (conceptual) reason why the result of a one dimensional 
interpolation step is reduced back to 8 bits before the second stage 
interpolation? Would avoiding these conversions not actually be faster 
(in addition to the improved accuracy)?
> 
> I do believe you should not rely on implementations having more accuracy
> - as far as I know the filtering we do is conformant there (it is tricky
> to do better using the fast path).
In principle you are correct. In our regression tests we actually have
(per test) configurable thresholds for maximum pixel distance/maximum
number of differing pixels/neighborhood search radius etc. We could just 
increase these thresholds, but would risk missing some regressions that 
(for example) only affect a very small portion of the screen. For the 
larger part of our test suite llvmpipe actually works quite well within 
the established limits.
For some other cases where we render a relatively small 8 bit 3D volume,
the differences far exceeded the previously set thresholds and
were quite visible to the naked eye.
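As an illustration of how the three thresholds mentioned above can interact, here is a minimal sketch of such a comparison. It is an assumed interpretation, not Voreen's actual test code: the function name `images_match`, its parameters, and the use of grayscale images given as lists of rows are all hypothetical.

```python
def images_match(reference, candidate, max_distance, max_differing, radius):
    """Compare two equally sized grayscale images given as lists of rows.

    A candidate pixel passes if some reference pixel within `radius`
    (Chebyshev distance) differs from it by at most `max_distance`.
    The images match if at most `max_differing` pixels fail.
    """
    height, width = len(reference), len(reference[0])
    failing = 0
    for y in range(height):
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            if not any(abs(candidate[y][x] - reference[ny][nx]) <= max_distance
                       for ny in ys for nx in xs):
                failing += 1
    return failing <= max_differing
```

Raising `max_distance` hides a uniform one-code offset everywhere, which is why loosening it globally risks masking a real regression confined to a few pixels.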

> 
> There would be code to actually do filtering with full float precision,
> although there's no way to reach it with rgba8 formats unless you change
> the code. If you want to try out the theory, look at
> lp_bld_sample_soa.c: lp_build_sample_soa_code() determines whether to
> use the fast (aos) filtering path (use_aos, determined mostly by
> util_format_fits_8unorm()). If you set this to false it will use the
> full float filtering path. (FWIW I was actually thinking a while ago we
> should force this path when there's only 1 channel, albeit I never got
> around to testing (benchmarking) it - this is because the AoS filtering path
> is really optimized for rgba8 formats, and if you only have 1 channel
> it's quite possible float filtering is actually faster, since this
> handles the channels individually.)
> I guess though if the full float precision filtering is useful in
> general, we could add that to GALLIVM_PERF.
Forcing float precision indeed fixes the test case described below and 
our volume rendering regression tests! If this cannot be fixed in 
general I would be very happy about an option to force float precision 
via GALLIVM_PERF. FWIW, with forced float precision running our test
suite is actually faster (~6 minutes) than "stock" master (~6:40), but
these timings may be highly biased, of course.

Best,
Dominik
> 
> Roland
> 
> 
> 
> 
>>>>
>>>> A little bit of background about the use case: We are trying to move the
>>>> CI of Voreen
>>>> (https://www.uni-muenster.de/Voreen/)
>>>> to the Gitlab-CI
>>>> running in docker without any hardware dependencies. Using llvmpipe for
>>>> our regression tests works in principle, but shows significant
>>>> differences in the raycasting rendering of an 8-bit-per-voxel dataset.
>>>> (The effect is of course less visible than the constructed example case
>>>> linked above, but still quite noticeable for a human.)
>>>>
>>>> Any help or pointers would be appreciated!
>>>>
>>>> Best,
>>>> Dominik
>>>>
>>>> -- 
>>>> Dominik Drees
>>>>
>>>> Department of Computer Science
>>>> Westfaelische Wilhelms-Universitaet Muenster
>>>>
>>>> email: dominik.drees at wwu.de
>>>> web:
>>>> https://www.wwu.de/PRIA/personen/drees.shtml
>>>>
>>>> phone: +49 251 83 - 38448
>>>>
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>
>>
>>
>>
> 

-- 
Dominik Drees

Department of Computer Science
Westfaelische Wilhelms-Universitaet Muenster

email: dominik.drees at wwu.de
web: https://www.wwu.de/PRIA/personen/drees.shtml
phone: +49 251 83 - 38448
