[Mesa-dev] [PATCH 3/3] llvmpipe: add sse code for fixed position calculation

Fri Jan 8 15:58:40 PST 2016

On 08.01.2016 at 23:28, Jose Fonseca wrote:
> On 08/01/16 20:04, Roland Scheidegger wrote:
>> On 08.01.2016 at 20:33, Roland Scheidegger wrote:
>>> On 07.01.2016 at 19:56, Roland Scheidegger wrote:
>>>> On 07.01.2016 at 16:40, Jose Fonseca wrote:
>>>>> On 07/01/16 06:18, Roland Scheidegger wrote:
>>>>>> On 04.01.2016 at 20:38, Jose Fonseca wrote:
>>>>>>> On 02/01/16 20:39, sroland at vmware.com wrote:
>>>>>> 
>>>>>> Hmm, actually I suppose I didn't do enough testing with
>>>>>> that. This fails one piglit test (for reasons completely
>>>>>> unrelated to what the test actually wants to test):
>>>>>> piglit/bin/glsl-1.50-geometry-end-primitive 128 -auto
>>>>>> -fbo.
>>>>>> 
>>>>>> This draws the same thing twice, just shifted in the x direction,
>>>>>> and expects the results to be the same. But due to rounding that's
>>>>>> apparently not quite the case. I am left wondering if the
>>>>>> nearest-away-from-zero rounding we did before is really
>>>>>> superior for subpixel snap or if it's just dumb luck that it
>>>>>> passes with that rounding (though the util_iround used previously
>>>>>> would have used the same nearest-even rounding on x86
>>>>>> without sse).
>>>>>> 
>>>>>> GL spec of course is completely silent on how rasterization
>>>>>> is done. d3d10 generally loves round-to-nearest-even for
>>>>>> float->fixed point conversions but doesn't say if that
>>>>>> applies to pixel coord snapping as well...
>>>>>> 
>>>>>> Some analysis shows that the point x value in question is 
>>>>>> 7.90039825f - when we see it a second time it is
>>>>>> 263.900391f (it is not exactly 256.0f more, but as exact as
>>>>>> float math allows). So when we do the pixel offset adjust
>>>>>> (sub 0.5f) * 256 in fixed point conversion, we get for the
>>>>>> former: 1894.50183f. This will be rounded up of course,
>>>>>> regardless of the exact nearest rounding mode. For the second
>>>>>> value however, we get: 67430.5f - rounded up with the 
>>>>>> util_iround (round nearest, away from zero) function, but
>>>>>> rounded down with nearest/even rounding...
>>>>>> 
>>>>>> So, I'd blame the test. With some different float values,
>>>>>> it could easily fail no matter the exact rounding mode. 
>>>>>> (Both my nvidia and amd hardware (which should also have 8
>>>>>> bits of subpixel accuracy) manage to pass the test, albeit it
>>>>>> looks like both of them actually round down both values
>>>>>> there, not up, so the result is different to what we get in
>>>>>> any case - if they really do similar logic with subpixel
>>>>>> snapping they must have calculated slightly different float
>>>>>> values in the first place.)
>>>>>> 
>>>>>> Still not entirely sure though what's the preferred
>>>>>> rounding mode for subpixel snap, the assembly can of course
>>>>>> be adjusted either way.
>>>>>> 
>>>>>> Roland
>>>>> 
>>>>> Yes, it sounds like we want to modify the test to not rely on the
>>>>> float->fixed rounding.
>>>>> 
>>>> This potentially affects all tests which try to render the same
>>>> thing twice, just shifted by some amount. I suppose usually
>>>> it shouldn't be a problem because most tests tend to use simple
>>>> squares (so, 2 vertices on the same x and y axis, with one 45
>>>> degree each) which should be numerically very unproblematic as
>>>> long as the vertices don't lie exactly between two fragment
>>>> centers. This test, though, gets its vertex positions from some
>>>> more complex calculations, so they are not easily controlled. I'm
>>>> not sure how exactly it should be fixed. Maybe it would need to
>>>> perform some sort of subpixel snap of its own before converting
>>>> back to floats.
>>> 
>>> I wasn't actually able to trigger the same issue with that test
>>> on real hw. So I was wondering if hw actually does some rounding
>>> which makes that impossible (I think this should be doable if
>>> they'd effectively throw away the low-order bits for small
>>> values). However, that's clearly not the case, as I hacked
>>> together some test which shows that indeed this can fail on
>>> nvidia hw as well (basically, draw some tri with one edge going
>>> from 1/256, 0 to 2/256, 1, with viewport width/height of 256 and
>>> starting at x,y 0 then draw the same thing again but with
>>> viewport shifted by 256 in x direction - then loop over and over
>>> again with one vertex x position adjusted by + 1/2^24 per loop
>>> iteration until it fails - takes ages...).
>>> 
>>> However I'm actually beginning to think it would be desirable to
>>> have some "nearest-floor" or "nearest-ceil" rounding instead of
>>> nearest-even (to ensure even spacing, albeit that would only
>>> really be true if two values have the same exponent in any case).
>>> So maybe it's indeed not worth worrying about exact rounding.
>>> 
>> 
>> Oh and fwiw my attempt at fixing the piglit test was completely
>> unsuccessful. I made a crude attempt at eliminating such
>> potential differences by adding and then subtracting 256.0f, which
>> was promptly optimized away. Maybe that's even legal... I'm leaning
>> towards leaving it that way and just letting it fail...
> 
> IIUC, the objective of 
> piglit/tests/spec/glsl-1.50/execution/geometry/end-primitive.c is to 
> ensure EndPrimitive works, and that ensuring that geometry emitted
> by the GS stage and VS stage match precisely is a means to that end,
> and not a goal itself.
> 
> If so, I think it might be more productive to emit an array of
> points, such that their pixel centers are aligned perfectly with
> screen pixels, and just verify that the expected pixels are all lit.
They don't even need to be perfectly aligned. As long as they are
aligned to the same subpixel (granted, the subpixel precision differs
depending on hw, but it is always at least 4 bits) it should be fine.
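
To illustrate why same-subpixel alignment should be enough, here is a
minimal C sketch. The helper names are hypothetical (not from Mesa or
piglit), the 4-bit grid is the minimum mentioned above, and the 8-bit
fixed-point conversion mirrors the (x - 0.5f) * 256 step from the quoted
analysis. It only handles coordinates >= 0.5.

```c
/* Hypothetical helpers for illustration only; not Mesa or piglit code.
 * All of them assume coordinates >= 0.5 so that int casts truncate
 * towards zero the same way floor would. */

/* Snap a coordinate to a grid of 2^bits steps per pixel. */
static float snap_to_subpixel_grid(float x, int bits)
{
    float steps = (float)(1 << bits);
    return (float)(int)(x * steps + 0.5f) / steps;
}

/* Pixel-center offset, 8 subpixel bits, ties rounded away from zero. */
static int fixed_half_away(float x)
{
    return (int)((x - 0.5f) * 256.0f + 0.5f);
}

/* Pixel-center offset, 8 subpixel bits, ties rounded to even. */
static int fixed_half_even(float x)
{
    float v = (x - 0.5f) * 256.0f;
    int whole = (int)v;
    float frac = v - (float)whole;
    if (frac > 0.5f || (frac == 0.5f && (whole & 1)))
        whole++;
    return whole;
}
```

A position snapped to a 1/16 grid becomes an exact integer after the
scale by 256, so no tie can occur and both rounding variants agree. The
two x values from the quoted analysis snap to 7.875 and 263.875, and
their fixed-point results differ by exactly 256 * 256.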

> 
> Another alternative maybe is to not shift the viewport at all --
> draw the two things on the exact same viewport.
You mean one after another? So read back the first reference result, then
clear and draw the other one and compare that to the reference? That
will work, although it's visually not appealing (as you can't compare the
results easily).

Roland
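
For reference, the rounding discrepancy from the quoted analysis can be
reproduced with a small C sketch. The function names are made up for
illustration and this is not the llvmpipe code. round_half_away is a
stand-in for nearest/away-from-zero rounding, and round_half_even models
the nearest-even behaviour of a default-mode SSE float->int conversion.
Both assume non-negative inputs.

```c
/* Illustrative stand-ins, not the actual Mesa functions.
 * Non-negative inputs only: both helpers truncate via an int cast. */

/* Round to nearest, ties away from zero (adequate for the small
 * values used here, where f + 0.5f is computed exactly). */
static int round_half_away(float f)
{
    return (int)(f + 0.5f);
}

/* Round to nearest, ties to even. */
static int round_half_even(float f)
{
    int whole = (int)f;
    float frac = f - (float)whole;
    if (frac > 0.5f || (frac == 0.5f && (whole & 1)))
        whole++;
    return whole;
}

/* Pixel-center offset and scale to 8 subpixel bits, before rounding. */
static float to_subpixel_units(float x)
{
    return (x - 0.5f) * 256.0f;
}
```

Feeding in 7.90039825f gives 1895 with either helper, since the scaled
value is not a tie. Feeding in 263.900391f gives exactly 67430.5f, which
round_half_away turns into 67431 and round_half_even into 67430.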