[Mesa-dev] [PATCH 3/3] llvmpipe: add sse code for fixed position calculation

Fri Jan 8 14:28:58 PST 2016

On 08/01/16 20:04, Roland Scheidegger wrote:
> Am 08.01.2016 um 20:33 schrieb Roland Scheidegger:
>> Am 07.01.2016 um 19:56 schrieb Roland Scheidegger:
>>> Am 07.01.2016 um 16:40 schrieb Jose Fonseca:
>>>> On 07/01/16 06:18, Roland Scheidegger wrote:
>>>>> Am 04.01.2016 um 20:38 schrieb Jose Fonseca:
>>>>>> On 02/01/16 20:39, sroland at vmware.com wrote:
>>>>>
>>>>> Hmm actually I suppose I didn't do enough testing with that. This fails
>>>>> one piglit (completely unrelated to what the test actually wants to
>>>>> test), piglit/bin/glsl-1.50-geometry-end-primitive 128 -auto -fbo.
>>>>>
>>>>> This draws the same thing twice, just shifted in the x direction, and
>>>>> expects the results to be the same. But due to rounding that's
>>>>> apparently not quite the case. I am left wondering if the
>>>>> nearest-away-from-zero rounding we did before is really superior for
>>>>> subpixel snap or if it's just dumb luck that it passes with that
>>>>> rounding (though the util_iround used previously would have used the
>>>>> same nearest-even rounding on x86 without sse).
>>>>>
>>>>> GL spec of course is completely silent on how rasterization is done.
>>>>> d3d10 generally loves round-to-nearest-even for float->fixed point
>>>>> conversions but doesn't say if that applies to pixel coord snapping as
>>>>> well...
>>>>>
>>>>> Some analysis shows that the point x value in question is
>>>>> 7.90039825f - when we see it a second time it is 263.900391f (it is not
>>>>> exactly 256.0f more, but as exact as float math allows).
>>>>> So when we do the pixel offset adjust (sub 0.5f) * 256 in fixed point
>>>>> conversion, we get for the former: 1894.50183f. This will be rounded up
>>>>> of course, regardless of the exact nearest rounding mode.
>>>>> For the second value however, we get: 67430.5f - rounded up with the
>>>>> util_iround (round nearest, away from zero) function, but rounded down
>>>>> with nearest/even rounding...
>>>>>
>>>>> So, I'd blame the test. With some different float values, it could
>>>>> easily fail no matter the exact rounding mode.
>>>>> (Both my nvidia and amd hardware (which should also have 8 subpixel bits
>>>>> accuracy) manage to pass the test, albeit it looks like both of them
>>>>> actually round down both values there, not up, so the result is
>>>>> different to what we get in any case - if they really do similar logic
>>>>> with subpixel snapping they must have calculated slightly different
>>>>> float values in the first place.)
>>>>>
>>>>> Still not entirely sure though what's the preferred rounding mode for
>>>>> subpixel snap, the assembly can of course be adjusted either way.
>>>>>
>>>>> Roland
>>>>
>>>> Yes, it sounds like we want to modify the test to not rely on the
>>>> float->fixed rounding.
>>>>
>>> This potentially affects all tests which try to render twice the same
>>> thing just shifted by some amount. I suppose usually it shouldn't be a
>>> problem because most tests tend to use simple squares (so, 2 vertices on
>>> the same x and y axis, with one 45 degree each) which should be
>>> numerically very unproblematic as long as the vertices don't exactly lie
>>> between two fragment centers.
>>> This test, though, gets its vertex positions from some more complex
>>> calculations, so not easily controlled. Not sure how exactly it should
>>> be fixed? Maybe would need to perform some sort of its own subpixel snap
>>> before converting back to floats.
>>
>> I wasn't actually able to trigger the same issue with that test on real
>> hw. So I was wondering if hw actually does some rounding which makes
>> that impossible (I think this should be doable if they'd effectively
>> throw away the low-order bits for small values). However, that's clearly
>> not the case, as I hacked together some test which shows that indeed
>> this can fail on nvidia hw as well (basically, draw some tri with one
>> edge going from 1/256, 0 to 2/256, 1, with viewport width/height of 256
>> and starting at x,y 0 then draw the same thing again but with viewport
>> shifted by 256 in x direction - then loop over and over again with one
>> vertex x position adjusted by + 1/2^24 per loop iteration until it fails
>> - takes ages...).
>>
>> However I'm actually beginning to think it would be desirable to have
>> some "nearest-floor" or "nearest-ceil" rounding instead of nearest-even
>> (to ensure even spacing, albeit that would only really be true if two
>> values have the same exponent in any case). So maybe it's indeed not
>> worth worrying about exact rounding.
>>
>
> Oh and fwiw my attempt at fixing the piglit test was completely
> unsuccessful. I made some crude attempt at eliminating such potential
> differences by adding and then subtracting 256.0f, which was promptly
> optimized away. Maybe that's even legal... I'm leaning towards leaving
> it that way and just letting it fail...
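
[For illustration only, a C analogue of the add-then-subtract idea. The
piglit test does its math elsewhere, so this is not a fix for it. The
helper name snap_via_offset is hypothetical, and volatile is used here
merely as one way to keep a C compiler from folding the round trip away.]

```c
/* Hypothetical helper: quantise x to the float grid used near 256.0f by
 * adding and then subtracting 256.0f. */
static float snap_via_offset(float x)
{
    /* volatile forces the intermediate sum to be materialised as a float,
     * so the two operations cannot be cancelled against each other. */
    volatile float shifted = x + 256.0f;
    return shifted - 256.0f;
}
```

[With the x value from the analysis above, 7.90039825f snaps to
7.900390625f, which is exactly 256.0f less than 263.900391f. Both
positions then convert to a fixed point value ending in exactly .5, so
any one tie-breaking rule treats them the same way.]
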

IIUC, the objective of
piglit/tests/spec/glsl-1.50/execution/geometry/end-primitive.c is to
ensure EndPrimitive works; checking that the geometry emitted by the GS
stage and the VS stage matches precisely is a means to that end, not a
goal in itself.

If so, I think it might be more productive to emit an array of points, 
such that their pixel centers are aligned perfectly with screen pixels, 
and just verify that the expected pixels are all lit.
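
A rough C sketch of why pixel-centered points would sidestep the rounding
question, assuming a 256 pixel wide viewport and the usual GL viewport
transform for x. The helper names are made up for illustration and are
not from the test:

```c
/* NDC x coordinate whose window position is the center of pixel column i
 * in a viewport that is `width` pixels wide. */
static float pixel_center_ndc(int i, int width)
{
    return (2.0f * ((float)i + 0.5f)) / (float)width - 1.0f;
}

/* Viewport transform for x: x_w = (x_ndc + 1) * width / 2 + vp_x. */
static float ndc_to_window(float ndc, int vp_x, int width)
{
    return (ndc + 1.0f) * ((float)width * 0.5f) + (float)vp_x;
}

/* The (x - 0.5) * 256 subpixel scaling discussed earlier in the thread. */
static float window_to_subpixel(float xw)
{
    return (xw - 0.5f) * 256.0f;
}
```

With a power-of-two width such as 256 every intermediate value is exact in
float, so column 7 maps to 1792.0f in a viewport at x = 0 and to 67328.0f
in one at x = 256. Nothing is left to round, whatever the rounding mode.
Other widths may not be exact.
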

Another alternative may be to not shift the viewport at all -- draw
the two things on the exact same viewport.

Jose