[Mesa-dev] [PATCH 2/3][RFC v2] mesa/main/x86: Add sse2 streaming clamping

Juha-Pekka Heikkila juhapekka.heikkila at gmail.com
Thu Nov 6 01:32:00 PST 2014


On 05.11.2014 21:21, Ian Romanick wrote:
> On 11/04/2014 01:24 PM, Roland Scheidegger wrote:
>> On 04.11.2014 at 13:05, Juha-Pekka Heikkila wrote:
>>> +   for(i = 0; i < n; i++) {
>>> +      _mesa_clamp_float_rgba(rgba_src[i], temp, min, max);
>>> +
>>> +      *operand = _mm_mul_ps(multiplier, *operand);
>>> +      truncated_integers = _mm_cvttps_epi32(*operand);
>>> +      mmove = _mm_set_ps(aMap[map_p[ACOMP]], bMap[map_p[BCOMP]],
>>> +                         gMap[map_p[GCOMP]], rMap[map_p[RCOMP]] );
>>> +
>>> +      _mm_storeu_ps(rgba_dst[i], mmove);
>> The sse2 code at the end looks counterproductive to me. Not sure what
>> gcc will generate but I'd suspect it involves some simd->int domain
>> transition for the table lookups, plus another int->simd transition to
>> get the values back into simd domain (alternatively it might use
>> stores/load here) just so you can store them again...
>> It would probably be better to just store the values directly after the
>> table lookups.
>> But in any case, I'm actually beginning to suspect no one really cares
>> about performance anyway for that path (who the hell uses these
>> scale/map features?) so whatever works...
> 
> Which raises another question... do we have any piglit tests that
> actually exercise this path?

No, we don't. I made a small test for this to see how it works, and I
was planning to move my test to Piglit later.
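
For reference, here is a rough sketch of what storing the values directly
after the table lookups (Roland's suggestion above) could look like. This is
not the patch code: map_rgba_direct, the maps array and map_size are
hypothetical names, and the [0, 1] clamp range and the (map_size - 1) scale
factor are assumptions made for illustration. The index computation stays in
SSE2 when available, but the looked-up floats are written straight to the
destination instead of being packed with _mm_set_ps first.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, not part of the patch.  For each pixel: clamp the
 * four components to [0, 1] (assumed range), scale them to table indices,
 * then look each one up and store the result directly.  maps[] holds the
 * R, G, B and A tables, each with map_size entries (assumed layout).
 * Inputs are assumed not to be NaN. */
static void
map_rgba_direct(size_t n, const float src[][4], float dst[][4],
                const float *const maps[4], int map_size)
{
   const float scale = (float)(map_size - 1);
   size_t i;

   for (i = 0; i < n; i++) {
      int idx[4];
      int c;

#if defined(__SSE2__)
      /* Clamp, scale and truncate all four components at once. */
      __m128 v = _mm_loadu_ps(src[i]);
      v = _mm_max_ps(v, _mm_setzero_ps());
      v = _mm_min_ps(v, _mm_set1_ps(1.0f));
      v = _mm_mul_ps(v, _mm_set1_ps(scale));
      _mm_storeu_si128((__m128i *)idx, _mm_cvttps_epi32(v));
#else
      /* Scalar fallback with the same clamp/scale/truncate steps. */
      for (c = 0; c < 4; c++) {
         float f = src[i][c];
         f = f < 0.0f ? 0.0f : (f > 1.0f ? 1.0f : f);
         idx[c] = (int)(f * scale);
      }
#endif

      /* Plain scalar stores: no round trip back into a SIMD register. */
      for (c = 0; c < 4; c++)
         dst[i][c] = maps[c][idx[c]];
   }
}
```

Whether this is actually faster than the _mm_set_ps/_mm_storeu_ps version
would need to be measured; it only avoids the extra domain transition
Roland described.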

/Juha-Pekka


