[Mesa-dev] [PATCH 2/3][RFC v2] mesa/main/x86: Add sse2 streaming clamping

Ian Romanick idr at freedesktop.org
Wed Nov 5 11:21:40 PST 2014


On 11/04/2014 01:24 PM, Roland Scheidegger wrote:
> On 04.11.2014 at 13:05, Juha-Pekka Heikkila wrote:
>> +   for(i = 0; i < n; i++) {
>> +      _mesa_clamp_float_rgba(rgba_src[i], temp, min, max);
>> +
>> +      *operand = _mm_mul_ps(multiplier, *operand);
>> +      truncated_integers = _mm_cvttps_epi32(*operand);
>> +      mmove = _mm_set_ps(aMap[map_p[ACOMP]], bMap[map_p[BCOMP]],
>> +                         gMap[map_p[GCOMP]], rMap[map_p[RCOMP]] );
>> +
>> +      _mm_storeu_ps(rgba_dst[i], mmove);
> The sse2 code at the end looks counterproductive to me. Not sure what
> gcc will generate but I'd suspect it involves some simd->int domain
> transition for the table lookups, plus another int->simd transition to
> get the values back into simd domain (alternatively it might use
> stores/load here) just so you can store them again...
> It would probably be better to just store the values directly after the
> table lookups.
> But in any case actually I'm beginning to suspect no one really cares
> about performance anyway for that path (who the hell uses these
> scale/map features?) so whatever works...

Which raises another question... do we have any piglit tests that
actually exercise this path?
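
For reference, Roland's suggestion of storing the values directly after
the table lookups could look roughly like the sketch below. This is only
an illustration under assumptions, not the code from the patch:
map_rgba_direct_store() is a hypothetical helper, all four tables are
assumed to share one size (map_size), the clamp range is assumed to be
[0, 1], and the mapping is done in place. The clamp, scale and
float->int conversion stay in SSE2; the indices are written to memory
once and the looked-up floats are stored with plain scalar stores, so
nothing is moved back into an SSE register just to be stored again.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics */

enum { RCOMP, GCOMP, BCOMP, ACOMP };

/* Hypothetical helper for illustration only, not a Mesa function.
 * Assumes every map has map_size entries and inputs are clamped to [0, 1].
 */
static void
map_rgba_direct_store(int n, float rgba[][4],
                      const float *rMap, const float *gMap,
                      const float *bMap, const float *aMap,
                      int map_size)
{
   const __m128 zero = _mm_setzero_ps();
   const __m128 one = _mm_set1_ps(1.0f);
   const __m128 scale = _mm_set1_ps((float)(map_size - 1));
   int i;

   assert(map_size > 0);

   for (i = 0; i < n; i++) {
      int idx[4];
      __m128 v = _mm_loadu_ps(rgba[i]);

      /* Clamp to [0, 1], scale to the table range, truncate to int. */
      v = _mm_min_ps(_mm_max_ps(v, zero), one);
      v = _mm_mul_ps(v, scale);
      _mm_storeu_si128((__m128i *)idx, _mm_cvttps_epi32(v));

      /* Scalar lookups, stored directly; no _mm_set_ps() repacking. */
      rgba[i][RCOMP] = rMap[idx[RCOMP]];
      rgba[i][GCOMP] = gMap[idx[GCOMP]];
      rgba[i][BCOMP] = bMap[idx[BCOMP]];
      rgba[i][ACOMP] = aMap[idx[ACOMP]];
   }
}
```

Whether this is actually faster than the _mm_set_ps() variant would
need measuring; it just avoids the extra int->simd round trip Roland
describes.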

> Roland


