[Mesa-dev] Introducing OpenSWR: High performance software rasterizer

Rowley, Timothy O timothy.o.rowley at intel.com
Tue Feb 23 04:55:05 UTC 2016


> On Feb 17, 2016, at 7:07 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> 
> You could use different functions for avx and avx2 code, and plug the
> right ones in at runtime, as you can link them both just fine. It just
> requires that your code containing avx2 code is in a different compile
> unit to the one containing avx-only code. This way you only really have
> separate compiled code for the functions where there's really a
> difference (obviously, this prevents the compiler from using avx2 on its
> own in the shared parts, but I doubt that's a problem). Albeit if you
> have lots of differences scattered around (the worst would probably be
> different structures based on such difference used everywhere...) this
> might not be very practical (at a first glance, didn't look like it at
> least for avx and avx2).
> Though I'm not actually sure how you would do that for c++ template
> code, maybe it doesn't work as easily...
> In any case, so far for llvmpipe we didn't bother (except for the jitted
> code of course) to optimize for newer instruction sets precisely due to
> it being annoying (certainly prevents you from doing "let's just
> optimize this math here in this little inline function when avx is
> available" - so we still have rasterization functions which emulate
> sse41 _mm_mul_epi32 with _mm_mul_epu32 and so on).

Unfortunately we have avx and avx2 usage in the general swr code, hidden behind some macros which emulate the missing avx2 instructions on avx, so there isn’t a clear boundary layer inside the swr rasterizer we can load behind.  Additionally some of the structures will start changing size when we add avx512 support.

I was thinking that “objcopy --prefix-symbols” might be the answer to the problem of creating two versions of the rasterizer that could be linked together with the driver, but it does a global rename on all symbols (internal ones as well as externals like malloc/free/c++ constructors, etc.), leaving unresolvable externals.

Maybe a global c++ namespace might work, but I don’t see a nonintrusive way of adding that.
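
To make the namespace idea concrete, here is a rough sketch, not actual swr code. Every name in it (SWR_DEFINE_RASTERIZER, swr_avx, swr_avx2, RasterizerApi, SelectRasterizer, SumCoverage) is made up for illustration. It assumes the rasterizer sources could be compiled once per target with their contents placed in a per-target namespace; a macro stands in for "compile the same source twice" only so the sketch fits in one file. The hasAvx2 flag is a placeholder for a real CPU feature check.

```cpp
// Hypothetical architecture tag shared by both builds and the driver.
enum class Arch { Avx, Avx2 };

// Stand-in for the rasterizer sources. In a real build this body would be
// ordinary .cpp files compiled twice (e.g. with -mavx and -mavx2) and with a
// different namespace name each time; the macro only imitates that here.
#define SWR_DEFINE_RASTERIZER(NS, ARCH)                          \
    namespace NS {                                               \
        inline Arch ArchId() { return ARCH; }                    \
        inline int SumCoverage(const int* mask, int count) {     \
            int total = 0;                                       \
            for (int i = 0; i < count; ++i) total += mask[i];    \
            return total;                                        \
        }                                                        \
    }

// Two copies of the same code; the namespaces keep their symbols distinct,
// while externals such as malloc/free are left untouched.
SWR_DEFINE_RASTERIZER(swr_avx, Arch::Avx)
SWR_DEFINE_RASTERIZER(swr_avx2, Arch::Avx2)

// Table of entry points the driver would call through.
struct RasterizerApi {
    Arch (*ArchId)();
    int (*SumCoverage)(const int*, int);
};

// Pick one copy at runtime. hasAvx2 is a placeholder for a CPUID query.
inline RasterizerApi SelectRasterizer(bool hasAvx2) {
    if (hasAvx2) {
        return { &swr_avx2::ArchId, &swr_avx2::SumCoverage };
    }
    return { &swr_avx::ArchId, &swr_avx::SumCoverage };
}
```

The driver would call SelectRasterizer once at startup and go through the returned table afterwards. Since each copy lives in its own namespace, structures that differ in size between targets would also stay distinct. The intrusive part is still there: every rasterizer source file has to be wrapped in the namespace somehow.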

-Tim


