[Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
Brian Paul
brianp at vmware.com
Fri Sep 12 07:09:19 PDT 2014
On 09/11/2014 04:58 PM, Jason Ekstrand wrote:
>
>
> On Thu, Sep 11, 2014 at 3:53 PM, Dieter Nützel <Dieter at nuetzel-hh.de
> <mailto:Dieter at nuetzel-hh.de>> wrote:
>
> On 12.09.2014 00:31, Jason Ekstrand wrote:
>
> On Thu, Sep 11, 2014 at 2:55 PM, Dieter Nützel
> <Dieter at nuetzel-hh.de <mailto:Dieter at nuetzel-hh.de>> wrote:
>
> On 15.08.2014 04:50, Jason Ekstrand wrote:
>
> On Aug 14, 2014 7:13 PM, "Dieter Nützel"
> <Dieter at nuetzel-hh.de <mailto:Dieter at nuetzel-hh.de>> wrote:
>
>
> On 15.08.2014 02:36, Dave Airlie wrote:
>
> On 08/02/2014 02:11 PM, Jason Ekstrand wrote:
>
>
>
> Most format conversion operations required by GL can be performed by
> converting one channel at a time, shuffling the channels around, and
> optionally filling missing channels with zeros and ones. This adds a
> function to do just that in a general, yet efficient, way.
>
> v2:
> * Add better comments including full docs for functions
> * Don't use __typeof__
> * Use inline helpers instead of writing out conversions by hand
> * Force full loop unrolling for better performance
>
>
>
> This file seems to anger gcc a lot.
>
> It seems to take upwards of a minute or two to compile here.
>
> gcc 4.8.3 on 32-bit x86.
>
> Dave.
>
>
>
> For me (on our poor little Duron 1800/2 GB) it ran ~5 minutes...
>
>
> gcc 4.8.1 on 32-bit x86.
>
>
> If we'd like, the way the macros are set up, it would be easy to
> change it so that we do less unrolling in the cases where we are
> actually doing substantial format conversion and wouldn't notice the
> extra logic quite as much. I'll play with it a bit tomorrow or next
> week and see how much of a hit we would actually take if we
> unrolled a little less in places.
> --Jason Ekstrand
>
>
> Ping.
>
> On a second run it took 11+ minutes here...
>
>
> 11 minutes! What system are you running? And are you using -O3 or
> something? Yes, we can do something to cut it down, but it will
> probably require a configure flag; the question is what flag.
>
> --Jason
>
>
> See above, the old children's system... ;-)
> -O2 -m32 -march=athlon-mp -mtune=athlon-mp -m3dnow -msse -mmmx
> -mfpmath=sse,387 -pipe
>
> Bad? - Worked for ages on AthlonMP....8-)
> Maybe it is bad on Duron (the MP thing, much smaller cache and
> better GCC), now.
>
> Dieter
>
>
> Yeah, my recommendation would be hacking the macros to not unroll and
> keep the patch locally. If you've got a better idea as to how to
> organize the code so the compiler likes it, I'm open as long as we don't
> lose performance.
It looks like a release build with MSVC is taking quite a while to
compile this file too (actually at link time when the optimizer kicks in).
But even on my fast Linux system with gcc, the difference in compile
time between -O0 and -O3 is pretty big (2 seconds vs. 1 minute, 3 seconds).
I'm still prototyping something but it looks like breaking the top-level
switch cases in _mesa_swizzle_and_convert() into separate functions
reduces the time quite a bit. Let me pursue that a bit further and see
how it goes...
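
A rough sketch of that restructuring, for illustration only and not the
actual Mesa code: each case of the top-level switch moves into its own
small function, so the optimizer sees one unrollable loop at a time
rather than a single huge function body. Every name below
(swizzle_convert, ubyte_to_float, float_to_ubyte, the SWZ_* codes, the
chan_type enum) is hypothetical, and only two type pairs are shown. Each
helper does what the patch description lists: convert one channel at a
time, shuffle channels, and fill missing ones with zero or one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical swizzle codes: X..W pick a source channel, ZERO/ONE fill. */
enum { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

/* Hypothetical channel types; a real converter would handle many more. */
enum chan_type { CHAN_UBYTE, CHAN_FLOAT };

/* One helper per (dst, src) type pair, each compiled and optimized alone.
 * Assumes every X..W entry in swz is < src_chans and dst has 4 channels. */
static void
ubyte_to_float(float *dst, const uint8_t *src, int src_chans,
               const uint8_t swz[4], size_t count)
{
   for (size_t i = 0; i < count; i++) {
      for (int c = 0; c < 4; c++) {
         if (swz[c] == SWZ_ZERO)
            dst[c] = 0.0f;
         else if (swz[c] == SWZ_ONE)
            dst[c] = 1.0f;
         else
            dst[c] = src[swz[c]] / 255.0f;   /* unorm8 -> float */
      }
      dst += 4;
      src += src_chans;
   }
}

static void
float_to_ubyte(uint8_t *dst, const float *src, int src_chans,
               const uint8_t swz[4], size_t count)
{
   for (size_t i = 0; i < count; i++) {
      for (int c = 0; c < 4; c++) {
         if (swz[c] == SWZ_ZERO) {
            dst[c] = 0;
         } else if (swz[c] == SWZ_ONE) {
            dst[c] = 255;
         } else {
            float f = src[swz[c]];
            f = f < 0.0f ? 0.0f : (f > 1.0f ? 1.0f : f);   /* clamp to [0,1] */
            dst[c] = (uint8_t)(f * 255.0f + 0.5f);         /* round to nearest */
         }
      }
      dst += 4;
      src += src_chans;
   }
}

/* Thin dispatcher: the switch only selects a helper.
 * Returns 1 on success, 0 for a type pair this sketch does not cover. */
int
swizzle_convert(void *dst, enum chan_type dst_type,
                const void *src, enum chan_type src_type,
                int src_chans, const uint8_t swz[4], size_t count)
{
   switch (dst_type) {
   case CHAN_FLOAT:
      if (src_type != CHAN_UBYTE)
         return 0;
      ubyte_to_float(dst, src, src_chans, swz, count);
      return 1;
   case CHAN_UBYTE:
      if (src_type != CHAN_FLOAT)
         return 0;
      float_to_ubyte(dst, src, src_chans, swz, count);
      return 1;
   }
   return 0;
}
```

Whether a split like this actually helps compile time depends on the
compiler and flags; the numbers quoted above are for the real file, not
for this sketch.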
-Brian