Comment #31 on bug 100613 - "Regression in Mesa 17 on s390x (zSystems)"
From: Ben Crocker <bcrocker@redhat.com>
https://bugs.freedesktop.org/show_bug.cgi?id=100613#c31
(In reply to Roland Scheidegger from comment #30)
> (In reply to Rob Clark from comment #26)
> > (In reply to Ben Crocker from comment #25)
> > >
> > > Regarding Ray's specific comment about getting scalar fetch to work
> > > with "sufficient twiddling," I think it's perfectly acceptable to
> > > introduce extra operations, as long as we restrict the extra
> > > operations to the big-endian path. PPC64 (LE or BE) is fast enough so
> > > that any performance impact will be negligible; S390 is less fast, but
> > > I imagine production machines with more memory than the one we
> > > experimented on here are fast enough.
> >
> > drive-by comment.. unless llvm is just rubbish at optimization, I don't
> > think saving a few operations in the front-end IR building should be that
> > important, even for LE.
> Well, yes and no. Yes, if it makes things conceptually simpler (which
> probably isn't really the case here).
> I'm not sure how good llvm is there with the ppc backend. But for x86, no,
> you can't assume optimization will take care of everything neatly, in
> particular for load/shuffle combinations. If you look at it, there is in
> fact lots of hack code around gathering of values (on x86), simply because
> llvm can't do some kinds of optimizations. In particular, it can't do any
> optimizations crossing scalar/vector boundaries, so whether you zero-extend
> values after a scalar load or after assembling them into vectors makes a
> large difference in generated code quality. And if you use an int load,
> llvm will not consider using float shuffles afterwards, even if that means
> using 3 shuffle instructions instead of just 1, and so on (llvm has no real
> model of domain transition penalty costs, which don't exist in these cases
> on most cpus), although that latter problem has been fixed with llvm 4.0.
> However, I would not expect these particular bits to be a problem on non-x86
> cpus. I think int/float issues are x86 specific (other simd instruction
> sets afaik don't tend to have different int/float load/store, shuffle or
> even logic op operations). So, going for the conceptually simplest solution
> should be alright (although, for instance, the scalar/vector "optimization
> barrier" is probably going to affect all backends).
>
> > But we have shader-db so it should be possible to
> > prove/disprove that theory. (Not sure if llvmpipe is instrumented for
> > shader-db but if not that should be easy to solve.)
> Yeah, I suppose I should really do that at some point...
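
To make the scalar/vector "optimization barrier" Roland describes above more
concrete, here is a minimal, hypothetical sketch using the LLVM-C API (the
same API gallivm builds IR with); the function names and layout are
illustrative, not taken from Mesa. Both variants widen four 16-bit fetches
(ptrs[] are i16*) into a <4 x i32>. Because LLVM generally will not rewrite
one form into the other across the scalar/vector boundary, whichever form the
front end emits shows up directly in the generated code.

#include <llvm-c/Core.h>

/* Variant A: zero-extend each scalar right after its load, then insert
 * the already-widened values into the result vector. */
static LLVMValueRef
widen_scalars_then_insert(LLVMContextRef ctx, LLVMBuilderRef b,
                          LLVMValueRef ptrs[4])
{
   LLVMTypeRef i32 = LLVMInt32TypeInContext(ctx);
   LLVMValueRef vec = LLVMGetUndef(LLVMVectorType(i32, 4));

   for (unsigned i = 0; i < 4; i++) {
      LLVMValueRef elem = LLVMBuildLoad(b, ptrs[i], "elem");
      elem = LLVMBuildZExt(b, elem, i32, "elem_zext");       /* scalar zext */
      vec = LLVMBuildInsertElement(b, vec, elem,
                                   LLVMConstInt(i32, i, 0), "vec");
   }
   return vec;
}

/* Variant B: assemble a <4 x i16> vector first, then do a single
 * vector-wide zero-extend at the end. */
static LLVMValueRef
insert_then_widen_vector(LLVMContextRef ctx, LLVMBuilderRef b,
                         LLVMValueRef ptrs[4])
{
   LLVMTypeRef i16 = LLVMInt16TypeInContext(ctx);
   LLVMTypeRef i32 = LLVMInt32TypeInContext(ctx);
   LLVMValueRef vec = LLVMGetUndef(LLVMVectorType(i16, 4));

   for (unsigned i = 0; i < 4; i++) {
      LLVMValueRef elem = LLVMBuildLoad(b, ptrs[i], "elem");
      vec = LLVMBuildInsertElement(b, vec, elem,
                                   LLVMConstInt(i32, i, 0), "vec");
   }
   return LLVMBuildZExt(b, vec, LLVMVectorType(i32, 4), "vec_zext"); /* vector zext */
}

Which variant produces better machine code is up to the backend and target;
the point is only that the choice has to be made when emitting the IR.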
I want to emphasize at this point that the patch I described in
Comments 28-29 is compile-time conditionalized for big-endian only.
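
As a rough illustration of what "compile-time conditionalized for big-endian
only" means in practice (this is a sketch with hypothetical helper names, not
the patch itself), the extra work can be guarded by gallium's
PIPE_ARCH_BIG_ENDIAN macro so that little-endian builds compile exactly the
code they had before:

#include <llvm-c/Core.h>
#include "pipe/p_config.h"

struct gallivm_state;

/* Hypothetical helpers, declared only so the sketch is self-contained. */
LLVMValueRef build_scalar_fetch(struct gallivm_state *gallivm, LLVMValueRef ptr);
LLVMValueRef build_byte_swap(struct gallivm_state *gallivm, LLVMValueRef value);

static LLVMValueRef
fetch_one_element(struct gallivm_state *gallivm, LLVMValueRef ptr)
{
   LLVMValueRef value = build_scalar_fetch(gallivm, ptr);

#ifdef PIPE_ARCH_BIG_ENDIAN
   /* The extra operations live only on the big-endian path, so any
    * added cost is paid only where the fix is needed. */
   value = build_byte_swap(gallivm, value);
#endif

   return value;
}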