[Mesa-dev] [PATCH] i965 : Performance Improvement
Marathe, Yogesh
yogesh.marathe at intel.com
Fri Jul 14 09:48:37 UTC 2017
> -----Original Message-----
> From: mesa-dev [mailto:mesa-dev-bounces at lists.freedesktop.org] On Behalf
> Of Eero Tamminen
> Sent: Friday, July 14, 2017 2:20 PM
> To: mesa-dev at lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH] i965 : Performance Improvement
>
> Hi,
>
> On 14.07.2017 09:38, Marathe, Yogesh wrote:
> [...]
> >>>> The only reason I could see this helping is if check_state() wasn't
> >>>> inlined, but a release build with -O2 definitely inlines both
> >>>> check_and_emit_atom() and check_state().
> >>>>
> >>>> Are you using GCC? What are your CFLAGS? -O2? I hope you're not
> >>>> trying to optimize a debug build...
> >>>
> >>> Yes, we are using -O2, and it's clang on Android; it's not a debug build.
> >>
> >> Okay. I just built with Clang 4.0.1 and -O2 and both check_state and
> >> check_and_emit_atom() are inlined into the atom loop in
> >> brw_upload_pipeline_state().
> >>
> >> So I'm still not sure how this would improve anything.
> >
> > Yes, the improvement is not huge per se, but we essentially see CPI
> > and CPU utilization coming down with this. We also see slightly
> > improved scores on graphics benchmarks, particularly 3DMark, with the
> > patch. If this were optimized out by the compiler, we shouldn't have seen
> > a difference on the same build with and without the patch. We'll confirm
> > the clang version.
> >
> > I think this at least removes branch instructions, and being in the busy
> > path this will have an impact, provided the compiler doesn't do it, as you
> > rightly mentioned.
>
> Did you disassemble the produced code to verify that it improved things the
> way you thought it would?
No. The change didn't come out of looking at the disassembly, so we didn't
disassemble afterwards either; we continued with the same CPI measurement,
basically CPU counters. Now that this has been brought up, we should do it.
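
For reference, the pattern under discussion can be sketched roughly as below.
This is a simplified stand-in and not the actual i965 code: the struct layouts
and the names state_flags, atom, count_emit and upload_atoms are hypothetical,
chosen only to mirror the check_state() / check_and_emit_atom() helpers
mentioned above. Whether the compiler folds both helpers into the loop is
exactly what disassembling the real build would show.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified dirty-bit sets; not the real i965 structs. */
struct state_flags {
   uint32_t mesa;
   uint64_t brw;
};

/* One unit of state: the bits it depends on and how to emit it. */
struct atom {
   struct state_flags dirty;
   void (*emit)(void *ctx);
};

/* True if any dirty bit in 'a' overlaps the bits listed in 'b'. */
static inline bool
check_state(const struct state_flags *a, const struct state_flags *b)
{
   return ((a->mesa & b->mesa) | (a->brw & b->brw)) != 0;
}

/* Emit the atom only if one of its dependencies is dirty. */
static inline bool
check_and_emit_atom(void *ctx, const struct state_flags *state,
                    const struct atom *atom)
{
   if (check_state(state, &atom->dirty)) {
      atom->emit(ctx);
      return true;
   }
   return false;
}

/* Example emit callback: counts invocations through an int behind ctx. */
static void
count_emit(void *ctx)
{
   (*(int *)ctx)++;
}

/* The hot loop: walk every atom and emit those that are dirty.
 * Returns the number of atoms emitted. */
static size_t
upload_atoms(void *ctx, const struct state_flags *state,
             const struct atom *atoms, size_t count)
{
   size_t emitted = 0;

   for (size_t i = 0; i < count; i++)
      emitted += check_and_emit_atom(ctx, state, &atoms[i]);

   return emitted;
}
```

With both helpers inlined, the loop body reduces to a couple of AND/OR
operations and one conditional branch per atom, so any gain from
restructuring it by hand would have to show up in the generated code.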
>
> The reason I ask is that just making changes to unrelated parts of the code
> can sometimes improve performance, because it changes code size and therefore
> impacts how things end up being mapped to memory and cached.
>
Good point. Thanks, we'll check that angle, too. Changes aren't unrelated though.
> (In some cases I've seen several percent performance increases and drops even
> from LD_PRELOADing a random, unused library to a process.)
>
>
> - Eero
>