<div dir="ltr"><div>I can vouch for the usefulness of the combined trace timeline showing CPU core usage, filtered application events and GPU usage. Android systrace shows this data -- I've used it both from an app developer perspective to fix countless performance bugs and from a whole-system perspective to tune issues such as motion-to-photon latency for VR. The latter is only possible when the CPU timeline is combined with vendor-specific GPU data showing binning, resolves/unresolves and context preemptions.<br></div><div><br></div><div>With virtualization, we have a new level of complexity and corresponding performance bugs to track down. One example is unexpected shader compiles, but there are other slow paths in mesa that are important to be able to see without difficulty. There is work being done to support perfetto trace data from both host and guest VM -- mesa is in both.</div><div><br></div><div>Perfetto/systrace makes this performance analysis work easier in many cases -- run an app, record a trace, reproduce a glitch, and then view the trace to find out what happened.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Feb 15, 2021 at 9:27 AM Rob Clark <<a href="mailto:robdclark@gmail.com">robdclark@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, Feb 15, 2021 at 3:13 AM Tamminen, Eero T<br>
<<a href="mailto:eero.t.tamminen@intel.com" target="_blank">eero.t.tamminen@intel.com</a>> wrote:<br>
><br>
> Hi,<br>
><br>
> On Fri, 2021-02-12 at 18:20 -0800, Rob Clark wrote:<br>
> > On Fri, Feb 12, 2021 at 5:56 PM Lionel Landwerlin<br>
> > <<a href="mailto:lionel.g.landwerlin@intel.com" target="_blank">lionel.g.landwerlin@intel.com</a>> wrote:<br>
> ...<br>
> > > In our implementation that precision (in particular when a drawcall<br>
> > > ends) comes at a stalling cost unfortunately.<br>
> ><br>
> > yeah, stalling on our end too for per-draw counter snapshots.. but if<br>
> > you are looking for which shaders to optimize that doesn't matter<br>
> > *that* much.. there'll be some overhead, but it's not really going to<br>
> > change which draws/shaders are expensive.. just means that you lose out<br>
> > on pipelining of the state changes<br>
><br>
> I don't think it makes sense to try doing this all in one step.<br>
><br>
> Unless one has resources of Google + commitment for maintaining it, I<br>
> think doing those steps with separate, dedicated tools can be better fit<br>
> for Open Source than trying to maintain a monster that tries to do<br>
> everything of analyzing:<br>
> - whether performance issue is on GPU side, CPU side, or code being too<br>
> synchronous<br>
> - where the bottlenecks are on GPU side<br>
> - where the bottlenecks are on CPU side<br>
> - what are the sync points<br>
<br>
I mean, google has a team working on perfetto, so we kinda are getting<br>
the tool here for free, all we need to do here is instrumentation for<br>
the mesa part of the system..<br>
<br>
Currently, if you look at<br>
<a href="https://chromeos.dev/en/games/optimizing-games-profiling" rel="noreferrer" target="_blank">https://chromeos.dev/en/games/optimizing-games-profiling</a> the<br>
recommendation basically amounts to "optimize on android with<br>
snapdragon profiler/etc".. which is really not a great look for mesa.<br>
(And doesn't do anything for intel at all.) Mesa is a great project,<br>
but profiling tooling, especially something for people other than mesa<br>
developers, is a glaring weakness. Perfetto looks like a great<br>
opportunity to fix that, not only for ourselves but also game<br>
developers and others.<br>
<br>
BR,<br>
-R<br>
<br>
> IMHO:<br>
> - Overall picture should not have too many details, because otherwise<br>
> one can start chasing irrelevancies [1]<br>
> - Rest of analysis works better when one concentrates on one performance<br>
> aspect (shown by the overall picture) at a time. So that activity<br>
> could have a tool dedicated for that purpose<br>
><br>
><br>
> - Eero<br>
><br>
> [1] Unless one has HW assisted tool that really can tell *everything*<br>
> like ARM ETM and Intel PT with *really good* post-processing &<br>
> visualization tooling. I don't think they are usable outside of large<br>
> companies though because of HW requirements and using them taking a lot<br>
> of time / expertise (1 sec trace is gigs of data).<br>
><br>
> PS. For checking on shader compiles, I've used two steps:<br>
> * script to trace frame updates & shader compiles (with ftrace uprobe on<br>
> appropriate function entry points) + monitor CPU usage & GPU usage (for<br>
> GPU, freq or power usage is enough)<br>
> -> shows whether FPS & GPU utilization dip with compiles. Frame<br>
> updates & compiles are rare enough that ftrace overhead doesn't matter<br>
><br>
> * enable Mesa shader debugging, because in next step one wants to know<br>
> what shaders they are and how they're compiled<br>
><br>
> _______________________________________________<br>
> mesa-dev mailing list<br>
> <a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br>
> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</blockquote></div>