[Mesa-dev] Perfetto CPU/GPU tracing

Rob Clark robdclark at gmail.com
Sat Feb 13 00:36:55 UTC 2021

On Thu, Feb 11, 2021 at 5:40 PM John Bates <jbates at chromium.org> wrote:


> Runtime Characteristics
> ~500KB additional binary size. Even with using only the basic features of perfetto, it will increase the binary size of mesa by about 500KB.

IMHO, that size is negligible.. looking at freedreno, a mesa build
*only* enabling freedreno is already ~6MB.. distros typically use
"megadriver" (i.e. all the drivers linked into a single .so, with hard
links for the different ${driver}_dri.so), which on my fedora laptop
is ~21MB.  If anything is relevant, it is how much of that actually
gets paged into RAM from disk, but I think 500KB isn't a thing to
worry about too much.

> Background thread. Perfetto uses a background thread for communication with the system tracing daemon (traced) to advertise trace data and get notification of trace start/stop.

Mesa already tends to have plenty of threads.. some of that depends on
the driver, I think currently radeonsi is the threading king, but
there are several other drivers working on threaded_context and async
compile thread pools.

It is worth mentioning that, AFAIU, perfetto can operate in
self-server mode, which seems like it would be useful for distros
which do not have the system daemon.  I'm not sure if we lose that
with percetto?

> Runtime overhead when disabled is designed to be optimal with one predicted branch, typically a few CPU cycles per event. While enabled, the overhead can be around 1 us per event.
>
> Integration Challenges
> The perfetto SDK is C++ and designed around macros, lambdas, inline templates, etc. There are ongoing discussions on providing an official perfetto C API, but it is not yet clear when this will land on the perfetto roadmap.
> The perfetto SDK is an amalgamated .h and .cc that adds up to 100K lines of code.
> Anything that includes perfetto.h takes a long time to compile.
> The current Perfetto SDK design is incompatible with being a shared library behind a C API.

So, C++ on its own isn't a showstopper, mesa has plenty of C++ code.
But maybe we should verify that MSVC is happy with it, otherwise we
need to take a bit more care in some parts of the codebase.

As far as compile time goes, I wonder if we can regenerate the .cc/.h
with only the gpu trace parts?  But I wouldn't expect the .h to be
something widely included.  For example, for gpu timeline traces in
freedreno, I'm expecting it to look like a freedreno_perfetto.cc with
extern "C" {} around the callbacks that would hook into the
u_tracepoint tracepoints.  That one file would pull in the perfetto
.h, and we'd just not build that file if perfetto was disabled.
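
A rough sketch of what such a file could look like, purely for
illustration.  The fd_perfetto_* names are made up, and the perfetto
side is replaced by a stand-in recorder so the sketch compiles without
the SDK; a real version would include perfetto.h in this one file and
emit trace packets instead.

```cpp
// freedreno_perfetto.cc (hypothetical sketch, not actual Mesa code)
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for the perfetto data source.  This is an assumption made
// for the example: the real file would include perfetto.h here and
// write trace packets rather than filling a vector.
namespace {

struct GpuEvent {
   std::string name;
   uint64_t ts_ns;
   bool begin;
};

std::vector<GpuEvent> g_events;

void emit(const char *name, uint64_t ts_ns, bool begin)
{
   g_events.push_back({name, ts_ns, begin});
}

} // namespace

// C-callable entry points.  The C side of the driver only ever sees
// these prototypes (normally declared in a small shared header), so
// nothing outside this file needs to parse the C++ SDK header.
extern "C" {

void fd_perfetto_start_render_pass(uint64_t ts_ns)
{
   emit("render_pass", ts_ns, true);
}

void fd_perfetto_end_render_pass(uint64_t ts_ns)
{
   emit("render_pass", ts_ns, false);
}

// Debug helper: number of events recorded so far.
unsigned fd_perfetto_event_count(void)
{
   return static_cast<unsigned>(g_events.size());
}

} // extern "C"
```

The point of the layout is that the C++ (templates, lambdas, the large
header) stays confined to a single translation unit, and the build
simply leaves that unit out when perfetto is disabled.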

Overall having to add our own extern C wrappers in some places doesn't
seem like the *end* of the world.. a bit annoying, but we might end up
doing that regardless if other folks want the ability to hook in
something other than perfetto?
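
One possible shape for such a hook layer, as a sketch only: a small
table of C callbacks that any backend (perfetto or otherwise) could
fill in.  All names here (sketch_trace_*, counting_backend) are
hypothetical and do not correspond to existing Mesa interfaces.

```cpp
// Hypothetical backend-agnostic hook layer (illustrative names only).
#include <cstdint>

extern "C" {

// Callbacks a tracing backend provides.
struct sketch_trace_backend {
   void (*begin)(const char *name, uint64_t ts_ns);
   void (*end)(const char *name, uint64_t ts_ns);
};

// nullptr means no backend is installed and events are dropped.
static const struct sketch_trace_backend *active_backend = nullptr;

void sketch_trace_set_backend(const struct sketch_trace_backend *backend)
{
   active_backend = backend;
}

void sketch_trace_begin(const char *name, uint64_t ts_ns)
{
   if (active_backend)
      active_backend->begin(name, ts_ns);
}

void sketch_trace_end(const char *name, uint64_t ts_ns)
{
   if (active_backend)
      active_backend->end(name, ts_ns);
}

// Example backend that only counts the events it receives.
static unsigned counted_events = 0;

static void count_begin(const char *, uint64_t) { counted_events++; }
static void count_end(const char *, uint64_t) { counted_events++; }

static const struct sketch_trace_backend counting_backend = {
   count_begin,
   count_end,
};

} // extern "C"
```

With no backend installed the calls reduce to a null check, and a
perfetto-based backend would just be one more table registered through
sketch_trace_set_backend().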


> Mesa Integration Alternatives

I'm kind of leaning towards the "just slurp in the .cc/.h" approach..
that is mostly because I expect to initially just add some basic gpu
timeline tracepoints, but over time iterate on adding more.. it would
be nice to not have to depend on a newer version of an external
library at each step.  That is ofc only my $0.02..

