[Mesa-dev] Perfetto CPU/GPU tracing
primiano at chromium.org
Thu Feb 18 11:17:56 UTC 2021
I'm one of the authors and maintainers of Perfetto, also +skyostil at .
I am really sorry for the giant bulk reply. I'll try to do my best to
answer the various open questions about Perfetto but I don't know a
better way than some heavy <snip>-ing given I'm joining the party late.
- Yep, so far the only distribution we have for the SDK is a C++
amalgamation. I am aware that it isn't great for Linux OSS projects; it
was heavily optimized for Google projects that have the habit of statically
linking everything.
- There are plans to move beyond that and have a stable C API (docs
linked below). But that will take us quite some time. We should probably
figure out some intermediate solution meanwhile.
- I'd be really keen to learn how Mesa is intending to do tracing. That
can heavily influence our upcoming design. Begin/end markers are IMO the
least interesting thing as they tend to work in whatever form and are
easy to abstract. Richer/structured trace points like
( currently used by Android GPU drivers ) are more interesting and
where most of the challenges lie.
- Maybe the discussion here needs to be split into: (1) a shorter-term
plan to iterate, figure out what works and what doesn't, and see what the
end result looks like; (2) a longer-term plan for what the API surface
should look like.
I don't have strong opinions on how Mesa should proceed here and you
don't need an extra cook in the kitchen. If I really had to express a
handwavy opinion, my best advice would be to start with something you
can iterate on right now, maybe behind some compile-time flag, and come
up with a plan on how to turn it into a production thing later on. We are
interested to hear your feedback and adjust the design of our stable C API.
On the tracing library / C++ vendoring / stable C API:
The way the Perfetto SDK is organized today is mainly influenced-by and
designed-for the way Google handles its projects, which boils down to:
(i) statically link everything, to minimize the testing matrix; (ii)
move fast and refactor all dependencies as needed.
It's all about "who pays the maintenance cost and when?". This tends to
work well in a large company which: (i) has a giant repo which allows
~atomic cross-project changes; (ii) has the resources to keep everything
up to date.
I am perfectly aware this is neither appealing nor ideal for open source
projects and, more generally, for the way libraries in Linux
distributions work. I hear you when you say "vendoring [...] seems a bad
idea". Yes, it implicitly pushes the burden of up-revving onto the
"depender". [that's bad]
We are committed to maintaining ABI stability of our tracing protocol and
socket (see https://perfetto.dev/docs/design-docs/api-and-abi). This is
because Chrome, Android, and now CrOS, and tools like gpuinspector.dev,
which all statically link perfetto in some form, have strongly different
release cycles. [that's good]
We also try not to break the C++ API too much, as robdclark@ found when
updating through our v3..v12 monthly releases [that's good]. But
that C++ API has too wide a surface and we can't commit to making it a
fully stable API. Nor can we make the current C++ ABI stable across a
.so boundary (the C++ SDK today heavily depends on inlines to allow the
compiler to see through the layers). [that's bad]
For this reason, we recently started making plans to change the way we
do things to meet the needs of open source projects and not just
Google-internal ones. [that's good]
Specifically (Note: to open the docs below you need to join
https://groups.google.com/forum/#!forum/perfetto-dev to inherit the ACLs):
1. https://bit.ly/perfetto-debian has a plan to distribute tracing
services and SDK as standard pkg-config based packages (at least for
Debian. We'll rely on volunteers for other distros)
2. https://bit.ly/perfetto-c has a plan + ongoing discussion for having
a long-term stable C API in a libperfetto.so . The key here for us
(Perfetto) is identifying a subset of the wider C++ API that fits the
bill for projects out there and that we are comfortable maintaining
The one thing I also need to be very clear on, though, is that both the
perfetto-debian and perfetto-c discussions are very recent and it will take
a while for us to get there. We can't commit to a specific timeline
right now, but if I had to make an educated estimate I'd say more
towards end-of-2021. [that's bad]
I'd also be more keen to commit once there are concrete use-cases,
ideally with iterations/feedback from a project like Mesa.
[obligatory reference at this point: https://youtu.be/Krbl911ZPBA?t=22]
> Just set uprobe for suitable buffer swap function , and parse
kernel ftrace events. (paraphrasing for context: "why do we need
instrumentation points? why can't we just use uprobes instead?")
The problems with uprobes are:
1. They are Linux-specific. Perhaps not a big problem for Mesa here, but the
reason why we didn't go there with Perfetto, at least until now, is that
we need to support all major OSes (Linux, CrOS, Android, Windows, macOS).
2. Even on Linux-based systems, it's really hard to have uprobes enabled
in production (I am not sure what the situation is for CrOS). At Google,
we care a lot about being able to trace from production devices without
reflashing them with dev images, because then we can just tell people
that are experiencing problems "can you just open chrome://tracing,
click on a button and give us debugging data?". Re-flashing reduces by
orders of magnitude the actionable feedback we'd be able to get from
users / testers.
The challenge with uprobes is that they rely on dynamic rewriting of
.text pages. Whenever I mention that, a platform security team reacts
like the Frau Blucher horses (https://youtu.be/bps5hJ5DQDw?t=10), with
> Perfetto appears to be for Android / Chrome(OS?), and not available
from in common Linux distro repos.
That's right. I am aware of the problem. The plan is to address it with
bit.ly/perfetto-debian as a starter.
> Perfetto seems like an awful lot of infrastructure to capture trace
events. Why not follow the example of GPUVis, and write generic
trace_markers to ftrace?
In my experience ftrace's trace_marker:
1. Works for very simple types of events (e.g.
begin-slice/end-slice/counters) but doesn't work with richer / structured
event types like the ones linked above, as that gets into
stringification format, marshaling costs and interop.
2. Writing into the marker has some non-trivial cost (IIRC 1-10 us on
Android), as it involves a round trip into and back from the kernel.
3. It leads to one global ring buffer, where fast events push out slower
ones, which is particularly problematic if you ever enable sched/*
events. At the userspace level, instead, we can dynamically route
different events to different buffers to mitigate this problem (see
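To make point 3 concrete, here is a small simulation of the eviction problem. It is only a sketch in plain Python: the `record` helper, the event names and the buffer capacity are made up for illustration and have nothing to do with the actual ftrace or Perfetto buffer implementations.

```python
from collections import deque

def record(events, capacity, route=None):
    """Replay (category, payload) events into fixed-size ring buffers.

    `route` maps a category to a buffer name. With route=None every event
    lands in one shared buffer, like a single global ring buffer.
    """
    buffers = {}
    for category, payload in events:
        name = route(category) if route else "global"
        # deque(maxlen=...) drops the oldest entry once it is full.
        buffers.setdefault(name, deque(maxlen=capacity)).append((category, payload))
    return buffers

# One rare event followed by a burst of high-frequency scheduler events.
events = [("gpu", "submit #1")] + [("sched", f"switch #{i}") for i in range(100)]

shared = record(events, capacity=16)
split = record(events, capacity=16, route=lambda category: category)

print(("gpu", "submit #1") in shared["global"])  # False: pushed out by sched events
print(("gpu", "submit #1") in split["gpu"])      # True: kept in its own buffer
```

Note that the split case spends one buffer per category, so the comparison is about isolation between event sources rather than about equal memory budgets.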
> But maybe we should verify that MSVC is happy with it
FYI I have some work in progress to fully support the tracing protocol
on Windows. With https://r.android.com/1539396, the tracing services and
the tracing SDK build and run with both clang-cl and MSVC 2019. We are
figuring out some final details (e.g., whether to use AF_UNIX and
support only Win10+, or use a TCP socket)
> AFAIU, perfetto can operate in self-server mode, which seems like it
would be useful for distros which do not have the system daemon.
That's correct. Which is also a reason why the amalgamated source is
that large, as it also contains all the bits to run the service
in-process. (If you link with -Wl,--gc-sections or equivalent, they will
be stripped away though)
>Is it possible to build and run the Perfetto UI locally?
Yes, the UI is fully open source and fully client-only. It can be built
from the same repo by just running `tools/install-build-deps --ui` +
`./ui/build --serve` . See
> Can it display arbitrary trace events that are written to
Today the only thing that the UI displays, w.r.t trace_marker, are
events that match the syntax that Android's systrace came up with back
it boils down to a format like "B|$TGID|event_name" to begin a slice,
"E|..." to end it and so on.
Curious to hear about other uses of the trace marker and consider
importing other formats.
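For reference, a minimal sketch of what emitting that syntax looks like from userspace. The helper names are made up, the end marker is assumed to be "E|<tgid>" (the exact payload is not spelled out above), and the tracefs path is assumed to be /sys/kernel/tracing/trace_marker and writable by the process.

```python
import os

def begin_marker(tgid, name):
    """Format a begin-slice line: B|<tgid>|<event_name>."""
    return f"B|{tgid}|{name}"

def end_marker(tgid):
    """Format an end-slice line (assumption: only the tgid follows 'E')."""
    return f"E|{tgid}"

def write_marker(line, path="/sys/kernel/tracing/trace_marker"):
    # Assumption: tracefs is mounted at this path and we may write to it.
    # Each write becomes one event in the ftrace buffer.
    with open(path, "w") as marker:
        marker.write(line)

tgid = os.getpid()  # getpid() returns the thread-group id on Linux
print(begin_marker(tgid, "swap_buffers"))
print(end_marker(tgid))
```

Event names containing "|" or newlines would confuse a parser of this format, so a real emitter would need to sanitize them.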
> Can it be extended to show i915 and i915-perf-recorder events?
It depends on what you want to extend:
1. If you want to extend the import logic and map custom events to
existing UI concepts, you just need to touch the C++ code in
//src/trace_processor (that code runs in the Web UI on the client via
WebAssembly). Good starting examples are:
i. the code that imports ninja-build logs, which is a completely custom
format, unrelated with perfetto's proto format or ftrace:
ii. the code that deals with Android's usage of the trace_marker
(those "B|<tgid>|name" mentioned above):
We are generally happy to accept patches for custom importers, as long
as they have some testing (an input trace and expected output from
queries) so we can tell if we break it while refactoring our internal code.
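To illustrate the "input trace + expected output" shape of such a test, here is a toy importer for the "B|<tgid>|name" markers mentioned above. This is a plain Python sketch and not trace_processor code (the real importers are C++); the function name, the "E|<tgid>" end marker and the tuple layout are assumptions made for the example.

```python
def import_markers(lines):
    """Turn (timestamp, marker) pairs into (tgid, name, start, end) slices."""
    open_slices = {}  # tgid -> stack of (name, start), to handle nesting
    slices = []
    for ts, marker in lines:
        kind, _, rest = marker.partition("|")
        if kind == "B":
            tgid, _, name = rest.partition("|")
            open_slices.setdefault(int(tgid), []).append((name, ts))
        elif kind == "E":
            # Assumption: "E|<tgid>" closes the most recently opened slice.
            stack = open_slices.get(int(rest), [])
            if stack:
                name, start = stack.pop()
                slices.append((int(rest), name, start, ts))
    return slices

# Golden test: a checked-in input trace and the output we expect from it.
INPUT_TRACE = [
    (100, "B|7|frame"),
    (110, "B|7|draw"),
    (150, "E|7"),
    (190, "E|7"),
]
EXPECTED = [
    (7, "draw", 110, 150),
    (7, "frame", 100, 190),
]
assert import_markers(INPUT_TRACE) == EXPECTED
```

If a refactoring changes how nested slices are closed, the comparison against EXPECTED fails, which is the kind of signal such a test is meant to give.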
2. If you want to extend the UI logic with custom widgets:
This is the part that is tough today and requires a lot of insider
knowledge. All the code is there, but today it is not really architected in
a contributor-friendly way. We have a longer term plan to allow
customization of the UI with some extension API on the JS/TS layer, but
it's still early days for that and getting there will take longer.
> especially if the ui code stops working with our forked version
1. You can always build a blessed version of the UI and host it on any
HTTP server of your choice. Our /ui/build generates a fully static
HTML+Js+Wasm site. It doesn't need any server-side magic. It doesn't
have dependencies on Google infra. Even just `python -m http.server`
2. We are changing the way our UI deployment works (this month) and will
always leave the old versions around. This means that, while we will
keep up-revving the main UI instance @ https://ui.perfetto.dev ~monthly,
we will allow people to go back to older versions via a link like
https://ui.perfetto.dev/v1.2.3/. You can see this in action on
https://testing-dot-perfetto-ui.wl.r.appspot.com/v12.1.172/ which I just
pushed last week to test this new deployment mechanism.
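As an illustration of point 1, a static site like this can be served with nothing but the Python standard library. This is a sketch, not part of the Perfetto tooling: `make_static_server` and `serve` are made-up helpers, and the directory you pass in is wherever your UI build put its output (the path in the comment is a placeholder).

```python
import functools
import http.server

def make_static_server(directory, port=8000):
    """Return an HTTP server that serves the files under `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind to localhost only; change the address to expose it on a network.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)

def serve(directory, port=8000):
    """Serve `directory` until interrupted with Ctrl-C."""
    server = make_static_server(directory, port)
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()

# Example (placeholder path): serve("path/to/ui/build/output")
```

Pinning a known-good UI build this way means a forked or older trace format keeps working regardless of what the main hosted instance does.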
Happy to talk more about all this, either here on the list or on any
If you have some more perfetto-related questions/comments/criticism feel
free to drop by our discord channel https://discord.gg/35ShE3A or ML