Apitrace based frame retracer

Mon Jun 1 13:16:43 PDT 2015

On Fri, May 29, 2015 at 11:50 PM, Mark Janes <mark.a.janes at intel.com> wrote:

> I have spent some time prototyping a frame debug/optimization tool
> based on Apitrace, and I'd like to get feedback.
>
> Apitrace's retrace functionality limits the usability of qapitrace,
> because it invokes a full retrace whenever qapitrace needs more
> information.  This creates big delays as the user explores the trace
> file.
>
> Finding bugs and bottlenecks in a complex GPU workload involves
> exploration and experimentation.  Users need a more interactive
> experience than qapitrace can provide.
>
> I've done some hacking to set up a server process which retraces a
> trace file to a specified frame, then accepts subsequent retrace
> requests for renders within the frame.  Because the server process
> preserves the GL state from previous frames, it can execute any frame
> retrace request in the time it took the original app to render the
> frame.
>
> My proof-of-concept branch currently displays frame buffer images when
> a user selects a render, with the minor modification that glClear is
> called before the render, so the framebuffer only shows pixels which
> were rendered by the selected call.  Also, it parses shader assemblies
> from the "INTEL_DEBUG=vs,ps" setting, and displays the IR and assembly
> for the render.  These features were chosen to minimally
> demonstrate the interactivity that can be accomplished with this
> approach.
>
> I've set up a wiki describing my apitrace branch, and the features I'd
> like to build with a frame retracer:
>
> https://github.com/janesma/apitrace/wiki/frameretrace-branch
>
> The wiki has some screen shots of the features I listed above.
>
> I'd like to get some input on the following:
>

First of all, I think there's a lot of potential in this idea.

There was a feature request open on batching glretrace dumps --
https://github.com/apitrace/apitrace/issues/51 -- but I never saw this much
potential in it, as I failed to spot the connection between batching and
looping over the calls within a frame.

Now to specifics...

>
>  * Does anyone see technical issues with this approach?
>

There are a few to keep in mind:

1) In the wiki you said:

  "The UI commands the server process to retrace from the beginning of the
frame to a specified draw call and provide results (framebuffer images,
bound state, etc) back to the UI. This is generally safe to do because of
the repeatable nature of GL commands within a frame boundary."

And indeed quite often applications reach a steady state where the calls
executed in two successive frames are practically indistinguishable, where
the GL state at the beginning and end of the frame is virtually the same,
therefore one can play the calls of one particular frame in a loop without
altering behavior.

But IIUC the current implementation of frameretrace does not do that: after
dumping the state mid-frame, it will reset the trace position to the
beginning of the frame.  That will not work -- imagine the calls at the
beginning of the frame assume blending is disabled, and mid-frame blending
is enabled -- so if you jump straight from mid-frame to start-of-frame all
those initial draw calls will misdraw.

There's an easy solution though: always retrace until the end of the frame
before resetting the trace position.
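
To make the blending example concrete, here is a toy model in Python.  It
is only a sketch, not apitrace code: the call list, the state dictionary and
the dump_then_rewind helper are all made up for illustration, and the frame
is assumed to restore its initial blend state in its last call.

```python
# Toy model, not apitrace code: a "call" is either a blend state change
# or a draw that records the blend state it was issued under.
FRAME = [
    ("draw", "background"),    # assumes blending is disabled
    ("enable_blend", None),
    ("draw", "particles"),     # assumes blending is enabled
    ("disable_blend", None),   # the frame restores its initial state
]


def replay(calls, state):
    """Run calls against the mutable state; return (draw, blend) pairs."""
    drawn = []
    for op, arg in calls:
        if op == "enable_blend":
            state["blend"] = True
        elif op == "disable_blend":
            state["blend"] = False
        elif op == "draw":
            drawn.append((arg, state["blend"]))
    return drawn


def dump_then_rewind(stop_after, finish_frame):
    """Replay up to the dump point, optionally finish the frame, then
    replay the whole frame again from its first call."""
    state = {"blend": False}
    replay(FRAME[:stop_after], state)       # state dump happens here
    if finish_frame:
        replay(FRAME[stop_after:], state)   # run the rest of the frame
    return replay(FRAME, state)


naive = dump_then_rewind(3, finish_frame=False)
safe = dump_then_rewind(3, finish_frame=True)
```

With the rewind straight from mid-frame, the background draw runs with
blending still enabled; finishing the frame first gives the same result as
the original run.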

2) The concept of frame is only well-defined on single OpenGL contexts /
single-threaded traces.  If you have multiple contexts/threads, frames are
ill-defined, and it's hard if not impossible to find a range of calls where
it's safe to loop.

3) There are many applications which don't have regular frames (e.g.,
OpenGL-accelerated UI toolkits, web browsers using GL for composition).
They only render new frames in response to user events, so every frame
might end up being different.

4) Even "regular" applications might have "irregular" frames -- e.g.,
maybe the player of a first-person shooter entered a new area, and new
objects were created, old ones deleted -- whereby replaying that frame in
a loop will lead to corrupted/invalid state.
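
One possible way to detect frames that are unsafe to loop, sketched in
Python: compare a state snapshot taken at the start of the frame with one
taken at the end.  This is an assumption on my part, not something
frameretrace or apitrace does today; the snapshots are taken to be plain
dictionaries, and the keys and values below are invented.

```python
def state_diff(before, after):
    """Return {key: (before, after)} for every key whose value differs."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


def frame_is_loopable(before, after):
    """A frame is treated as safe to loop only if it leaves the
    snapshotted state exactly as it found it."""
    return not state_diff(before, after)


# Hypothetical snapshots: blend flag plus the set of live texture names.
frame_start = {"GL_BLEND": False, "textures": (1, 2)}
steady_end = {"GL_BLEND": False, "textures": (1, 2)}
irregular_end = {"GL_BLEND": False, "textures": (2, 3)}  # objects changed

steady_ok = frame_is_loopable(frame_start, steady_end)
irregular_ok = frame_is_loopable(frame_start, irregular_end)
changed = state_diff(frame_start, irregular_end)
```

A check like this only covers whatever state the snapshot captures, so it
can say a frame is unsafe, but it cannot prove that a frame is safe.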

In short, the way I see it, this "frame retracing" idea can indeed speed
up lookups for apitrace users in many circumstances, but it should be
something that users can opt in to or out of, so that they can still get
work done when for one reason or another the assumptions just don't hold.

Of course, if your goal is just profiling (and not debugging bad
rendering), then it should be always possible to find frames that meet the
assumptions.

>
>  * Should a tool like this be built within Apitrace, or should
>    it have its own repo and link against Apitrace?  Because it
>    communicates through a socket, my POC picks up protocol buffers as
>    a dependency.  There is a bunch of threading/socket/rpc
>    infrastructure that won't apply elsewhere in Apitrace.
>

I see overlap between what's being proposed and what apitrace already
does or should do one day.  In particular:

- no need for a separate frame retrace daemon executable -- this can easily
be achieved with the existing retrace executables, by adding a new option
to "keep the process alive", which would keep the retrace looping over the
most recent frame until it gets a request to dump an earlier call

- sockets are not strictly necessary either -- one could use stdin
(just like we use stdout for output).  (Of course, we could one day replace
stdin/stdout with sockets to cleanly allow retrace on a separate machine,
but that's orthogonal to what's being proposed.)

- a lot of the stuff in
https://github.com/janesma/apitrace/wiki/frameretrace-use-cases overlaps
with things qapitrace does, or that we'd like it to do
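
The first two points (a keep-alive mode, with stdin as the request
channel) could look roughly like the Python sketch below.  The "dump N" /
"quit" line protocol, the serve function and the reply strings are all
hypothetical; no such option exists in glretrace, and a real implementation
would replay the frame where this toy merely acknowledges the request.

```python
import io


def serve(requests, replies, frame_first, frame_last):
    """Answer line-based requests until "quit" or end of input.

    requests/replies are text streams (sys.stdin/sys.stdout in a real
    process).  frame_first/frame_last bound the calls of the kept frame.
    """
    for line in requests:
        words = line.split()
        if not words:
            continue
        if words == ["quit"]:
            break
        if len(words) == 2 and words[0] == "dump" and words[1].isdigit():
            call_no = int(words[1])
            if frame_first <= call_no <= frame_last:
                # A real retracer would replay up to call_no and dump here.
                replies.write(f"ok dump {call_no}\n")
            else:
                replies.write(
                    f"error call {call_no} is outside the kept frame\n")
        else:
            replies.write(f"error unknown request: {line.strip()}\n")


# In-memory streams stand in for stdin/stdout in this example.
requests = io.StringIO("dump 120\ndump 7\nbogus\nquit\ndump 130\n")
replies = io.StringIO()
serve(requests, replies, frame_first=100, frame_last=199)
lines = replies.getvalue().splitlines()
```

Because the loop only depends on two text streams, swapping stdin/stdout
for a socket later would not change the request handling.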

There are a few things being proposed that I have serious reservations
about, though:

- I don't think there's a place for another GUI tool -- something that
resembles qapitrace but doesn't completely replace it -- in the apitrace
tree.  For this to be merged, the frame retrace UI would have to be fully
integrated with qapitrace, not something on the side.

  In fact everything that works in "frame retrace" mode should work in
"full trace" mode too.  The frame vs full choice should be a simple switch
somewhere.

- Editing live state (e.g., where you say "the user will be able to edit
the bound shaders", etc.): I don't see exactly how one would achieve
that.  Currently qapitrace doesn't allow changing state directly, but
rather allows editing the calls that set the state.

- State tracking inside glretrace: state tracking might seem easy at first
but it's really hard to get right in the general case (and should be
avoided as much as possible since a state tracker doesn't know when the
application emits errors, and can easily diverge).  So, IMO, there should
be only one implementation of state tracking in apitrace, and that should
be in trimming (plus double-duty of x-referencing the trace later on).
Adding a bit of state tracking here, a bit of state tracking there, is a
recipe for introducing bugs and duplicate code in a lot of places.
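
To illustrate how a state tracker diverges when it can't see errors, here
is a toy Python sketch.  ToyDriver and NaiveTracker are invented for this
example; the only GL behavior relied on is that glEnable with an invalid
capability generates GL_INVALID_ENUM and leaves the state unchanged.

```python
VALID_CAPS = {"GL_BLEND", "GL_DEPTH_TEST"}


class ToyDriver:
    """Stands in for the real GL implementation: invalid calls fail
    with an error code and do not change state."""

    def __init__(self):
        self.enabled = set()

    def enable(self, cap):
        if cap not in VALID_CAPS:
            return "GL_INVALID_ENUM"
        self.enabled.add(cap)
        return "GL_NO_ERROR"


class NaiveTracker:
    """Shadows state by watching calls only; it never sees the error."""

    def __init__(self):
        self.enabled = set()

    def enable(self, cap):
        self.enabled.add(cap)


driver, tracker = ToyDriver(), NaiveTracker()
errors = []
for cap in ["GL_BLEND", "GL_BOGUS"]:   # second call is invalid
    tracker.enable(cap)
    errors.append(driver.enable(cap))

diverged = tracker.enabled - driver.enabled
```

After the invalid call the tracker believes GL_BOGUS is enabled while the
driver does not, and every tracker in the tree would need to get such cases
right independently.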

In short, to have this in the apitrace tree would involve a compromise --
you'd compromise some of your goals and flexibility, and in exchange have
more reuse with the rest of the apitrace tree, hence less code to write and
maintain by yourself.  But it's really up to you.

My goal here is to ensure the scope of apitrace stays within manageable
limits, so that the things it can do, it can do well.  I'd rather have
slow yet reliable results than very quick but "YMMV"-like results.  I
also can't accept things that would make the existing functionality too
complex, or a maintenance headache.

>  * I would like to make use of the metrics work that is being done
>    this summer.  I'm eager to see the proposed metrics abstractions as
>    the details are worked out.
>
>  * Are there more use cases that I should consider, beyond what is
>    described in the wiki?

Honestly, unless you command an army of developers, I suspect there are
too many use cases in there already!  (In particular, the state
tracking/editing, as I said above.)

But if you want one more, something that makes a lot of sense in frame
analysis for debugging is a pixel history --
https://github.com/apitrace/apitrace/issues/317

BTW, I think we should devise a Mesa-specific GL extension to extract GLSL
IR & HW assembly in a clean fashion, instead of parsing driver debug output.

Jose