Apitrace based frame retracer

Mon Jun 8 06:52:10 PDT 2015

On Tue, Jun 2, 2015 at 1:55 AM, Mark Janes <mark.a.janes at intel.com> wrote:

> José Fonseca <jose.r.fonseca at gmail.com> writes:
>
> > On Fri, May 29, 2015 at 11:50 PM, Mark Janes <mark.a.janes at intel.com> wrote:
> > 2) The concept of frame is only well-defined on single OpenGL contexts /
> > single-threaded traces.  If you have multiple contexts/threads, frames are
> > ill-defined, and it's hard if not impossible to find a range of calls where
> > it's safe to loop.
>
> Could this case be handled by allowing the user to select which context
> they want to debug?
>

Potentially.

> The multi-context games that I've seen generally have a second context
> to overlay ads or other content which is not the core workload.  The
> other example is ChromeOS, which IIRC renders each tab in a separate
> context.  My hope is that the complex GL workloads on Chrome are
> benchmarks that can be easily captured/analyzed/optimized on a more
> accessible and typical platform.
>
> Is there another category of multi-context optimization that you think
> is important?
>

I was thinking more in terms of debugging than optimization.

Another category is GPU emulators (virtualization, or console emulators,
Android emulators, etc). But I admit that one is quite niche.

> > 4) There are many applications which don't have regular frames (e.g., OpenGL
> > accelerated UI toolkits, web browsers using GL for composition); they only
> > render new frames as a response to user events, so every frame might end up
> > being different.
>
> I'm not sure how helpful a frame analysis tool would be for these
> cases.  A system trace that includes both UI inputs and the resulting GPU
> events would be more helpful in identifying latency.
>

If the frame analysis tool's only concern is performance (as GPA Frame
Analyzer's seems to be), then the answer would be: no, not useful.

But if the tool is for frame debugging and optimization, then yes.

And in your initial email you did describe this as a "frame debug/optimization
tool" -- that's what I've been assuming in my replies.

Even if performance is the major/only use case for this tool, this idea
seems a good way to speed up state dump lookups, which seems particularly
useful for debugging. So even if this tool has an independent life as an
optimization tool, I think that there's merit in getting the core concept of
this into qapitrace so we can speed up the state dump lookups when possible.

> > 5) Even "regular" applications might have "irregular" frames -- e.g.,
> > maybe the player of a first-person shooter entered a new area, and new
> > objects were created, old ones deleted -- whereby replaying that frame in
> > a loop will lead to corrupted/invalid state.
>
> I was trying to think about these cases as well.
>
>  * new objects: Retracing this frame would result in apitrace constantly
>    re-creating these objects, correct?  This would constitute a resource
>    leak during retrace, but the frame would still render correctly.  It
>    seems feasible to track resource generation during retrace.
>
>  * deletion of old resources: This would constitute a double-deletion on
>    retrace, which in many cases would be ignored or generate a GL error.
>    It would be curious for an app to use then delete a resource in a
>    single frame.  Retrace could skip deletions in the target frame if it
>    is a problem.
>
> I think it is more common for apps to create all the resources in
> advance, then bind them as needed directly before rendering.
>

Yes, I think so too.

Also note that, if you omit these create/destroys when replaying, you'll be
significantly adulterating the performance.
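
To make the create/delete handling discussed above concrete, here is a
minimal Python sketch of how a looped replay might filter the calls of the
target frame. It is only an illustration under stated assumptions: calls are
modeled as (name, args) tuples, `is_create`, `is_delete` and
`calls_for_iteration` are hypothetical helpers (not apitrace functions), and
classifying calls by name prefix is a simplification. As noted above,
dropping these calls changes what is actually being timed.

```python
# Hypothetical sketch, not apitrace code. Calls are (name, args) tuples.

def is_create(name):
    """True for calls that allocate GL object names (simplified, by prefix)."""
    # glGenerateMipmap shares the "glGen" prefix but creates no object.
    if name.startswith("glGenerate"):
        return False
    return name.startswith(("glGen", "glCreate"))


def is_delete(name):
    """True for calls that delete GL objects (simplified, by prefix)."""
    return name.startswith("glDelete")


def calls_for_iteration(frame_calls, iteration):
    """Return the calls of one frame to replay on a given loop iteration.

    Deletions are always skipped, so objects survive for the next pass.
    Creations are replayed on the first pass only (iteration 0), so the
    loop does not allocate a new set of objects every time around.
    """
    selected = []
    for name, args in frame_calls:
        if is_delete(name):
            continue
        if iteration > 0 and is_create(name):
            continue
        selected.append((name, args))
    return selected
```

With a frame that generates a texture, draws, and deletes the texture, the
first pass would replay the generation and the draw, and every later pass
would replay only the draw.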

I think that, rather than trying to make it work there, it might be better
to say,

> > In short, the way I see it, this "frame retracing" idea can indeed speed
> > up lookups for apitrace users in many circumstances, but it should be
> > something that users can opt-in/out, so that they can still get work done,
> > when for one reason or another the assumptions made there just don't hold.
> >
> > Of course, if your goal is just profiling (and not debugging bad
> > rendering), then it should be always possible to find frames that meet the
> > assumptions.
> >
> >
> >>
> >>  * Should a tool like this be built within Apitrace, or should
> >>    it have its own repo and link against Apitrace?  Because it
> >>    communicates through a socket, my POC picks up protocol buffers as
> >>    a dependency.  There is a bunch of threading/socket/rpc
> >>    infrastructure that won't apply elsewhere in Apitrace.
> >>
> >
> > I see overlap between what's been proposed, and what apitrace already
> > does or should do one day.  In particular:
> >
> > - no need for a separate frame retrace daemon executable -- this can easily
> > be achieved on the existing retrace executables, by adding a new option to
> > "keep the process alive", which would keep the retrace looping over the
> > most recent frame, until it gets a request to dump an earlier call
> >
> > - sockets are not strictly necessary either -- one could use the stdin
> > (just like we use stdout for output)  (of course, we could one day replace
> > stdin/out with sockets to cleanly allow retrace in a separate machine, but
> > it's orthogonal to what's being proposed)
>
> I chose sockets to enable the remote machine use case.  Performance
> analysis is the motivating capability for my team.  It is especially
> needed on under-powered devices that would struggle running an apitrace
> UI.  Tablets and other form factors cannot be analyzed without a remote
> connection.
>

FYI, https://github.com/apitrace/apitrace/pull/311 already added something
like that for Android. But it would indeed be nice to generalize.

> > - I don't think there's a place for another GUI tool -- something that
> > resembles qapitrace but doesn't completely replace it -- in the apitrace
> > tree.  For this to be merged, the frame retrace UI would have to be fully
> > integrated with qapitrace, not something on the side.
>
> I understand your concerns.  I have some doubts about the use of Qt4
> widgets for qapitrace, and was planning to build features in qml.  I
> would like to understand your thoughts on extending qapitrace as Qt
> moves further away from widgets.
>

I actually have some concerns with QML precisely due to the use of OpenGL.

But if everybody agrees QML is the future, I have no problems in migrating
the GUI to it. (In short, I'm flexible on anything except two completely
disjoint GUIs.)

> I also think the json interface between qapitrace and glretrace may be
> inadequate for a more complex tool.  Hand-rolling a new parser to
> exchange a new structured data type is tedious and buggy in my
> experience.
>

Actually that has been done already: JSON has been replaced with UBJSON on
master for a few weeks now.  It's even possible to choose the dump format
(JSON vs UBJSON) as a glretrace command line option, and it wouldn't be
difficult to add another.

> >   In fact everything that works in a "frame retrace"  should work with
> > "full trace" mode too.  The frame vs full should be a simple switch
> > somewhere.
> >
> > - Editing live state: (e.g., where you say "the user will be able to edit
> > the bound shaders", etc), but I don't see exactly how one would achieve
> > that. Currently qapitrace doesn't allow changing state directly, but
> > rather allows editing the calls that set the state.
>
> My plan was to create an api for setting state, which would result in
> the insertion of new GL calls directly before the target render.
>
> The retrace api for setting new shaders would compile and link them
> up front, and pass back error state if any.  On retrace, the new program
> id would be inserted with glUseProgram directly before the render.
>
> My contention is that redundant/overlapping state and binding commands
> have little or no performance impact.
>
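
As an illustration of the proposal quoted above, here is a minimal Python
sketch of inserting a glUseProgram call directly before the target render.
It assumes calls are modeled as (name, args) tuples; the helper name
`insert_program_override` and the integer program id are hypothetical, not
part of apitrace, and compiling/linking the replacement program is assumed
to have happened already.

```python
# Hypothetical sketch, not apitrace code. Calls are (name, args) tuples.

def insert_program_override(calls, render_index, program_id):
    """Return a copy of `calls` with glUseProgram(program_id) placed
    directly before the render call at `render_index`.

    The input list is left unmodified.
    """
    if not 0 <= render_index < len(calls):
        raise IndexError("render_index is outside the call list")
    override = ("glUseProgram", (program_id,))
    return calls[:render_index] + [override] + calls[render_index:]
```

Note that, in this simple form, later renders that relied on the originally
bound program would also pick up the override until the trace rebinds a
program itself. Restoring the previous binding would require knowing what
was bound, which runs into the state tracking concern raised below.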
> > - State tracking inside glretrace: state tracking might seem easy at first
> > but it's really hard to get right in the general case (and should be
> > avoided as much as possible since a state tracker doesn't know when the
> > application emits errors, and can easily diverge).  So, IMO, there should
> > be only one implementation of state tracking in apitrace, and that should
> > be in trimming (plus double-duty of x-referencing the trace later on).
> > Adding a bit of state tracking here, a bit of state tracking there, is a
> > recipe for introducing bugs and duplicate code in a lot of places.
>
> I agree with you completely.  I've written a gles1/2/3 state tracker,
> and it's not a small task.  I hacked some shader tracking into my
> prototype because it was a quick way to connect compile time shader
> assemblies with bound shaders.
>
> OTOH, "query all bound state" can take a while at run time, and mixing
> it with requests for metrics and render targets may result in latency
> that makes the application unusable.
>
> > In short, to have this in apitrace tree would involve a compromise -- you'd
> > compromise some of your goals and flexibility, and in exchange have more
> > reuse with the rest of apitrace tree, hence less code to write and maintain
> > by yourself.  But it's really up to you.
> >
> >
> > My goal here is to ensure the scope of apitrace stays within manageable
> > limits, so that the things it can do it can do well.  I'd rather have
> > slow yet reliable results, than have very quick but "YMMV" like results.  I
> > also can't accept things that would make the existing functionality too
> > complex, or a maintenance headache.
>
> Keeping the mechanism reliable has been a great strategy for apitrace.
> I have a great deal of respect for the work you've done, and would be
> very happy if I could produce something that was useful enough to
> include in apitrace.  Also, collaboration with folks wanting to target
> other gpus would be much more likely as part of a widely-used project
> like apitrace.
>

Yes.

>
> >>  * I would like to make use of the metrics work that is being done
> >>    this summer.  I'm eager to see the proposed metrics abstractions as
> >>    the details are worked out.
> >>
> >>  * Are there more use cases that I should consider, beyond what is
> >>    described in the wiki?
> >
> >
> > Honestly, unless you command an army of developers, I suspect there are
> > already too many use cases in there!  (In particular, the state
> > tracking/editing as I said above.)
> >
> > But if you want one more, something that makes a lot of sense on a frame
> > analysis for debugging is a pixel history --
> > https://github.com/apitrace/apitrace/issues/317
>
> Pixel history is a popular feature in Frame Analyzer.  I think it would
> be fairly easy to implement with a frame retracer:
>
>   * clear the framebuffer with an unusual color before each render
>

Unfortunately that won't work with all blend modes (particularly those that
take the destination alpha to modulate the source).

But the rest seems sensible.

>
>   * compare all pixels after each render.  Changed pixels go into a bloom
>     filter for that render.

>   * repeat with a second color, if you want to be completely accurate.
>
>   * when user requests "select all renders that affected this pixel",
>     iterate over the bloom filters and check for membership.
>

The only other difficulty is tracking FBO changes -- they can be quite
common nowadays.
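
For reference, the steps quoted above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: framebuffers
are modeled as nested lists of color tuples, each "render" is a callable
that takes a framebuffer and returns a new one, and `BloomFilter`,
`changed_pixels`, `build_history` and `renders_affecting` are hypothetical
names, not apitrace code. It does not address the blend mode or FBO caveats
mentioned above, and it uses a single clear color.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            data = repr((i, key)).encode()
            digest = hashlib.blake2b(data, digest_size=8).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def changed_pixels(before, after):
    """Yield (x, y) for every pixel that differs between two framebuffers."""
    for y, (row_before, row_after) in enumerate(zip(before, after)):
        for x, (old, new) in enumerate(zip(row_before, row_after)):
            if old != new:
                yield (x, y)


def build_history(clear_color, renders, width, height):
    """Build one Bloom filter per render, holding the pixels it changed."""
    filters = []
    for render in renders:
        # Clear to an unusual color before each render, then diff.
        cleared = [[clear_color] * width for _ in range(height)]
        result = render(cleared)
        bloom = BloomFilter()
        for xy in changed_pixels(cleared, result):
            bloom.add(xy)
        filters.append(bloom)
    return filters


def renders_affecting(filters, xy):
    """Return indices of renders whose filter reports pixel xy as changed."""
    return [i for i, bloom in enumerate(filters) if xy in bloom]
```

Because a Bloom filter can report false positives, a real implementation
would presumably confirm each candidate render by replaying it. Repeating
the pass with a second clear color, as suggested in the quoted text, would
catch pixels that a render happens to write with the first clear color.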

>
> Pixel history is more valuable when you have an overdraw visualization
> of the frame buffer (brighter pixel => more gpu cost).  Developers will
> want to analyze the history of the most expensive pixels.  Overdraw can
> be built from the same data as pixel history.
>
> This is a feature which is more likely to benefit game developers as
> compared to driver developers.  I am focusing on the driver use cases
> first.
>

Right.

Jose