WIP: "apitrace trim" is starting to get interesting

Fri Dec 2 06:48:33 PST 2011

On Tue, Nov 29, 2011 at 9:15 PM, Carl Worth <cworth at cworth.org> wrote:
> PS. What follows is more-or-less a brain-dump from a session I spent
> with the current (and still immature) "apitrace trim" applying it to a
> real-world case, (getting a minimal trace from a game triggering a
> rendering bug).
>
[..]
> This morning I ran through my first non-trivial case of trimming, and I
> think I got a usefully minimal trace in the end, (and learned a lot
> about what we want here).
>
> Here's the process I went through:
>
>  1. Traced a program until a bug became evident
>
>  2. Brought it up in qapitrace to a frame showing the bug, noted the
>     first and last call numbers from that frame.
>
>  3. Ran my "apitrace trim --trim-gets --calls=FIRST-LAST" to do some
>     initial trimming.
>
> During this step I made two changes to "apitrace trim" as seen here
> previously:
>
>    a. Added unconditional trimming of all calls after the last call
>       specified in the --calls option. This should be safe and
>       uncontroversial. It was very useful for step 5 below.
>
>    b. Disabled the trimming of out-of-range drawing operations. This
>       particular case showed that this trimming is buggy, (and it's a
>       different bug than the FBO issue mentioned above). I haven't
>       chased down that bug yet, but it's clear that we definitely need
>       to be able to easily specify that certain supposedly-safe
>       trimming operations shouldn't be performed to be able to
>       work around bugs like this.
>
>  4. Added a new --trim-all option to "apitrace trim" that
>     unconditionally trims all calls outside the specified range.
>
>  5. Manually iterated with calls of the form:
>
> apitrace trim --trim-all --calls=RANGE bug.trace; glretrace -w bug-trim.trace
>
> Since the trace now ends with the buggy frame it was clear from the
> final frame in the retrace window whether the bug was preserved. I
> iteratively refined my range specification, (sometimes blindly and
> sometimes by poking around inside qapitrace to guess at likely
> boundaries).
>
> The final trim took a trace of over 500 thousand calls down to 54 calls
> while still preserving the bug. It took me roughly 150 steps to find
> that. Obviously, I want to automate that quite a bit more.
>
> Here are some of the things I noticed while going through this:
>
>  * If the glXMakeCurrent call is trimmed away, then glretrace
>    segfaults. So there's obviously a bug somewhere. I didn't
>    investigate where.

This is to be expected. Each GL call is actually a jump through a
dispatch table in TLS memory. If no context is bound, that dispatch
table is often empty, so the thread will try to execute NULL.

The glX/CGL/wgl entrypoints are especially important, and usually
there are not many of them, so it's better to preserve them.

This is a good decision tree IMO:

- sideeffects == NO:  can be safely removed
- sideeffects == YES:
  - call after the call of interest: can be safely removed
  - call before the call of interest:
    - draw call == NO: retain call (*)
    - draw call == YES:
      - framebuffer bound == YES: retain
      - framebuffer bound == NO:
        - there is a clear/swapbuffers call between this call and
          the call of interest: can be removed (**)
        - otherwise: retain

(*) draw calls are those which take most of the time retracing, so
retaining others is not a big deal.

(**) the clear must be complete (i.e., clear render/stencil/depth if existing)

Once you can do this, and also strip the teximage calls for unused
textures, then you should have a quite minimal trace in a robust
fashion.
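
The decision tree above can be sketched in Python as follows. This is
only an illustration, not apitrace code: the Call record and its
fields (side_effects, is_draw, fbo_bound, is_full_clear) are
hypothetical stand-ins for whatever per-call classification the tool
ends up with, and "framebuffer bound" is taken to mean that a
framebuffer object is bound when the draw call is issued.

```python
from dataclasses import dataclass


@dataclass
class Call:
    # Hypothetical per-call record; none of these fields are apitrace's.
    no: int                      # call number in the trace
    name: str
    side_effects: bool = True
    is_draw: bool = False
    fbo_bound: bool = False      # assumed: an FBO is bound at this call
    is_full_clear: bool = False  # complete clear, or a swapbuffers call


def can_trim(calls, index, target_no):
    """Return True if calls[index] can be removed per the decision tree."""
    call = calls[index]
    if not call.side_effects:
        return True               # no side effects: safely removed
    if call.no > target_no:
        return True               # after the call of interest
    if call.no == target_no:
        return False              # the call of interest itself
    if not call.is_draw:
        return False              # state-setting calls are cheap: retain
    if call.fbo_bound:
        return False              # result may be used later: retain
    # Draw without an FBO bound: removable only if a complete clear or
    # swapbuffers happens before the call of interest.
    for later in calls[index + 1:]:
        if later.no >= target_no:
            break
        if later.is_full_clear:
            return True
    return False


def trim(calls, target_no):
    """Return the calls that survive trimming around target_no."""
    return [c for i, c in enumerate(calls)
            if not can_trim(calls, i, target_no)]


example = [
    Call(0, "glXMakeCurrent"),
    Call(1, "glGetError", side_effects=False),
    Call(2, "glDrawArrays", is_draw=True),
    Call(3, "glClear", is_full_clear=True),
    Call(4, "glBindTexture"),
    Call(5, "glDrawArrays", is_draw=True),   # call of interest
    Call(6, "glXSwapBuffers", is_full_clear=True),
]
kept = [c.no for c in trim(example, target_no=5)]
print(kept)
```

In this toy trace the first draw call is dropped because a full clear
follows it, while glXMakeCurrent and the state-setting calls before
the call of interest are all retained.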

>  * In my case, I trimmed out all state manipulations except for the
>    handful that were actually necessary once I had my trace trimmed
>    down to one drawing operation of interest. In many scenarios, that's
>    not necessary at all, (extra state setting is unlikely to complicate
>    debugging in many situations).
>
>  * For the manual, iterative trimming I did, it would be nice to have
>    the tool detect a bit more structure from the trace. The frame
>    identification in qapitrace definitely helped, (and I plan to
>    abstract that code to add "apitrace trim --frames" soon). Beyond
>    that, it would be very convenient to automatically detect the
>    structure of things such as nested glPushMatrix/glPopMatrix
>    calls. Those calls were among the most problematic in the
>    range-based bisecting I was attempting.

I'm just about to commit the call flag branch -- a starting point to
have more classification of calls.
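
As an illustration of the kind of structure detection discussed in
the quoted text, here is a minimal Python sketch that pairs up nested
glPushMatrix/glPopMatrix calls and checks whether a candidate call
range would cut a pair in half. The function names and the plain list
of call names are assumptions made for the example; this is not how
apitrace represents a trace.

```python
def matched_pairs(names):
    """Return sorted (push_index, pop_index) pairs for nested calls."""
    stack = []
    pairs = []
    for i, name in enumerate(names):
        if name == "glPushMatrix":
            stack.append(i)
        elif name == "glPopMatrix" and stack:
            pairs.append((stack.pop(), i))
    return sorted(pairs)


def splits_pair(pairs, first, last):
    """True if the range first..last keeps one half of a pair only."""
    def inside(i):
        return first <= i <= last
    return any(inside(a) != inside(b) for a, b in pairs)


names = [
    "glPushMatrix",   # 0
    "glTranslatef",   # 1
    "glPushMatrix",   # 2
    "glDrawArrays",   # 3
    "glPopMatrix",    # 4
    "glPopMatrix",    # 5
]
pairs = matched_pairs(names)
print(pairs)
print(splits_pair(pairs, 2, 3), splits_pair(pairs, 2, 4))
```

A range-based bisection could use a check like splits_pair to skip
candidate ranges that would leave an unbalanced push or pop behind.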

>    Long-term, it might be nice to have some hierarchy inside qapitrace
>    with the ability to enable/disable pieces of the hierarchy in a live
>    preview. That should make it fairly easy to drill down to find a
>    problematic drawing operation, (like firebug for the graphics
>    stack).

One thing I'd like to do eventually is to tag handles (such as texture
names) with a unique id (for the whole trace), which can be used both
for cross-reference checking and to solve the issue that names are not
reproducible.
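
A rough Python sketch of the handle-tagging idea, under the assumption
that ids are handed out sequentially as handles are created while
walking the trace. HandleTagger and its methods are hypothetical names
for the example, not an existing apitrace interface.

```python
import itertools


class HandleTagger:
    """Maps (kind, name) handles to ids unique across the whole trace."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._live = {}  # (kind, name) -> unique id of the live handle

    def create(self, kind, name):
        # A name reused after deletion gets a fresh id.
        uid = next(self._next_id)
        self._live[(kind, name)] = uid
        return uid

    def lookup(self, kind, name):
        # None means the call references a handle that is not live,
        # which is what a cross-reference check would flag.
        return self._live.get((kind, name))

    def delete(self, kind, name):
        return self._live.pop((kind, name), None)


tagger = HandleTagger()
first = tagger.create("texture", 1)
tagger.delete("texture", 1)
second = tagger.create("texture", 1)  # same GL name, different object
print(first, second)
```

Because the unique id does not depend on the numeric name the driver
happened to return, two uses of texture name 1 that refer to different
objects stay distinguishable.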

Jose