Apitrace based frame retracer
Mark Janes
mark.a.janes at intel.com
Tue Jun 2 12:05:44 PDT 2015
Chris Holmes <j.chris.holmes at gmail.com> writes:
> (Sorry - snipped out most of the conversation here, so consider this a
> fork of the thread)
>
>>> 2) The concept of frame is only well-defined on single OpenGL contexts /
>>> single-threaded traces. If you have multiple contexts/threads, frames are
>>> ill-defined, and it's hard if not impossible to find a range of calls where
>>> it's safe to loop.
>>
>> Could this case be handled by allowing the user to select which context
>> they want to debug?
>
> Isn't this issue already handled by the run single-threaded option?
> Even given multiple contexts, I thought the tracer captured timestamps
> for each command. Given that the frame boundaries are on swapbuffers
> calls - it shouldn't be particularly difficult to assume that each
> context runs in parallel and order/block execution based on the
> individual context's swapbuffer call timestamp. That being said - if
> an app is truly doing synchronization between multiple contexts - then
> the single threaded run option should still generate the correct
> output - just potentially more slowly. Even then though, the parallel
> option could be expanded by doing some detection of shared resources
> between the contexts.
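
To make the per-context frame boundaries concrete, here is a minimal
sketch of splitting an interleaved call stream into frames per context
at each swap-buffers call. The record shape (call number, context id,
function name) and the context names are assumptions made for
illustration; this is not apitrace's trace format or API.

```python
from collections import defaultdict

# Swap calls treated as frame boundaries (window-system specific names).
SWAP_CALLS = ("glXSwapBuffers", "eglSwapBuffers", "wglSwapBuffers")


def split_frames_by_context(calls):
    """Group calls into frames per context, closing a frame at each swap.

    `calls` is a hypothetical list of (call_number, context_id, name)
    tuples in trace order.
    """
    frames = defaultdict(list)   # context_id -> list of completed frames
    current = defaultdict(list)  # context_id -> frame still being built
    for call_no, ctx, name in calls:
        current[ctx].append((call_no, name))
        if name in SWAP_CALLS:
            frames[ctx].append(current[ctx])
            current[ctx] = []
    return dict(frames)


# Two interleaved contexts: a main renderer and an overlay.
trace = [
    (1, "main", "glClear"),
    (2, "overlay", "glClear"),
    (3, "main", "glDrawArrays"),
    (4, "main", "glXSwapBuffers"),
    (5, "overlay", "glDrawArrays"),
    (6, "main", "glClear"),
    (7, "overlay", "glXSwapBuffers"),
    (8, "main", "glXSwapBuffers"),
]
frames = split_frames_by_context(trace)
```

Each context ends up with its own frame list, so a user could pick one
context and loop over one of its frames. Resources shared between
contexts are not detected here.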
>
>> The multi-context games that I've seen generally have a second context
>> to overlay ads or other content which is not the core workload. The
>> other example is ChromeOS, which IIRC renders each tab in a separate
>> context. My hope is that the complex GL workloads on Chrome are
>> benchmarks that can be easily captured/analyzed/optimized on a more
>> accessible and typical platform.
>>
>> Is there another category of multi-context optimization that you think
>> is important?
>>
>>> 4) There are many applications which don't have regular frames (e.g., OpenGL
>>> accelerated UI toolkits, web browsers using GL for composition); they only
>>> render new frames in response to user events, so every frame might end up
>>> being different.
>>
>> I'm not sure how helpful a frame analysis tool would be for these
>> cases. A system trace that includes both UI inputs and the resulting GPU
>> events would be more helpful in identifying latency.
>
> Why wouldn't it be helpful? The nice thing about retracing a frame is that it
> throws out the idle time between frames, i.e. time not spent doing rendering.

The features I'd like to implement support optimization for cases where
the workload is GPU bound. If the rendering is not the bottleneck,
then a tracer or profiler can be used to figure out what needs to be
optimized on the CPU side. Linux has a decent set of tools for CPU
optimization.

For a game developer faced with a GPU bound app, it can be difficult to
know which render (out of hundreds or thousands) is consuming the GPU.
Even if you can identify problematic draw calls, experimentation is
required to understand if shaders need to be improved, geometry can be
reduced, textures are too large, or if there is just something in the
state which generates a lot of work but doesn't improve the rendered
image.

Driver developers are often in the position where a benchmark performs
poorly. They need to narrow down the benchmark and examine the shaders
and state of the slowest renders.

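
As a rough illustration of narrowing a frame down to its slowest
renders, the sketch below ranks renders by GPU time and reports each
one's share of the frame. The (render id, GPU nanoseconds) pairs are
made-up sample data, and the function name is hypothetical; how the
timings are collected is outside the scope of this sketch.

```python
def slowest_renders(timings, top=3):
    """Return the `top` most expensive renders as (id, gpu_ns, share).

    `timings` is a list of (render_id, gpu_time_ns) pairs for one frame;
    `share` is the fraction of the frame's total GPU time.
    """
    total = sum(gpu_ns for _, gpu_ns in timings)
    if total == 0:
        return []
    ranked = sorted(timings, key=lambda r: r[1], reverse=True)[:top]
    return [(rid, gpu_ns, gpu_ns / total) for rid, gpu_ns in ranked]


# Made-up timings for a five-render frame.
frame_timings = [(0, 120000), (1, 40000), (2, 900000), (3, 60000), (4, 380000)]
worst = slowest_renders(frame_timings, top=2)
for rid, gpu_ns, share in worst:
    print(f"render {rid}: {gpu_ns} ns ({share:.0%} of frame)")
```

With a ranking like this in hand, the shaders and state of the top
entries are the natural place to start experimenting.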
>>> 5) Even "regular" applications might have "irregular" frames -- e.g.,
>>> maybe the player of a first-person shooter entered a new area, and new
>>> objects were created, old ones deleted -- whereby replaying that frame in
>>> a loop will lead to corrupted/invalid state.
>>
>> I was trying to think about these cases as well.
>
> There is no such thing as a regular frame in anything except a truly
> derivative graphics application.
> http://www.hardocp.com/image.html?image=MTQzMzEyMDM2NE1BdTlPTUdLMTVfMl8xX2wuanBn
> Any truly intensive app will have such significant frame to frame
> variation based on the data that the idea of a "regular" frame is
> impossible to define.
>
>> * new objects: Retracing this frame would result in apitrace constantly
>> re-creating these objects, correct? This would constitute a resource
>> leak during retrace, but the frame would still render correctly. It
>> seems feasible to track resource generation during retrace.
>>
>> * deletion of old resources: This would constitute a double-deletion on
>> retrace, which in many cases would be ignored or generate a GL error.
>> It would be curious for an app to use then delete a resource in a
>> single frame. Retrace could skip deletions in the target frame if it
>> is a problem.
>>
>> I think it is more common for apps to create all the resources in
>> advance, then bind them as needed directly before rendering.
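
The two bullet points above could be sketched roughly as follows: when
preparing a frame to be replayed in a loop, skip the deletion calls and
record the creation calls so the objects they generate can be tracked.
The (name, args) call representation and the function name are
assumptions for illustration, not apitrace internals, and the list of
creation calls is deliberately incomplete.

```python
# A partial set of GL calls that create objects. An explicit set is used
# because a "glGen" prefix test would wrongly match glGenerateMipmap.
CREATION_CALLS = {
    "glGenTextures", "glGenBuffers", "glGenFramebuffers",
    "glGenRenderbuffers", "glGenVertexArrays",
    "glCreateShader", "glCreateProgram",
}


def prepare_frame_for_loop(frame_calls):
    """Return (calls_to_replay, creation_calls) for looping one frame.

    Deletions are dropped so a second pass does not delete twice;
    creations are kept but recorded so the caller can track what each
    pass generates.
    """
    replay, created = [], []
    for name, args in frame_calls:
        if name.startswith("glDelete"):
            continue
        if name in CREATION_CALLS:
            created.append((name, args))
        replay.append((name, args))
    return replay, created


frame = [
    ("glGenBuffers", (1,)),
    ("glBindBuffer", ("GL_ARRAY_BUFFER", 1)),
    ("glGenerateMipmap", ("GL_TEXTURE_2D",)),
    ("glDrawArrays", ("GL_TRIANGLES", 0, 3)),
    ("glDeleteTextures", (1, [7])),
]
replay, created = prepare_frame_for_loop(frame)
```

Here the texture deletion is skipped and only the buffer generation is
recorded as a creation; whether skipping deletions is safe depends on
the application.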
>
> Many resources - yes. Geometry and textures, yes. But there is plenty of
> per-frame generated data - Per object transforms are likely CPU
> generated per-frame.
>
>>> In short, the way I see it, this "frame retracing" idea can be indeed speed
>>> up lookups for apitrace users in many circumstances, but it should be
>>> something that users can opt-in/out, so that they can still get work done,
>>> when for one reason or another the assumptions made there just don't hold.
>>>
>>> Of course, if your goal is just profiling (and not debugging bad
>>> rendering), then it should be always possible to find frames that meet the
>>> assumptions.
>
> Being able to extract frames would be extremely helpful. There's
> really no reason this wouldn't be doable, and I would love to see it.
>
>> I chose sockets to enable the remote machine use case. Performance
>> analysis is the motivating capability for my team. It is especially
>> needed on under-powered devices that would struggle running an apitrace
>> UI. Tablets and other form factors cannot be analyzed without a remote
>> connection.
>
> I think what you're asking for is GPUView for apitrace?
> http://graphics.stanford.edu/~mdfisher/GPUView.html

The tool that I'd like to emulate is GPA Frame Analyzer. Some videos
demonstrate the workflow for the tool:

DirectX:
https://www.youtube.com/watch?v=8jbPDii_QLY
https://www.youtube.com/watch?v=ZZLRNbe0HHg

OpenGL ES, for Intel-powered Android devices (start at 2:40):
https://www.youtube.com/watch?v=Py4iNuzBjgI

-Mark