<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Oct 30, 2015 at 4:16 PM, Rafael Spring <span dir="ltr"><<a href="mailto:rafael@dotproduct3d.com" target="_blank">rafael@dotproduct3d.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">Hello,<div><br></div><div>we are currently doing some performance testing of our GPU code and have found that <span>apitrace</span> (a tool we really appreciate!) contains a profiling tool that works with OpenGL.</div><div><br></div><div>However we are a bit unsure how to interpret the results. According to the Profile Results a lot of GPU time seems to be spent in draw calls of a few simple shaders that don't draw a lot of pixels, whereas the complex shaders that draw lots of pixels seem to be fairly quick. Also some of the GPU draw call times seem to vary by orders of magnitude for the same shader and operation (i.e. one call takes ~20 us and the next draw call, performing the exact same operation with the same shader but possibly to a different part of the framebuffer, takes 10000 us).</div><div>We suspect the execution times aren't always true compute times but may contain stall times. Another theory is that some of the draw call times are actually just scheduling times and the real computation time ends up counting towards another draw call timing, when a batch of operations gets executed.</div></div></blockquote><div><br></div><div>It's possible.</div><div><br></div><div>D3D11 has D3D11_QUERY_TIMESTAMP_DISJOINT to detect when timestamps are not reliable. 
OpenGL has no such thing.</div><div><br></div><div>Still, we use GL_TIME_ELAPSED for durations (not deltas of GL_TIMESTAMP values), and we emit one query for every draw call, so even if something happened, the OpenGL implementation should be giving reliable results, as the BeginQuery/Draw/EndQuery should all stay together. </div><div><br></div><div>If this were a discrete GPU, it would be possible that the variation was due to swapping things in/out of the on-chip memory, but given that this is an integrated GPU, that shouldn't happen.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>Any insight into how to interpret the Profile Results and how to derive optimization insight from it would be greatly appreciated.</div><div><br></div><div>For testing we are currently using a MS Surface Pro 3 tablet running Windows 8.1.</div><div><br></div><div>Thanks,</div><div>Rafael</div></div></blockquote><div><br></div><div>I'm afraid I don't have much advice.</div><div><br></div><div>Maybe with these tools you can figure out whether any abnormal event is causing the variation you see:</div><div><br></div><div>- GPUView</div><div> </div><div> <a href="https://graphics.stanford.edu/~mdfisher/GPUView.html">https://graphics.stanford.edu/~mdfisher/GPUView.html</a></div><div> <a href="http://make-aitee-work.blogspot.co.uk/2014/01/diagnose-opengl-performance-problems.html">http://make-aitee-work.blogspot.co.uk/2014/01/diagnose-opengl-performance-problems.html</a></div><div> <a href="https://developer.nvidia.com/content/are-you-running-out-video-memory-detecting-video-memory-overcommitment-using-gpuview">https://developer.nvidia.com/content/are-you-running-out-video-memory-detecting-video-memory-overcommitment-using-gpuview</a></div><div><br></div><div>- <a href="https://software.intel.com/en-us/gpa">https://software.intel.com/en-us/gpa</a> 
</div></div><br></div><div class="gmail_extra">Jose</div></div>