<html>
    <head>
      <base href="https://bugs.freedesktop.org/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - _drm_intel_gem_bo_references() function takes half the CPU with Witcher2 game"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=86969">86969</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>_drm_intel_gem_bo_references() function takes half the CPU with Witcher2 game
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DRI
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>Other
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Keywords</th>
          <td>have-backtrace
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>libdrm
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>dri-devel@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>eero.t.tamminen@intel.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>currojerez@riseup.net
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=110404" name="attach_110404" title="hack/test for alternate drm_intel_gem_bo_references() semantics">attachment 110404</a></span>
hack/test for alternate drm_intel_gem_bo_references() semantics

Setup:

- HSW GT3e in desktop case
- Ubuntu 14.10 64-bit (kernel 3.16, Xorg 1.16)
- Latest libdrm & Mesa 32-bit builds (2014-11-07)
- Witcher2 game from Steam (32-bit)

Steps:

- Start Witcher2 with latest Mesa
- Select FullHD resolution and highest generic gfx option, then disable
anti-aliasing & ubersampling from the advanced options
- Select "Arena" option from the main menu
- After animation stops, click through "discussion" and pan around with mouse

Results:

- When panning around, some orientations show 100% (single) CPU utilization.
- "perf" reports that (nearly) *half* of the CPU time is spent in the (very
small & recursive) libdrm "_drm_intel_gem_bo_references" function.


Analysis:

The only caller of "_drm_intel_gem_bo_references" is the exported
"drm_intel_gem_bo_references" function.  Tracing the calls to it reveals that
nearly all of them come from the Mesa gen6_check_query() function. [1]

Removing the libdrm _drm_intel_gem_bo_references() CPU bottleneck by flushing
unconditionally in gen6_check_query() removed most of the CPU consumption,
which confirms the "perf" finding. However, those extra flushes made
performance marginally worse.


Printing statistics on relocation counts showed that for Witcher2, the largest
relocation count in _drm_intel_gem_bo_references() was 590, but ~97% of the
calls had a zero relocation count.

Another test was changing the semantics of "drm_intel_gem_bo_references".  This
also removed most of the Witcher2 CPU consumption, potentially with a speed
improvement.  *On the test machine*, Witcher2 isn't CPU bound despite ~100% CPU
load, so CPU usage doesn't directly affect its speed.  *However*, on a
temperature-limited machine (e.g. a laptop with GT3), this could have a clear
performance impact, as the lowered CPU consumption may allow the GPU to run at
a higher clock speed. At the very least, power usage should be affected.

Attached is a patch/hack (by Francisco Jerez) for testing this.


Conclusion:

There could be two separate functions with slightly different semantics: one
that is fast, does something similar to what Francisco proposed, and can be
used by (Mesa) functions that don't need more accurate information, and the
current libdrm "_drm_intel_gem_bo_references" function for those that do
need it.
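
As a rough illustration of the two semantics, here is a minimal C sketch on a
toy buffer model.  Everything in it (struct toy_bo, its fields and both
function names) is hypothetical and is not libdrm code, and the fast variant is
only one possible cheaper semantic, not necessarily what the attached patch
does.  The exact variant walks the relocation tree recursively; the fast
variant answers in constant time from a flag, assumed here to be set whenever
any buffer relocates to the target, and may report a reference that the exact
walk would not find.

```c
/* Toy model only: these types and names are hypothetical, not libdrm's. */
struct toy_bo {
    int is_reloc_target;            /* assumed set when any buffer relocates to this one */
    int reloc_count;                /* number of entries in reloc_targets */
    struct toy_bo **reloc_targets;  /* buffers this one relocates to */
};

/* Exact answer: walk the relocation tree under bo, depth first. */
static int toy_bo_references_exact(const struct toy_bo *bo,
                                   const struct toy_bo *target)
{
    int i;

    for (i = 0; i < bo->reloc_count; i++) {
        const struct toy_bo *child = bo->reloc_targets[i];

        if (child == target)
            return 1;
        if (child == bo)
            continue;               /* skip self-relocations */
        if (toy_bo_references_exact(child, target))
            return 1;
    }
    return 0;
}

/* Conservative answer in constant time: may report a reference that is
 * not there, but never misses one that the exact walk would find, as
 * long as is_reloc_target is kept up to date. */
static int toy_bo_references_fast(const struct toy_bo *bo,
                                  const struct toy_bo *target)
{
    if (bo->reloc_count == 0)
        return 0;
    return target->is_reloc_target;
}
```

With the fast variant, a caller such as a query check would sometimes flush
when it did not strictly need to, but it would never skip a required flush,
and it would avoid the recursive walk.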


---

[1] In addition to resource usage tracing, functracer can attach to a running
process and track calls to a specified (exported) function:
  <a href="https://maemo.gitorious.org/maemo-tools/functracer">https://maemo.gitorious.org/maemo-tools/functracer</a>

According to it, the callers were:

194154 calls (for the trace period):
0xf601e722 drm_intel_bo_references() at intel_bufmgr.c:298
0xf63ca55c gen6_check_query() at gen6_queryobj.c:329
0xf6144e8d _mesa_GetQueryObjectiv() at queryobj.c:620

1133 calls:
0xf601e722 drm_intel_bo_references() at intel_bufmgr.c:298
0xf63ca34c gen6_queryobj_get_results() at gen6_queryobj.c:128
0xf63ca583 gen6_check_query() at gen6_queryobj.c:333
0xf6144e8d _mesa_GetQueryObjectiv() at queryobj.c:620

62 calls:
0xf601e722 drm_intel_bo_references() at intel_bufmgr.c:298
0xf62fe973 brw_map_buffer_range() at intel_buffer_objects.c:390
0xf60754b6 _mesa_MapBufferRange() at bufferobj.c:2178</pre>
        </div>
      </p>
    </body>
</html>