<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - PBO unpacking is not accelerated"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=111043">111043</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>PBO unpacking is not accelerated
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>Mesa
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>git
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Mesa core
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>mesa-dev@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>malcolmjestadt@gmail.com
          </td>
        </tr>

        <tr>
          <th>QA Contact</th>
          <td>mesa-dev@lists.freedesktop.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>While investigating performance bottlenecks in RPCS3 on radeonsi, I came
across a scene that ran at only 1 FPS, with 99% of the CPU time spent in the
driver. Further investigation showed that using the GL_STREAM_COPY usage hint
instead of GL_STATIC_COPY raised performance to 11 FPS.

This prompted us to look into Mesa's code for an explanation, since the
operation here should be moving data from GPU memory to GPU memory, and
shouldn't be any faster with GL_STREAM_COPY.

We came across this commit:
<a href="https://gitlab.freedesktop.org/mesa/mesa/commit/a338dc01866ce50bf7555ee8dc08491c7f63b585">https://gitlab.freedesktop.org/mesa/mesa/commit/a338dc01866ce50bf7555ee8dc08491c7f63b585</a>
which explains why GL_STREAM_COPY was faster.

The point is that PBO unpacking needs to be accelerated for this to get any
faster. Even with the GL_STREAM_COPY usage hint, about 90% of the time in the
graphics thread is spent in a single function in the driver.

Thanks.</pre>
        </div>
      </p>


    </body>
</html>