[Mesa-dev] [PATCH 0/7] st/mesa: glReadPixels cache

Nicolai Hähnle nhaehnle at gmail.com
Fri Jun 17 15:49:36 UTC 2016


On 15.06.2016 17:16, Brian Paul wrote:
> On 06/15/2016 02:38 AM, Nicolai Hähnle wrote:
>> Hi,
>>
>> some applications use successive calls to glReadPixels to read data back.
>> This typically involves a GPU-based blit for each call for de-tiling or
>> format conversions (e.g. BGRA -> RGBA). Even when the _mesa_readpixels path
>> is used, such a blit tends to be hidden behind the transfer operations. The
>> overhead is rather bad for performance, since we have to wait for GPU idle
>> each time.
>>
>> This patch series implements a cache which heuristically does a blit of the
>> entire framebuffer into a temporary texture once, which is then re-used by
>> immediately following calls to glReadPixels. The cache remains disabled for
>> drivers that do not prefer blit based texture transfers, i.e.
>> softpipe/llvmpipe.
>>
>> Aside from a client's application, this also affects ~1400 piglit tests,
>> which tend to see speedups of 5-10% with this cache in my tests.
>>
>> While looking for places that invalidate the cache, I also noticed a few
>> additional spots where I believe the bitmap cache needs to be flushed. I put
>> those first in this series.
>>
>> Please review!
>
> Patches 1-3 look OK to me.  Though, our bitmap cache isn't really
> conformant anyway when texturing is enabled for glBitmap.
>
> I'm worried that this optimization will negatively impact
> llvmpipe/softpipe.  I really don't want llvmpipe piglit runs to be any
> slower.  And are there cases where apps might be slower with the cache
> and HW drivers?
>
> Thoughts?

It's disabled entirely for llvmpipe/softpipe, as those will go directly
to _mesa_readpixels, bypassing the cache logic.
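
To make the bypass concrete, here is a minimal sketch in C. It is a simplified model, not the code from the series: `driver_caps`, `readpix_cache`, `read_pixels` and `invalidate` are hypothetical names, and the model blits on the first read instead of applying the heuristic trigger the series uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver capability query; not Mesa's API. */
struct driver_caps {
   bool prefers_blit_based_transfers; /* false for softpipe/llvmpipe */
};

/* Hypothetical cache state: which surface the staging copy came from. */
struct readpix_cache {
   const void *src;  /* source surface of the staging copy, or NULL */
   unsigned blits;   /* number of full-framebuffer blits issued so far */
};

/* Model of one glReadPixels call.
 * Returns true if the read was served from an existing staging copy. */
static bool
read_pixels(const struct driver_caps *caps, struct readpix_cache *cache,
            const void *surface)
{
   assert(caps && cache && surface);

   /* Software drivers take the direct path and never touch the cache. */
   if (!caps->prefers_blit_based_transfers)
      return false;

   /* An immediately following read of the same surface reuses the copy. */
   if (cache->src == surface)
      return true;

   /* Otherwise blit the whole framebuffer once and remember the source. */
   cache->src = surface;
   cache->blits++;
   return false;
}

/* Anything that writes to the framebuffer must drop the staging copy. */
static void
invalidate(struct readpix_cache *cache)
{
   cache->src = NULL;
}
```

In this model, two back-to-back reads on a hardware driver cost one blit, a draw in between (modelled by `invalidate`) costs a second blit, and a software driver never increments `blits`.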

For HW drivers, there are two scenarios where this could slow things down:

(1) transfer_map has zero overhead and reading from the mapped memory is
as fast as reading from a staging texture. I don't know if this applies
to any HW driver - Ilia says that Nouveau maps directly, but then the CPU
reads straight from VRAM, so the cache is likely to help there as well.

(2) An application manages to trigger the cache by chance once even 
though it usually only reads a very small fraction of the framebuffer.

If point (1) is an issue, we could perhaps add a PIPE_CAP_ hint, similar
to the hint that already exists for softpipe/llvmpipe.

I don't really know what to do about point (2).
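
To put a rough number on point (2), the following sketch counts the pixels that a full-framebuffer blit copies but the application never reads. It only illustrates the concern and is not part of the series; `wasted_pixels` is a hypothetical helper, and it assumes a single small read after the cache has been triggered.

```c
#include <assert.h>
#include <stdint.h>

/* Pixels copied by a full-framebuffer blit that a single read of
 * read_w x read_h never touches. Hypothetical helper for illustration. */
static uint64_t
wasted_pixels(uint32_t fb_w, uint32_t fb_h, uint32_t read_w, uint32_t read_h)
{
   /* The read rectangle must fit inside the framebuffer. */
   assert(read_w <= fb_w && read_h <= fb_h);

   /* Widen before multiplying so large framebuffers cannot overflow. */
   return (uint64_t)fb_w * fb_h - (uint64_t)read_w * read_h;
}
```

For a 1920x1080 framebuffer and a 16x16 read, this gives 2073344 of 2073600 pixels copied for nothing, which is the bad case described in (2).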

Nicolai

>
> -Brian
>
>
>> Thanks,
>> Nicolai
>> --
>>   src/mesa/state_tracker/st_atom_framebuffer.c |   1 +
>>   src/mesa/state_tracker/st_cb_bitmap.c        |   3 +
>>   src/mesa/state_tracker/st_cb_blit.c          |   1 +
>>   src/mesa/state_tracker/st_cb_clear.c         |   1 +
>>   src/mesa/state_tracker/st_cb_compute.c       |   4 +
>>   src/mesa/state_tracker/st_cb_copyimage.c     |   4 +
>>   src/mesa/state_tracker/st_cb_drawpixels.c    |   2 +
>>   src/mesa/state_tracker/st_cb_drawtex.c       |   1 +
>>   src/mesa/state_tracker/st_cb_fbo.h           |   2 +
>>   src/mesa/state_tracker/st_cb_readpixels.c    | 244 +++++++++++++----
>>   src/mesa/state_tracker/st_cb_texture.c       |  12 +
>>   src/mesa/state_tracker/st_context.c          |   3 +
>>   src/mesa/state_tracker/st_context.h          |  11 +
>>   src/mesa/state_tracker/st_draw.c             |   1 +
>>   src/mesa/state_tracker/st_draw_feedback.c    |   1 +
>>   src/mesa/state_tracker/st_gen_mipmap.c       |   4 +
>>   16 files changed, 237 insertions(+), 58 deletions(-)