[Mesa-dev] [PATCH 7/7] st/mesa: cache staging texture for glReadPixels

Ilia Mirkin imirkin at alum.mit.edu
Wed Jun 15 15:20:55 UTC 2016


On Wed, Jun 15, 2016 at 4:38 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
>
> ---
>  src/mesa/state_tracker/st_cb_fbo.h        |   2 +
>  src/mesa/state_tracker/st_cb_readpixels.c | 122 ++++++++++++++++++++++++++----
>  2 files changed, 110 insertions(+), 14 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_cb_fbo.h b/src/mesa/state_tracker/st_cb_fbo.h
> index f3b310b..414a661 100644
> --- a/src/mesa/state_tracker/st_cb_fbo.h
> +++ b/src/mesa/state_tracker/st_cb_fbo.h
> @@ -58,6 +58,8 @@ struct st_renderbuffer
>     boolean software;
>     void *data;
>
> +   bool use_readpix_cache;
> +
>     /* Inputs from Driver.RenderTexture, don't use directly. */
>     boolean is_rtt; /**< whether Driver.RenderTexture was called */
>     unsigned rtt_face, rtt_slice;
> diff --git a/src/mesa/state_tracker/st_cb_readpixels.c b/src/mesa/state_tracker/st_cb_readpixels.c
> index a501b7b..42d147f 100644
> --- a/src/mesa/state_tracker/st_cb_readpixels.c
> +++ b/src/mesa/state_tracker/st_cb_readpixels.c
> @@ -46,6 +46,28 @@
>  #include "state_tracker/st_pbo.h"
>  #include "state_tracker/st_texture.h"
>
> +/* The readpixels cache caches a blitted staging texture so that back-to-back
> + * calls to glReadPixels with user pointers require less CPU-GPU synchronization.
> + *
> + * Assumptions:
> + *
> + * (1) Blits have high synchronization overheads, and it is beneficial to
> + *     use a single blit of the entire framebuffer instead of many smaller
> + *     blits (because the smaller blits cannot be batched, and we have to wait
> + *     for the GPU after each one).
> + *
> + * (2) transfer_map implicitly involves a blit as well (for de-tiling, copy
> + *     from VRAM, etc.), so that it is beneficial to replace the
> + *     _mesa_readpixels path as well when possible.

FWIW this second point isn't the case for nouveau. We map the texture
directly under many conditions. However that's still slow since VRAM
is "far away", and, without having done the tests, this seems like it
would improve the situation for nouveau as well.

BTW, what are some good tests to play with for testing this out?

> + *
> + */
> +#define USE_READPIXELS_CACHE true

Maybe add it to ST_DEBUG instead (so it can be disabled)? Your call.

  -ilia


More information about the mesa-dev mailing list