[Mesa-dev] [PATCH 4/4] i965: Implemente a tiled fast-path for glReadPixels and glGetTexImage

Chad Versace chad.versace at intel.com
Tue Jan 13 10:44:44 PST 2015


On 01/12/2015 10:22 AM, Jason Ekstrand wrote:
> From: Sisinty Sasmita Patra <sisinty.patra at intel.com>
> 
> Added intel_readpixels_tiled_mempcpy and intel_gettexsubimage_tiled_mempcpy
> functions. These are the fast paths for glReadPixels and glGetTexImage.
> 
> On chrome, using the RoboHornet 2D Canvas toDataURL test, this patch cuts
> amount of time spent in glReadPixels by more than half and reduces the time
> of the entire test by 10%.
> 
> v2: Jason Ekstrand <jason.ekstrand at intel.com>
>    - Refactor to make the functions look more like the old
>      intel_tex_subimage_tiled_memcpy
>    - Don't export the readpixels_tiled_memcpy function
>    - Fix some pointer arithmatic bugs in partial image downloads (using
>      ReadPixels with a non-zero x or y offset)
>    - Fix a bug when ReadPixels is performed on an FBO wrapping a texture
>      miplevel other than zero.
> 
> v3: Jason Ekstrand <jason.ekstrand at intel.com>
>    - Better documentation fot the *_tiled_memcpy functions
>    - Add target restrictions for renderbuffers wrapping textures
> 
> v4: Jason Ekstrand <jason.ekstrand at intel.com>
>    - Only check the return value of brw_bo_map for error and not bo->virtual
> 
> Signed-off-by: Jason Ekstrand <jason.ekstrand at intel.com>
> ---
>  src/mesa/drivers/dri/i965/intel_pixel_read.c | 142 ++++++++++++++++++++++++++-
>  src/mesa/drivers/dri/i965/intel_tex.h        |   9 ++
>  src/mesa/drivers/dri/i965/intel_tex_image.c  | 139 +++++++++++++++++++++++++-
>  3 files changed, 287 insertions(+), 3 deletions(-)

 
> +/**
> + * \brief A fast path for glReadPixels
> + *
> + * This fast path is taken when the source format is BGRA, RGBA,
> + * A or L and when the texture memory is X- or Y-tiled.  It downloads
> + * the source data by directly mapping the memory without a GTT fence.
> + * This then needs to be de-tiled on the CPU before presenting the data to
> + * the user in the linear fasion.
> + *
> + * This is a performance win over the conventional texture download path.
> + * In the conventional texture download path, the texture is either mapped
> + * through the GTT or copied to a linear buffer with the blitter before
> + * handing off to a software path.  This allows us to avoid round-tripping
> + * through the GPU (in the case where we would be blitting) and do only a
> + * single copy operation.
> + */
> +static bool
> +intel_readpixels_tiled_memcpy(struct gl_context * ctx,
> +                              GLint xoffset, GLint yoffset,
> +                              GLsizei width, GLsizei height,
> +                              GLenum format, GLenum type,
> +                              GLvoid * pixels,
> +                              const struct gl_pixelstore_attrib *pack)
> +{

[snip]

> +
> +   if (!intel_get_memcpy(rb->Format, format, type, &mem_copy, &cpp))
> +      return false;

The use of intel_get_memcpy here is surprising and relies on a lucky
coincidence, if I understand this function correctly (which maybe I don't).
Whether the driver is copying data *to* of *from* the miptree,
intel_get_memcpy is given the same parameter values. In other words, the
'tiledFormat' and 'format' parameters of intel_get_memcpy are symmetric.
Please add a comment somewhere (maybe in intel_get_memcpy's Doxygen, maybe
somewhere else is better) that documents that symmetry.



> +/**
> + * \brief A fast path for glGetTexImage.
> + *
> + * This fast path is taken when the source format is BGRA, RGBA,
> + * A or L and when the texture memory is X- or Y-tiled.  It downloads
> + * the source data by directly mapping the memory without a GTT fence.
> + * This then needs to be de-tiled on the CPU before presenting the data to
> + * the user in the linear fasion.
> + *
> + * This is a performance win over the conventional texture download path.
> + * In the conventional texture download path, the texture is either mapped
> + * through the GTT or copied to a linear buffer with the blitter before
> + * handing off to a software path.  This allows us to avoid round-tripping
> + * through the GPU (in the case where we would be blitting) and do only a
> + * single copy operation.
> + */

Rather than duplicating such a length comment block, I think it's better
to refer back to the original block with \see intel_readpixels_tiled_memcpy().

> +bool
> +intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
> +                                  struct gl_texture_image *texImage,
> +                                  GLint xoffset, GLint yoffset,
> +                                  GLsizei width, GLsizei height,
> +                                  GLenum format, GLenum type,
> +                                  GLvoid *pixels,
> +                                  const struct gl_pixelstore_attrib *packing)

The signatures and bodies of intel_gettexsubimage_tiled_memcpy and
intel_readpixels_tiled_memcpy are nearly identical, yet the function
bodies are not trivial. The two functions should really share their
common code. As far as I can tell, the only differences between the
two function bodies are:
  - intel_readpixels_tiled_memcpy pulls the miptree out of ctx->_ColorReadBuffer
  - intel_gettexsubimage_tiled_memcpy pulls the miptree out of a texImage
  - intel_gettexsubimage_tiled_memcpy does a few sanity checks on that texImage

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 884 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150113/5a586b2a/attachment.sig>


More information about the mesa-dev mailing list