[Mesa-dev] [PATCH 4/4] i965: Implemente a tiled fast-path for glReadPixels and glGetTexImage
Jason Ekstrand
jason at jlekstrand.net
Mon Jan 5 14:42:56 PST 2015
On Sun, Jan 4, 2015 at 1:07 PM, Ben Widawsky <ben at bwidawsk.net> wrote:
> I just did a very cursory review. I assume someone smarter than me will do
> a
> real review, but if not, feel free to ping me.
>
> I think all the comments apply to both functions.
>
> On Sat, Jan 03, 2015 at 11:54:15AM -0800, Jason Ekstrand wrote:
> > From: Sisinty Sasmita Patra <sisinty.patra at intel.com>
> >
> > Added intel_readpixels_tiled_mempcpy and
> intel_gettexsubimage_tiled_mempcpy
> > functions. These are the fast paths for glReadPixels and glGetTexImage.
> >
> > v2: Jason Ekstrand <jason.ekstrand at intel.com>
> > - Refactor to make the functions look more like the old
> > intel_tex_subimage_tiled_memcpy
> > - Don't export the readpixels_tiled_memcpy function
> > - Fix some pointer arithmatic bugs in partial image downloads (using
> > ReadPixels with a non-zero x or y offset)
> > - Fix a bug when ReadPixels is performed on an FBO wrapping a texture
> > miplevel other than zero.
> >
> > Signed-off-by: Jason Ekstrand <jason.ekstrand at intel.com>
> > ---
> > src/mesa/drivers/dri/i965/intel_pixel_read.c | 134
> +++++++++++++++++++++++++-
> > src/mesa/drivers/dri/i965/intel_tex.h | 9 ++
> > src/mesa/drivers/dri/i965/intel_tex_image.c | 137
> ++++++++++++++++++++++++++-
> > 3 files changed, 277 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c
> b/src/mesa/drivers/dri/i965/intel_pixel_read.c
> > index beb3152..d1e7798 100644
> > --- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
> > +++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
> > @@ -38,14 +38,16 @@
> >
> > #include "brw_context.h"
> > #include "intel_screen.h"
> > +#include "intel_batchbuffer.h"
> > #include "intel_blit.h"
> > #include "intel_buffers.h"
> > #include "intel_fbo.h"
> > #include "intel_mipmap_tree.h"
> > #include "intel_pixel.h"
> > #include "intel_buffer_objects.h"
> > +#include "intel_tiled_memcpy.h"
> >
> > -#define FILE_DEBUG_FLAG DEBUG_PIXEL
> > +#define FILE_DEBUG_FLAG DEBUG_TEXTURE
> >
> > /* For many applications, the new ability to pull the source buffers
> > * back out of the GTT and then do the packing/conversion operations
> > @@ -161,17 +163,147 @@ do_blit_readpixels(struct gl_context * ctx,
> > return true;
> > }
> >
> > +/**
> > + * \brief A fast path for glReadPixels
> > + *
> > + * This fast path is taken when the source format is BGRA, RGBA,
> > + * A or L and when the texture memory is X- or Y-tiled. It downloads
> > + * the source data by mapping the memory without a GTT fence, thus
> > + * acquiring a linear view of the memory.
>
> That last sentence is confusing since you're using linear differently than
> just
> about everywhere else (though the statement is accurate).
>
> How about something like, "It maps the source data with a CPU mapping
> which then
> needs to be de-tiled [by the CPU] before presenting linear data back to the
> user."
>
Yeah, I'll get that fixed.
>
> > + *
> > + * This is a performance win over the conventional texture download
> path.
> > + * In the conventional texture download path,
> > + *
> > + */
> > +
> > +static bool
> > +intel_readpixels_tiled_memcpy(struct gl_context * ctx,
> > + GLint xoffset, GLint yoffset,
> > + GLsizei width, GLsizei height,
> > + GLenum format, GLenum type,
> > + GLvoid * pixels,
> > + const struct gl_pixelstore_attrib *pack)
> > +{
> > + struct brw_context *brw = brw_context(ctx);
> > + struct gl_renderbuffer *rb = ctx->ReadBuffer->_ColorReadBuffer;
> > +
> > + /* This path supports reading from color buffers only */
> > + if (rb == NULL)
> > + return false;
> > +
> > + struct intel_renderbuffer *irb = intel_renderbuffer(rb);
> > + int dst_pitch;
> > +
> > + /* The miptree's buffer. */
> > + drm_intel_bo *bo;
> > +
> > + int error = 0;
> > +
> > + uint32_t cpp;
> > + mem_copy_fn mem_copy = NULL;
> > +
> > + /* This fastpath is restricted to specific renderbuffer types:
> > + * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to
> support
> > + * more types.
> > + */
> > + if (!brw->has_llc ||
> > + !(type == GL_UNSIGNED_BYTE || type ==
> GL_UNSIGNED_INT_8_8_8_8_REV) ||
> > + pixels == NULL ||
> > + _mesa_is_bufferobj(pack->BufferObj) ||
> > + pack->Alignment > 4 ||
> > + pack->SkipPixels > 0 ||
> > + pack->SkipRows > 0 ||
> > + (pack->RowLength != 0 && pack->RowLength != width) ||
> > + pack->SwapBytes ||
> > + pack->LsbFirst ||
> > + pack->Invert)
> > + return false;
> > +
> > + if (!intel_get_memcpy(rb->Format, format, type, &mem_copy, &cpp))
> > + return false;
> > +
> > + if (!irb->mt ||
> > + (irb->mt->tiling != I915_TILING_X &&
> > + irb->mt->tiling != I915_TILING_Y)) {
> > + /* The algorithm is written only for X- or Y-tiled memory. */
> > + return false;
> > + }
> > +
> > + /* Since we are going to read raw data to the miptree, we need to
> resolve
> > + * any pending fast color clears before we start.
> > + */
> > + intel_miptree_resolve_color(brw, irb->mt);
> > +
> > + bo = irb->mt->bo;
> > +
> > + if (drm_intel_bo_references(brw->batch.bo, bo)) {
> > + perf_debug("Flushing before mapping a referenced bo.\n");
> > + intel_batchbuffer_flush(brw);
> > + }
>
> I don't know much about the spec, but is this even possible for
> glReadPixels?
>
Yes, it is. All you have to do is render and then readpixels without a
flush. I added that for precicely that reason. :-)
>
> > +
> > + error = brw_bo_map(brw, bo, false /* write enable */, "miptree");
> > + if (error || bo->virtual == NULL) {
>
> I realize this was probably just copied from the original code, but it
> should be
> superfluous (as is the comment about write enable). Not really sure why
> Eric
> initially added it.
>
Sure, the comment about write enable probably isn't needed, but it does
explain the boolean.
As for the other, I talked to Kristian and he said that, while unlikely, it
is still possible for brw_bo_map to fail so we should handle the error case.
> > + DBG("%s: failed to map bo\n", __FUNCTION__);
> > + return false;
> > + }
> > +
> > + dst_pitch = _mesa_image_row_stride(pack, width, format, type);
> > +
> > + /* We postponed printing this message until having committed to
> executing
> > + * the function.
> > + */
> > + DBG("%s: x,y=(%d,%d) (w,h)=(%d,%d) format=0x%x type=0x%x "
> > + "mesa_format=0x%x tiling=%d "
> > + "pack=(alignment=%d row_length=%d skip_pixels=%d
> skip_rows=%d)\n",
> > + __FUNCTION__, xoffset, yoffset, width, height,
> > + format, type, rb->Format, irb->mt->tiling,
> > + pack->Alignment, pack->RowLength, pack->SkipPixels,
> > + pack->SkipRows);
> > +
> > + /* This renderbuffer can come from a texture which may not be
> miplevel 0.
> > + * We need to adjust accordingly.
> > + */
> > + if (rb->TexImage) {
> > + int level = rb->TexImage->Level +
> rb->TexImage->TexObject->MinLevel;
> > +
> > + /* Adjust x and y offset based on miplevel */
> > + xoffset += irb->mt->level[level].level_x;
> > + yoffset += irb->mt->level[level].level_y;
> > + }
>
> Can the slice be non-zero? Should this be using
> intel_miptree_get_image_offset() instead?
>
Good catch. For now, I'm going to just implse that the texture's target is
one of GL_TEXTURE_2D or GL_TEXTURE_RECTANGLE. None of the download paths
handle arrays either, so that seems sane for now.
>
> > +
> > + tiled_to_linear(
> > + xoffset * cpp, (xoffset + width) * cpp,
> > + yoffset, yoffset + height,
> > + pixels - (ptrdiff_t) yoffset * dst_pitch - (ptrdiff_t) xoffset *
> cpp,
> > + bo->virtual,
> > + dst_pitch, irb->mt->pitch,
> > + brw->has_swizzling,
> > + irb->mt->tiling,
> > + mem_copy
> > + );
> > +
> > + drm_intel_bo_unmap(bo);
> > + return true;
> > +}
> > +
> > void
> > intelReadPixels(struct gl_context * ctx,
> > GLint x, GLint y, GLsizei width, GLsizei height,
> > GLenum format, GLenum type,
> > const struct gl_pixelstore_attrib *pack, GLvoid *
> pixels)
> > {
> > + bool ok;
> > +
> > struct brw_context *brw = brw_context(ctx);
> > bool dirty;
> >
> > DBG("%s\n", __FUNCTION__);
> >
> > + ok = intel_readpixels_tiled_memcpy(ctx, x, y, width, height,
> > + format, type, pixels, pack);
> > + if(ok)
> > + return;
> > +
> > if (_mesa_is_bufferobj(pack->BufferObj)) {
> > /* Using PBOs, so try the BLT based path. */
> > if (do_blit_readpixels(ctx, x, y, width, height, format, type,
> pack,
> > diff --git a/src/mesa/drivers/dri/i965/intel_tex.h
> b/src/mesa/drivers/dri/i965/intel_tex.h
> > index 27f7f11..f048e84 100644
> > --- a/src/mesa/drivers/dri/i965/intel_tex.h
> > +++ b/src/mesa/drivers/dri/i965/intel_tex.h
> > @@ -68,4 +68,13 @@ intel_texsubimage_tiled_memcpy(struct gl_context *ctx,
> > const struct gl_pixelstore_attrib
> *packing,
> > bool for_glTexImage);
> >
> > +bool
> > +intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
> > + struct gl_texture_image *texImage,
> > + GLint xoffset, GLint yofset,
> > + GLsizei width, GLsizei height,
> > + GLenum format, GLenum type,
> > + GLvoid *pixels,
> > + const struct gl_pixelstore_attrib
> *packing);
> > +
> > #endif
> > diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> > index 3317779..075315d 100644
> > --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> > +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> > @@ -24,7 +24,7 @@
> > #include "intel_blit.h"
> > #include "intel_fbo.h"
> > #include "intel_image.h"
> > -
> > +#include "intel_tiled_memcpy.h"
> > #include "brw_context.h"
> >
> > #define FILE_DEBUG_FLAG DEBUG_TEXTURE
> > @@ -507,12 +507,145 @@ blit_texture_to_pbo(struct gl_context *ctx,
> > return true;
> > }
> >
> > +/**
> > + * \brief A fast path for glGetTexImage.
> > + *
> > + *
> > + * This fast path is taken when the texture format is BGRA, RGBA,
> > + * A or L and when the texture memory is X- or Y-tiled. It downloads
> > + * the texture data by mapping the texture memory without a GTT fence,
> thus
> > + * acquiring a linear view of the memory.
> > + *
> > + * This is a performance win over the conventional texture download
> path.
> > + * In the conventional texture download path,
> > + *
> > + */
> > +
> > +bool
> > +intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
> > + struct gl_texture_image *texImage,
> > + GLint xoffset, GLint yoffset,
> > + GLsizei width, GLsizei height,
> > + GLenum format, GLenum type,
> > + GLvoid *pixels,
> > + const struct gl_pixelstore_attrib
> *packing)
> > +{
> > + struct brw_context *brw = brw_context(ctx);
> > + struct intel_texture_image *image = intel_texture_image(texImage);
> > + int dst_pitch;
> > +
> > + /* The miptree's buffer. */
> > + drm_intel_bo *bo;
> > +
> > + int error = 0;
> > +
> > + uint32_t cpp;
> > + mem_copy_fn mem_copy = NULL;
> > +
> > + /* This fastpath is restricted to specific texture types:
> > + * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to
> support
> > + * more types.
> > + *
> > + * FINISHME: The restrictions below on packing alignment and packing
> row
> > + * length are likely unneeded now because we calculate the
> destination stride
> > + * with _mesa_image_row_stride. However, before removing the
> restrictions
> > + * we need tests.
> > + */
> > + if (!brw->has_llc ||
> > + !(type == GL_UNSIGNED_BYTE || type ==
> GL_UNSIGNED_INT_8_8_8_8_REV) ||
> > + !(texImage->TexObject->Target == GL_TEXTURE_2D ||
> > + texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) ||
> > + pixels == NULL ||
> > + _mesa_is_bufferobj(packing->BufferObj) ||
> > + packing->Alignment > 4 ||
> > + packing->SkipPixels > 0 ||
> > + packing->SkipRows > 0 ||
> > + (packing->RowLength != 0 && packing->RowLength != width) ||
> > + packing->SwapBytes ||
> > + packing->LsbFirst ||
> > + packing->Invert)
> > + return false;
> > +
> > + if (!intel_get_memcpy(texImage->TexFormat, format, type, &mem_copy,
> &cpp))
> > + return false;
> > +
> > + /* If this is a nontrivial texture view, let another path handle it
> instead. */
> > + if (texImage->TexObject->MinLayer)
> > + return false;
> > +
> > + if (!image->mt ||
> > + (image->mt->tiling != I915_TILING_X &&
> > + image->mt->tiling != I915_TILING_Y)) {
> > + /* The algorithm is written only for X- or Y-tiled memory. */
> > + return false;
> > + }
> > +
> > + /* Since we are going to write raw data to the miptree, we need to
> resolve
> > + * any pending fast color clears before we start.
> > + */
> > + intel_miptree_resolve_color(brw, image->mt);
> > +
> > + bo = image->mt->bo;
> > +
> > + if (drm_intel_bo_references(brw->batch.bo, bo)) {
> > + perf_debug("Flushing before mapping a referenced bo.\n");
> > + intel_batchbuffer_flush(brw);
> > + }
> > +
> > + error = brw_bo_map(brw, bo, false /* write enable */, "miptree");
> > + if (error || bo->virtual == NULL) {
> > + DBG("%s: failed to map bo\n", __FUNCTION__);
> > + return false;
> > + }
> > +
> > + dst_pitch = _mesa_image_row_stride(packing, width, format, type);
> > +
> > + DBG("%s: level=%d x,y=(%d,%d) (w,h)=(%d,%d) format=0x%x type=0x%x "
> > + "mesa_format=0x%x tiling=%d "
> > + "packing=(alignment=%d row_length=%d skip_pixels=%d
> skip_rows=%d)\n",
> > + __FUNCTION__, texImage->Level, xoffset, yoffset, width, height,
> > + format, type, texImage->TexFormat, image->mt->tiling,
> > + packing->Alignment, packing->RowLength, packing->SkipPixels,
> > + packing->SkipRows);
> > +
> > + int level = texImage->Level + texImage->TexObject->MinLevel;
> > +
> > + /* Adjust x and y offset based on miplevel */
> > + xoffset += image->mt->level[level].level_x;
> > + yoffset += image->mt->level[level].level_y;
> > +
> > + tiled_to_linear(
> > + xoffset * cpp, (xoffset + width) * cpp,
> > + yoffset, yoffset + height,
> > + pixels - (ptrdiff_t) yoffset * dst_pitch - (ptrdiff_t) xoffset *
> cpp,
> > + bo->virtual,
> > + dst_pitch, image->mt->pitch,
> > + brw->has_swizzling,
> > + image->mt->tiling,
> > + mem_copy
> > + );
> > +
> > + drm_intel_bo_unmap(bo);
> > + return true;
> > +}
> > +
> > +
> > static void
> > intel_get_tex_image(struct gl_context *ctx,
> > GLenum format, GLenum type, GLvoid *pixels,
> > - struct gl_texture_image *texImage) {
> > + struct gl_texture_image *texImage)
> > +{
> > + bool ok;
> > +
> > DBG("%s\n", __FUNCTION__);
> >
> > + ok = intel_gettexsubimage_tiled_memcpy(ctx, texImage, 0, 0,
> > + texImage->Width,
> texImage->Height,
> > + format, type, pixels,
> &ctx->Pack);
> > +
> > + if(ok)
> > + return;
> > +
> > if (_mesa_is_bufferobj(ctx->Pack.BufferObj)) {
> > /* Using PBOs, so try the BLT based path. */
> > if (blit_texture_to_pbo(ctx, format, type, pixels, texImage))
> > --
> > 2.2.0
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
> --
> Ben Widawsky, Intel Open Source Technology Center
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150105/2796729f/attachment-0001.html>
More information about the mesa-dev
mailing list