<div dir="ltr"><br><div><div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jan 4, 2015 at 1:07 PM, Ben Widawsky <span dir="ltr"><<a href="mailto:ben@bwidawsk.net" target="_blank">ben@bwidawsk.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I just did a very cursory review. I assume someone smarter than me will do a<br>
real review, but if not, feel free to ping me.<br>
<br>
I think all the comments apply to both functions.<br>
<div><div><br>
On Sat, Jan 03, 2015 at 11:54:15AM -0800, Jason Ekstrand wrote:<br>
> From: Sisinty Sasmita Patra <<a href="mailto:sisinty.patra@intel.com" target="_blank">sisinty.patra@intel.com</a>><br>
><br>
> Added intel_readpixels_tiled_mempcpy and intel_gettexsubimage_tiled_mempcpy<br>
> functions. These are the fast paths for glReadPixels and glGetTexImage.<br>
><br>
> v2: Jason Ekstrand <<a href="mailto:jason.ekstrand@intel.com" target="_blank">jason.ekstrand@intel.com</a>><br>
> - Refactor to make the functions look more like the old<br>
> intel_tex_subimage_tiled_memcpy<br>
> - Don't export the readpixels_tiled_memcpy function<br>
> - Fix some pointer arithmatic bugs in partial image downloads (using<br>
> ReadPixels with a non-zero x or y offset)<br>
> - Fix a bug when ReadPixels is performed on an FBO wrapping a texture<br>
> miplevel other than zero.<br>
><br>
> Signed-off-by: Jason Ekstrand <<a href="mailto:jason.ekstrand@intel.com" target="_blank">jason.ekstrand@intel.com</a>><br>
> ---<br>
> src/mesa/drivers/dri/i965/intel_pixel_read.c | 134 +++++++++++++++++++++++++-<br>
> src/mesa/drivers/dri/i965/intel_tex.h | 9 ++<br>
> src/mesa/drivers/dri/i965/intel_tex_image.c | 137 ++++++++++++++++++++++++++-<br>
> 3 files changed, 277 insertions(+), 3 deletions(-)<br>
><br>
> diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c b/src/mesa/drivers/dri/i965/intel_pixel_read.c<br>
> index beb3152..d1e7798 100644<br>
> --- a/src/mesa/drivers/dri/i965/intel_pixel_read.c<br>
> +++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c<br>
> @@ -38,14 +38,16 @@<br>
><br>
> #include "brw_context.h"<br>
> #include "intel_screen.h"<br>
> +#include "intel_batchbuffer.h"<br>
> #include "intel_blit.h"<br>
> #include "intel_buffers.h"<br>
> #include "intel_fbo.h"<br>
> #include "intel_mipmap_tree.h"<br>
> #include "intel_pixel.h"<br>
> #include "intel_buffer_objects.h"<br>
> +#include "intel_tiled_memcpy.h"<br>
><br>
> -#define FILE_DEBUG_FLAG DEBUG_PIXEL<br>
> +#define FILE_DEBUG_FLAG DEBUG_TEXTURE<br>
><br>
> /* For many applications, the new ability to pull the source buffers<br>
> * back out of the GTT and then do the packing/conversion operations<br>
> @@ -161,17 +163,147 @@ do_blit_readpixels(struct gl_context * ctx,<br>
> return true;<br>
> }<br>
><br>
> +/**<br>
> + * \brief A fast path for glReadPixels<br>
> + *<br>
> + * This fast path is taken when the source format is BGRA, RGBA,<br>
> + * A or L and when the texture memory is X- or Y-tiled. It downloads<br>
> + * the source data by mapping the memory without a GTT fence, thus<br>
> + * acquiring a linear view of the memory.<br>
<br>
</div></div>That last sentence is confusing since you're using linear differently than just<br>
about everywhere else (though the statement is accurate).<br>
<br>
How about something like, "It maps the source data with a CPU mapping which then<br>
needs to be de-tiled [by the CPU] before presenting linear data back to the<br>
user."<br></blockquote><div><br></div><div>Yeah, I'll get that fixed.<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div><br>
> + *<br>
> + * This is a performance win over the conventional texture download path.<br>
> + * In the conventional texture download path,<br>
> + *<br>
> + */<br>
> +<br>
> +static bool<br>
> +intel_readpixels_tiled_memcpy(struct gl_context * ctx,<br>
> + GLint xoffset, GLint yoffset,<br>
> + GLsizei width, GLsizei height,<br>
> + GLenum format, GLenum type,<br>
> + GLvoid * pixels,<br>
> + const struct gl_pixelstore_attrib *pack)<br>
> +{<br>
> + struct brw_context *brw = brw_context(ctx);<br>
> + struct gl_renderbuffer *rb = ctx->ReadBuffer->_ColorReadBuffer;<br>
> +<br>
> + /* This path supports reading from color buffers only */<br>
> + if (rb == NULL)<br>
> + return false;<br>
> +<br>
> + struct intel_renderbuffer *irb = intel_renderbuffer(rb);<br>
> + int dst_pitch;<br>
> +<br>
> + /* The miptree's buffer. */<br>
> + drm_intel_bo *bo;<br>
> +<br>
> + int error = 0;<br>
> +<br>
> + uint32_t cpp;<br>
> + mem_copy_fn mem_copy = NULL;<br>
> +<br>
> + /* This fastpath is restricted to specific renderbuffer types:<br>
> + * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support<br>
> + * more types.<br>
> + */<br>
> + if (!brw->has_llc ||<br>
> + !(type == GL_UNSIGNED_BYTE || type == GL_UNSIGNED_INT_8_8_8_8_REV) ||<br>
> + pixels == NULL ||<br>
> + _mesa_is_bufferobj(pack->BufferObj) ||<br>
> + pack->Alignment > 4 ||<br>
> + pack->SkipPixels > 0 ||<br>
> + pack->SkipRows > 0 ||<br>
> + (pack->RowLength != 0 && pack->RowLength != width) ||<br>
> + pack->SwapBytes ||<br>
> + pack->LsbFirst ||<br>
> + pack->Invert)<br>
> + return false;<br>
> +<br>
> + if (!intel_get_memcpy(rb->Format, format, type, &mem_copy, &cpp))<br>
> + return false;<br>
> +<br>
> + if (!irb->mt ||<br>
> + (irb->mt->tiling != I915_TILING_X &&<br>
> + irb->mt->tiling != I915_TILING_Y)) {<br>
> + /* The algorithm is written only for X- or Y-tiled memory. */<br>
> + return false;<br>
> + }<br>
> +<br>
> + /* Since we are going to read raw data to the miptree, we need to resolve<br>
> + * any pending fast color clears before we start.<br>
> + */<br>
> + intel_miptree_resolve_color(brw, irb->mt);<br>
> +<br>
> + bo = irb->mt->bo;<br>
> +<br>
> + if (drm_intel_bo_references(brw-><a href="http://batch.bo" target="_blank">batch.bo</a>, bo)) {<br>
> + perf_debug("Flushing before mapping a referenced bo.\n");<br>
> + intel_batchbuffer_flush(brw);<br>
> + }<br>
<br>
</div></div>I don't know much about the spec, but is this even possible for glReadPixels?<br></blockquote><div><br></div><div>Yes, it is. All you have to do is render and then readpixels without a flush. I added that for precicely that reason. :-)<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span><br>
> +<br>
> + error = brw_bo_map(brw, bo, false /* write enable */, "miptree");<br>
> + if (error || bo->virtual == NULL) {<br>
<br>
</span>I realize this was probably just copied from the original code, but it should be<br>
superfluous (as is the comment about write enable). Not really sure why Eric<br>
initially added it.<br></blockquote><div><br></div><div>Sure, the comment about write enable probably isn't needed, but it does explain the boolean. <br></div><div> <br></div><div>As for the other, I talked to Kristian and he said that, while unlikely, it is still possible for brw_bo_map to fail so we should handle the error case.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div><br>
> + DBG("%s: failed to map bo\n", __FUNCTION__);<br>
> + return false;<br>
> + }<br>
> +<br>
> + dst_pitch = _mesa_image_row_stride(pack, width, format, type);<br>
> +<br>
> + /* We postponed printing this message until having committed to executing<br>
> + * the function.<br>
> + */<br>
> + DBG("%s: x,y=(%d,%d) (w,h)=(%d,%d) format=0x%x type=0x%x "<br>
> + "mesa_format=0x%x tiling=%d "<br>
> + "pack=(alignment=%d row_length=%d skip_pixels=%d skip_rows=%d)\n",<br>
> + __FUNCTION__, xoffset, yoffset, width, height,<br>
> + format, type, rb->Format, irb->mt->tiling,<br>
> + pack->Alignment, pack->RowLength, pack->SkipPixels,<br>
> + pack->SkipRows);<br>
> +<br>
> + /* This renderbuffer can come from a texture which may not be miplevel 0.<br>
> + * We need to adjust accordingly.<br>
> + */<br>
> + if (rb->TexImage) {<br>
> + int level = rb->TexImage->Level + rb->TexImage->TexObject->MinLevel;<br>
> +<br>
> + /* Adjust x and y offset based on miplevel */<br>
> + xoffset += irb->mt->level[level].level_x;<br>
> + yoffset += irb->mt->level[level].level_y;<br>
> + }<br>
<br>
</div></div>Can the slice be non-zero? Should this be using<br>
intel_miptree_get_image_offset() instead?<br></blockquote><div><br></div><div>Good catch. For now, I'm going to just implse that the texture's target is one of GL_TEXTURE_2D or GL_TEXTURE_RECTANGLE. None of the download paths handle arrays either, so that seems sane for now.<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div><br>
> +<br>
> + tiled_to_linear(<br>
> + xoffset * cpp, (xoffset + width) * cpp,<br>
> + yoffset, yoffset + height,<br>
> + pixels - (ptrdiff_t) yoffset * dst_pitch - (ptrdiff_t) xoffset * cpp,<br>
> + bo->virtual,<br>
> + dst_pitch, irb->mt->pitch,<br>
> + brw->has_swizzling,<br>
> + irb->mt->tiling,<br>
> + mem_copy<br>
> + );<br>
> +<br>
> + drm_intel_bo_unmap(bo);<br>
> + return true;<br>
> +}<br>
> +<br>
> void<br>
> intelReadPixels(struct gl_context * ctx,<br>
> GLint x, GLint y, GLsizei width, GLsizei height,<br>
> GLenum format, GLenum type,<br>
> const struct gl_pixelstore_attrib *pack, GLvoid * pixels)<br>
> {<br>
> + bool ok;<br>
> +<br>
> struct brw_context *brw = brw_context(ctx);<br>
> bool dirty;<br>
><br>
> DBG("%s\n", __FUNCTION__);<br>
><br>
> + ok = intel_readpixels_tiled_memcpy(ctx, x, y, width, height,<br>
> + format, type, pixels, pack);<br>
> + if(ok)<br>
> + return;<br>
> +<br>
> if (_mesa_is_bufferobj(pack->BufferObj)) {<br>
> /* Using PBOs, so try the BLT based path. */<br>
> if (do_blit_readpixels(ctx, x, y, width, height, format, type, pack,<br>
> diff --git a/src/mesa/drivers/dri/i965/intel_tex.h b/src/mesa/drivers/dri/i965/intel_tex.h<br>
> index 27f7f11..f048e84 100644<br>
> --- a/src/mesa/drivers/dri/i965/intel_tex.h<br>
> +++ b/src/mesa/drivers/dri/i965/intel_tex.h<br>
> @@ -68,4 +68,13 @@ intel_texsubimage_tiled_memcpy(struct gl_context *ctx,<br>
> const struct gl_pixelstore_attrib *packing,<br>
> bool for_glTexImage);<br>
><br>
> +bool<br>
> +intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,<br>
> + struct gl_texture_image *texImage,<br>
> + GLint xoffset, GLint yofset,<br>
> + GLsizei width, GLsizei height,<br>
> + GLenum format, GLenum type,<br>
> + GLvoid *pixels,<br>
> + const struct gl_pixelstore_attrib *packing);<br>
> +<br>
> #endif<br>
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c<br>
> index 3317779..075315d 100644<br>
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c<br>
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c<br>
> @@ -24,7 +24,7 @@<br>
> #include "intel_blit.h"<br>
> #include "intel_fbo.h"<br>
> #include "intel_image.h"<br>
> -<br>
> +#include "intel_tiled_memcpy.h"<br>
> #include "brw_context.h"<br>
><br>
> #define FILE_DEBUG_FLAG DEBUG_TEXTURE<br>
> @@ -507,12 +507,145 @@ blit_texture_to_pbo(struct gl_context *ctx,<br>
> return true;<br>
> }<br>
><br>
> +/**<br>
> + * \brief A fast path for glGetTexImage.<br>
> + *<br>
> + *<br>
> + * This fast path is taken when the texture format is BGRA, RGBA,<br>
> + * A or L and when the texture memory is X- or Y-tiled. It downloads<br>
> + * the texture data by mapping the texture memory without a GTT fence, thus<br>
> + * acquiring a linear view of the memory.<br>
> + *<br>
> + * This is a performance win over the conventional texture download path.<br>
> + * In the conventional texture download path,<br>
> + *<br>
> + */<br>
> +<br>
> +bool<br>
> +intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,<br>
> + struct gl_texture_image *texImage,<br>
> + GLint xoffset, GLint yoffset,<br>
> + GLsizei width, GLsizei height,<br>
> + GLenum format, GLenum type,<br>
> + GLvoid *pixels,<br>
> + const struct gl_pixelstore_attrib *packing)<br>
> +{<br>
> + struct brw_context *brw = brw_context(ctx);<br>
> + struct intel_texture_image *image = intel_texture_image(texImage);<br>
> + int dst_pitch;<br>
> +<br>
> + /* The miptree's buffer. */<br>
> + drm_intel_bo *bo;<br>
> +<br>
> + int error = 0;<br>
> +<br>
> + uint32_t cpp;<br>
> + mem_copy_fn mem_copy = NULL;<br>
> +<br>
> + /* This fastpath is restricted to specific texture types:<br>
> + * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support<br>
> + * more types.<br>
> + *<br>
> + * FINISHME: The restrictions below on packing alignment and packing row<br>
> + * length are likely unneeded now because we calculate the destination stride<br>
> + * with _mesa_image_row_stride. However, before removing the restrictions<br>
> + * we need tests.<br>
> + */<br>
> + if (!brw->has_llc ||<br>
> + !(type == GL_UNSIGNED_BYTE || type == GL_UNSIGNED_INT_8_8_8_8_REV) ||<br>
> + !(texImage->TexObject->Target == GL_TEXTURE_2D ||<br>
> + texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) ||<br>
> + pixels == NULL ||<br>
> + _mesa_is_bufferobj(packing->BufferObj) ||<br>
> + packing->Alignment > 4 ||<br>
> + packing->SkipPixels > 0 ||<br>
> + packing->SkipRows > 0 ||<br>
> + (packing->RowLength != 0 && packing->RowLength != width) ||<br>
> + packing->SwapBytes ||<br>
> + packing->LsbFirst ||<br>
> + packing->Invert)<br>
> + return false;<br>
> +<br>
> + if (!intel_get_memcpy(texImage->TexFormat, format, type, &mem_copy, &cpp))<br>
> + return false;<br>
> +<br>
> + /* If this is a nontrivial texture view, let another path handle it instead. */<br>
> + if (texImage->TexObject->MinLayer)<br>
> + return false;<br>
> +<br>
> + if (!image->mt ||<br>
> + (image->mt->tiling != I915_TILING_X &&<br>
> + image->mt->tiling != I915_TILING_Y)) {<br>
> + /* The algorithm is written only for X- or Y-tiled memory. */<br>
> + return false;<br>
> + }<br>
> +<br>
> + /* Since we are going to write raw data to the miptree, we need to resolve<br>
> + * any pending fast color clears before we start.<br>
> + */<br>
> + intel_miptree_resolve_color(brw, image->mt);<br>
> +<br>
> + bo = image->mt->bo;<br>
> +<br>
> + if (drm_intel_bo_references(brw-><a href="http://batch.bo" target="_blank">batch.bo</a>, bo)) {<br>
> + perf_debug("Flushing before mapping a referenced bo.\n");<br>
> + intel_batchbuffer_flush(brw);<br>
> + }<br>
> +<br>
> + error = brw_bo_map(brw, bo, false /* write enable */, "miptree");<br>
> + if (error || bo->virtual == NULL) {<br>
> + DBG("%s: failed to map bo\n", __FUNCTION__);<br>
> + return false;<br>
> + }<br>
> +<br>
> + dst_pitch = _mesa_image_row_stride(packing, width, format, type);<br>
> +<br>
> + DBG("%s: level=%d x,y=(%d,%d) (w,h)=(%d,%d) format=0x%x type=0x%x "<br>
> + "mesa_format=0x%x tiling=%d "<br>
> + "packing=(alignment=%d row_length=%d skip_pixels=%d skip_rows=%d)\n",<br>
> + __FUNCTION__, texImage->Level, xoffset, yoffset, width, height,<br>
> + format, type, texImage->TexFormat, image->mt->tiling,<br>
> + packing->Alignment, packing->RowLength, packing->SkipPixels,<br>
> + packing->SkipRows);<br>
> +<br>
> + int level = texImage->Level + texImage->TexObject->MinLevel;<br>
> +<br>
> + /* Adjust x and y offset based on miplevel */<br>
> + xoffset += image->mt->level[level].level_x;<br>
> + yoffset += image->mt->level[level].level_y;<br>
> +<br>
> + tiled_to_linear(<br>
> + xoffset * cpp, (xoffset + width) * cpp,<br>
> + yoffset, yoffset + height,<br>
> + pixels - (ptrdiff_t) yoffset * dst_pitch - (ptrdiff_t) xoffset * cpp,<br>
> + bo->virtual,<br>
> + dst_pitch, image->mt->pitch,<br>
> + brw->has_swizzling,<br>
> + image->mt->tiling,<br>
> + mem_copy<br>
> + );<br>
> +<br>
> + drm_intel_bo_unmap(bo);<br>
> + return true;<br>
> +}<br>
> +<br>
> +<br>
> static void<br>
> intel_get_tex_image(struct gl_context *ctx,<br>
> GLenum format, GLenum type, GLvoid *pixels,<br>
> - struct gl_texture_image *texImage) {<br>
> + struct gl_texture_image *texImage)<br>
> +{<br>
> + bool ok;<br>
> +<br>
> DBG("%s\n", __FUNCTION__);<br>
><br>
> + ok = intel_gettexsubimage_tiled_memcpy(ctx, texImage, 0, 0,<br>
> + texImage->Width, texImage->Height,<br>
> + format, type, pixels, &ctx->Pack);<br>
> +<br>
> + if(ok)<br>
> + return;<br>
> +<br>
> if (_mesa_is_bufferobj(ctx->Pack.BufferObj)) {<br>
> /* Using PBOs, so try the BLT based path. */<br>
> if (blit_texture_to_pbo(ctx, format, type, pixels, texImage))<br>
> --<br>
> 2.2.0<br>
><br>
</div></div>> _______________________________________________<br>
> mesa-dev mailing list<br>
> <a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br>
> <a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev" target="_blank">http://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
<span><font color="#888888"><br>
--<br>
Ben Widawsky, Intel Open Source Technology Center<br>
</font></span></blockquote></div><br></div></div></div></div></div></div>