[Mesa-dev] [PATCH RFC v1] i965: Implement CopyTexSubImage2D via BLORP (and use it by default).

Mon Jan 28 14:32:18 PST 2013

On 01/20/2013 02:15 PM, Paul Berry wrote:
[snip]
> As an alternative to those fixes, here's my refactoring suggestion: drop
> this function entirely, and instead, in brw_blorp_copytexsubimage(),
> create a temporary intel_renderbuffer to wrap around dst_image, and just
> call do_blorp_blit() directly.  do_blorp_blit() already gets all the
> resolves correct, and it's well covered by existing tests.  As a bonus,
> when we get around to implementing EXT_multisampled_render_to_texture,
> we'll probably remember to adjust do_blorp_blit() accordingly--it would
> be nice for this code to inherit that adjustment automatically.

That's a very good idea.  It turned out to be less painful than I 
might've thought, and cleaned things up quite a bit.  v2 implements this 
(which I think should solve all the resolve questions too).

[snip]
>     +   /* CopyTexSubImage should happen even in conditional rendering.
>       We could
>     +    * turn it off and back on again.  For now, just bail instead.
>     +    */
>     +   if (ctx->Query.CondRenderQuery)
>     +      return false;
>
>
> This shouldn't be necessary.  Blorp ignores any 3D pipeline state that
> the client has set up (including conditional rendering state), so if you
> just drop this check, the copy should happen just fine even if
> conditional rendering is on.

You're right, it wasn't necessary.  I apparently had another bug 
originally that showed up in one of the conditional rendering tests, and 
I just papered over it with this.  Turns out I then proceeded to fix the 
actual bug too, so this is just garbage.  Removed in v2.

>     +
>     +   /* BLORP requires the formats to match.  In the future, we
>     should relax
>     +    * this for simple format mismatches like XRGB <-> ARGB.
>     +    */
>     +   if (_mesa_get_srgb_format_linear(src_rb->Format) !=
>     +       _mesa_get_srgb_format_linear(dst_image->TexFormat))
>     +      return false;
>
>
> Aren't you and Carl in the midst of relaxing that restriction right
> now?  Perhaps we should refactor this check into a function that's
> shared by the blit and CopyTexSubImage paths so that once the
> restriction is relaxed, we don't have to remember to update both checks.

Refactoring it is probably wise.  Could we do that as a follow-on patch 
(in the series that relaxes the restriction)?

[snip]
> Destination clipping shouldn't be necessary, because the restrictions on
> glCopyTexSubImage prevent the user from specifying a destination
> rectangle that falls outside the bounds of the destination texture.  See
> error_check_subtexture_dimensions().
[snip]
> Source clipping shouldn't be necessary either, because copytexsubimage
> (src/mesa/main/teximage.c) calls _mesa_clip_copytexsubimage, which takes
> care of it.

Excellent, thanks!  I've replaced the code by your comments.

>     +
>     +   /* Account for the fact that in the system framebuffer, the
>     origin is at
>     +    * the lower left.
>     +    */
>     +   if (_mesa_is_winsys_fbo(ctx->ReadBuffer)) {
>     +      GLint tmp = src_rb->Height - srcY0;
>     +      srcY0 = src_rb->Height - srcY1;
>     +      srcY1 = tmp;
>     +      mirror_y = !mirror_y;
>     +   }
>
>
> If you decide to go along with my refactoring suggestion, I'd suggest a
> follow-up patch where we move this check (and similar checks in
> try_blorp_blit) inside do_blorp_blit, just so that we don't have to
> duplicate this code.

Apparently that's hard: try_blorp_blit and this function look at 
ctx->ReadBuffer/ctx->DrawBuffer, which are gl_framebuffer*s. 
do_blorp_blit() only looks at the src/dst intel_renderbuffers, and 
doesn't access ctx fields.  I'd like to keep that separation, if possible.

Unfortunately, it apparently isn't possible to determine whether to 
Y-flip based on the intel_renderbuffers alone.  Name == 0 seemed 
obvious, but some of our fake/wrapper RBs apparently use that.  (We 
could change that, but I'm afraid I won't catch all cases.)  I also 
thought to check AllocStorage == intel_alloc_window_storage, but Chad 
convinced me that would fail in certain corner cases as well.

So I think I'll punt on this refactoring, if possible.  (I'd still like 
to see it happen someday.)

[snip]

> Possible further improvements (I'm not asking you to do them--just
> pointing them out so we can keep them in mind as we look for other
> places we can improve performance):
>
> (1) In theory, we ought to be able to optimize do_blorp_copytexsubimage
> and do_blorp_blit to detect the case where the destination is being
> entirely overwritten, and avoid having to do a depth resolve on the
> destination in that case (there's no point in carefully computing values
> to write into a depth buffer that's about to be overwritten).  (Note: a
> further advantage of my refactoring suggestion is that we'd only have to
> make this improvement in one place).
>
> (2) It's unfortunate that we still fall back to swrast when dims == 3.
> Really we ought to be able to do handle the 3D case in both blorp and
> the hardware blitter--we just have to  calculate the miptree layer more
> carefully.  Your code uses dst_image->Face, but instead it should use a
> combination of dst_image->Face and zoffset, in much the same way that
> intel_render_texture() does (in fact, it would be nice if we had some
> common "intel_miptree_compute_layer" function for this purpose).

Agreed.  It'd be nice to fix this sometime.

> (3) I have a long-term plan to update blorp so that its access to the
> destination miptree is HiZ-aware (blorp uses the 3D pipeline after all,
> so this should be possible).  That will of course change what resolves
> are necessary.  I have no idea when we'll get around to doing this, but
> it's worth looking into if we continue to find blit-related performance
> bottlenecks like this one.
>
> Thanks again for fixing this, Ken.  Hopefully my suggestions aren't too
> onerous.