[PATCH 02/13] glamor: Add glamor_program based copy acceleration
keithp at keithp.com
Tue May 13 22:57:07 PDT 2014
Markus Wick <markus at selfnet.de> writes:
> On 2014-05-13 17:34, Keith Packard wrote:
>> Sure, if glsl had a 'round' function I'd use it in a second :-)
> It was added in glsl130. As you use uvec which was also added in
> glsl130, it's fine.
Cool. I've done a bit of performance analysis with this change and the
results aren't conclusive yet. Obviously, I'm going to pick the code
which goes faster for me :-)
> I hope to save some framebuffer switching. As framebuffer switches need
> much more validation than texture binding or uniform updates, they should
> be moved to the outer loop.
I'm not too worried about frame buffer switching -- anything allocating
target surfaces larger than we can render to is probably doing something
wrong. I do see applications allocating large source images though, so
we should strive to make that reasonably fast.
Perhaps we should add an x11perf test case that draws from/to enormous
images and see how things look. I've only tested for correctness up to
> I'm more thinking about a box loop.
If you can write something that looks cleaner, that'd be awesome. I'm
all for making the code as readable as possible.
> So that's what the element buffer is for. Just emit 6 vertices as
> triangles per quad and you'll get your quads :)
> 0 1 2 0 2 3 4 5 6 4 6 7 ...
Ah, ok. I was doing the easiest non-quad path I could come up with as I
really don't care about GLES :-)
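
For reference, the index pattern quoted above (0 1 2 0 2 3 4 5 6 4 6 7 ...) can be generated with a few lines of C. This is only a sketch under assumptions: fill_quad_indices is a hypothetical helper name, and 16-bit indices are assumed, which limits it to 16384 quads per buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill `indices` with 6 entries per quad so that each group of 4
 * consecutive vertices is drawn as two triangles:
 *   0 1 2, 0 2 3, then 4 5 6, 4 6 7, ...
 * `indices` must have room for 6 * nquads entries.
 * Assumes 16-bit indices, so nquads must not exceed 16384. */
static void
fill_quad_indices(uint16_t *indices, size_t nquads)
{
    assert(indices != NULL || nquads == 0);
    assert(nquads <= 16384);

    for (size_t q = 0; q < nquads; q++) {
        uint16_t base = (uint16_t)(q * 4);   /* first vertex of this quad */
        uint16_t *out = indices + q * 6;

        out[0] = base;
        out[1] = (uint16_t)(base + 1);
        out[2] = (uint16_t)(base + 2);
        out[3] = base;
        out[4] = (uint16_t)(base + 2);
        out[5] = (uint16_t)(base + 3);
    }
}
```

The resulting array would be uploaded once as an element buffer and reused for every batch of quads, drawing 6 * nquads indices as triangles.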
> I wanted to say that we don't have to discard the temp copy directly. We
> can still copy by fb from there. Maybe this has some advantages, but I
I can't imagine that would be faster -- you'd have to wait for the copy
to complete before even starting the fallback, and presumably the
fallback won't be any faster this way...
> This comment doesn't describe why we have to call glTextureBarrierNV
> without overlapping copies at all. We only need it for multiple X11 copy
Not just copy calls; we have to put a barrier before any operation using
the dest as source because *any* rendering occurring before the copy
would need to be correctly synchronized for this to work.
keith.packard at intel.com