[PATCH 13/20] glamor: Add glamor_program based copy acceleration

Keith Packard keithp at keithp.com
Thu Mar 20 15:07:55 PDT 2014


Markus Wick <markus at selfnet.de> writes:


> You haven't used reverse, upsidedown and closure. Aren't they needed?

reverse and upsidedown are required when doing software overlapping
blts, but glCopyPixels handles those internally. 'closure' is a
leftover from some ancient drawing code requirements; I can't remember
what it was used for, though. Something about passing driver state
across the miDoCopy function interface.
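
For anyone unfamiliar with why a software blt needs those two flags,
here is a rough sketch on plain memory. It is not glamor or mi code;
overlap_blt and its parameters are made up for illustration, and it
assumes one byte per pixel with source and destination in the same
buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy a w x h rectangle of bytes from (sx, sy)
 * to (dx, dy) inside one buffer that has `stride` bytes per row. */
static void overlap_blt(unsigned char *buf, size_t stride,
                        int sx, int sy, int dx, int dy, int w, int h)
{
    assert(buf != NULL && w >= 0 && h >= 0);

    /* Destination lies below the source: walk rows bottom-to-top so a
     * source row is read before it gets overwritten. */
    int upsidedown = dy > sy;
    /* Destination lies to the right of the source: walk each row
     * right-to-left for the same reason. */
    int reverse = dx > sx;

    for (int i = 0; i < h; i++) {
        int row = upsidedown ? h - 1 - i : i;
        unsigned char *s = buf + (size_t)(sy + row) * stride + sx;
        unsigned char *d = buf + (size_t)(dy + row) * stride + dx;

        for (int j = 0; j < w; j++) {
            int col = reverse ? w - 1 - j : j;
            d[col] = s[col];
        }
    }
}
```

Walking in the wrong direction would overwrite source pixels before
they are read, which is exactly the bookkeeping glCopyPixels does
internally.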

>> +                    glCopyPixels (dx1 + dx - src_box->x1,
>> +                                  dy1 + dy - src_box->y1,
>> +                                  dx2 - dx1, dy2 - dy1, GL_COLOR);
>
> These functions are deprecated fortunately. Please don't use them at all.
> Glamor shall be used as hardware accelerated fallback, so we shouldn't 
> rely on uncommon features which may be implemented in software (eg on 
> nvidia) or not implemented at all (core, gles).

There isn't anything else I *can* use. GL doesn't otherwise define the
results for operations using the same object for source and
destination.

> I see two ways to handle overlapping copies:
> - Emit lots of non-overlapping draw calls / blits / copies
> - Create a temporary buffer and copy twice

A separate path involving a temporary buffer is clearly what would be
required where glCopyPixels is not available. Performance from
glCopyPixels is likely to be better when available, so we should prefer it.
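
The shape of that fallback, modeled on plain memory rather than GL
objects, would be roughly the following. This is only a sketch:
copy_via_temp is a hypothetical name, pixels are assumed to be one
byte each, and a real glamor path would use a temporary texture or
FBO instead of malloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy a w x h rectangle from (sx, sy) to
 * (dx, dy) within one buffer by staging it in a temporary.
 * Returns 0 on success, -1 if the temporary cannot be allocated. */
static int copy_via_temp(unsigned char *buf, size_t stride,
                         int sx, int sy, int dx, int dy, int w, int h)
{
    assert(buf != NULL && w >= 0 && h >= 0);
    if (w == 0 || h == 0)
        return 0;

    unsigned char *tmp = malloc((size_t)w * (size_t)h);
    if (tmp == NULL)
        return -1;

    /* First copy: source rectangle into the temporary. */
    for (int row = 0; row < h; row++)
        memcpy(tmp + (size_t)row * (size_t)w,
               buf + (size_t)(sy + row) * stride + sx, (size_t)w);

    /* Second copy: temporary into the destination rectangle. The two
     * never alias, so copy direction no longer matters. */
    for (int row = 0; row < h; row++)
        memcpy(buf + (size_t)(dy + row) * stride + dx,
               tmp + (size_t)row * (size_t)w, (size_t)w);

    free(tmp);
    return 0;
}
```

Every pixel moves twice and a scratch allocation is needed, which is
the cost of going this way.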

> I guess GPUs have some special HW for this, but we can't use them on 
> OpenGL. memmove isn't parallelizable.

Yes, Intel GPUs have the 'blt' engine which performs overlapped copies
correctly, and glCopyPixels is how we get to that. I imagine the same is
true for other GPUs.

As this operation is critical for many existing X applications for
scrolling data around, we really have to make sure that we hit the
special hardware.

>> +                glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, nbox);
>
> Do you think it's worth adding a fast path with glBlitFramebuffer or
> glCopyImageSubData?
> Both should have a better GPU performance and glCopyImageSubData should 
> also have a lower CPU overhead.

Should be easy to measure and see if either is faster than this code;
then we could decide.

-- 
keith.packard at intel.com