[Xorg] Using transformations in the render extension

Mon Aug 16 02:41:38 PDT 2004

On Mon, 2004-08-16 at 02:16, Lars Knoll wrote:
> Hi,
> 
> we did some experimenting trying to use transformations provided by the render 
> extension on pixmaps. What we found is that they currently are about a factor 
> of 5 slower than sending the data over to the client, doing the 
> transformation there and sending it back (for the "nearest" neighbor filter).
> 
> I dug a bit through the X server code, and it seems that we take the worst 
> possible code path whenever a transformation is involved (going into 
> fbCompositeGeneral, and using function pointers for all operations).
> 
> The other thing I noticed while looking through the code is that PictOpSrc 
> always seems to go through the fbCompositeGeneral function (even without 
> transformations).
> 
> Wouldn't it be worth optimising PictOpSrc without transformations, and both 
> PictOpOver and PictOpSrc for affine transformations (or at least scaling 
> operations)? 
> 
> I know that it would be best if the drivers started supporting these in 
> hardware, but as it currently looks to me most of them do not have the 
> support yet, and it would be great if the software fallback would not be 
> slower than what one can achieve on the client.

Your DDX should be handling PictOpSrc in the no-transform, no-repeat,
same-format case by using its normal CopyArea acceleration.  XAA was
fixed up to do this for the next release.  Repeating 1x1 PictOpSrc
should be done using the normal solid fill code as well (in the absence
of something better, which a general render acceleration might be). 
Kdrive does both of these.
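
As a rough sketch of that dispatch (not code from any real DDX), the
decision could look like the following.  All the names here (struct
pict, choose_path, the PATH_* values) are made up for illustration,
and the "no mask" check is my own assumption rather than something
stated above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the Render operator and picture state. */
enum render_op { OP_SRC, OP_OVER };
enum render_path { PATH_COPY_AREA, PATH_SOLID_FILL, PATH_GENERAL };

struct pict {
    int width, height;   /* size of the backing drawable */
    unsigned format;     /* opaque pixel format code */
    bool repeat;         /* picture tiles to cover the destination */
    bool has_transform;  /* a non-identity transform is set */
};

/* Pick the cheapest path that is still correct for this request. */
enum render_path choose_path(enum render_op op, const struct pict *src,
                             bool has_mask, const struct pict *dst)
{
    /* Assumption: only maskless, untransformed PictOpSrc is special-cased. */
    if (op != OP_SRC || has_mask || src->has_transform)
        return PATH_GENERAL;

    if (src->repeat) {
        /* A repeating 1x1 source is just a solid colour. */
        if (src->width == 1 && src->height == 1)
            return PATH_SOLID_FILL;
        return PATH_GENERAL;
    }

    /* No transform, no repeat, same format: a plain blit. */
    if (src->format == dst->format)
        return PATH_COPY_AREA;

    return PATH_GENERAL;
}
```

Everything that falls through to PATH_GENERAL is what ends up in the
slow general code today.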

There are far too many possible Render combinations to provide optimized
paths for all of them in software (without codegen, which is probably
more work than it's worth).  We weren't seeing too much PictOpSrc
before, but now we are with Composite.  We weren't seeing much use of
transforms, and now we are seeing some (and will be seeing more and more
for eye-candy things).  I expect that as people find more and more neat
effects to do with Render thanks to Composite, we'll have to keep adding
specific case improvements in software unless we improve software
compositing generally so that it's much closer to optimal.

Keith told me his plans for the general compositing code the other day,
involving converting the general code to operate on "patches" instead of
pixels.  With that, we can have a single set of fast vectored IN
handling that operates on patches, two sets of each OP (component-alpha
and non-component-alpha), and patch-loading and -storing functions which
will be able to work on the framebuffer *much* faster than is currently
possible per-pixel.
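
To make the idea concrete, here is a minimal sketch of what
patch-at-a-time compositing might look like.  This is my own
illustration, not Keith's design: PATCH_LEN, patch_in, patch_over and
composite_over_row are hypothetical names, and it assumes
premultiplied a8r8g8b8 source and destination with an a8 mask, so the
"load" and "store" steps are plain copies.  Other formats would get
their own load/store functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATCH_LEN 64   /* pixels handled per patch; an arbitrary choice */

/* Multiply all four 8-bit channels of x by a/255, with rounding. */
static uint32_t mul_un8x4(uint32_t x, uint8_t a)
{
    uint32_t rb = (x & 0x00ff00ff) * a + 0x00800080;
    uint32_t ag = ((x >> 8) & 0x00ff00ff) * a + 0x00800080;

    rb = ((rb + ((rb >> 8) & 0x00ff00ff)) >> 8) & 0x00ff00ff;
    ag = (ag + ((ag >> 8) & 0x00ff00ff)) & 0xff00ff00;
    return rb | ag;
}

/* IN: scale a whole patch of source pixels by the mask alpha. */
static void patch_in(uint32_t *src, const uint8_t *mask, size_t n)
{
    for (size_t i = 0; i < n; i++)
        src[i] = mul_un8x4(src[i], mask[i]);
}

/* OVER (non-component-alpha): dst = src + dst * (1 - src.alpha). */
static void patch_over(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t inv_alpha = (uint8_t)(255 - (src[i] >> 24));
        dst[i] = src[i] + mul_un8x4(dst[i], inv_alpha);
    }
}

/* Composite one row, PATCH_LEN pixels at a time. */
void composite_over_row(uint32_t *dst, const uint32_t *src,
                        const uint8_t *mask, size_t width)
{
    uint32_t s[PATCH_LEN], d[PATCH_LEN];

    for (size_t x = 0; x < width; x += PATCH_LEN) {
        size_t n = width - x < PATCH_LEN ? width - x : PATCH_LEN;

        memcpy(s, src + x, n * sizeof *s);   /* patch load */
        memcpy(d, dst + x, n * sizeof *d);
        patch_in(s, mask + x, n);
        patch_over(d, s, n);
        memcpy(dst + x, d, n * sizeof *d);   /* patch store */
    }
}
```

The point is that the per-format and per-operator function calls
happen once per patch rather than once per pixel, and the inner loops
are simple enough to vectorize.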

Note that I expect to see much more Render acceleration in drivers once
we get an acceleration architecture that's designed with Render in mind.

-- 
Eric Anholt                                eta at lclark.edu          
http://people.freebsd.org/~anholt/         anholt at FreeBSD.org