EXA for radeon experimental patch

Lars Knoll lars at trolltech.com
Thu Sep 1 05:57:54 PDT 2005

On Thursday 01 September 2005 13:15, Benjamin Herrenschmidt wrote:
> > > They do, though they require some kernel support to get to the physical
> > > address of pages and need some proper scatter/gather support on the
> > > card side.
> >
> > Then all you need is a drm module for your card. Even if the card doesn't
> > have scatter/gather support, drm allows you to allocate a piece of
> > consistent physical ram, and mmap it in the server. The handle you get is
> > the physical address, so you should be able to use that to implement PCI
> > dma transfers.
> Yup, but it's very likely that allocating physically contiguous memory
> will fail. The kernel isn't that good at keeping physical memory
> unfragmented, and thus physical allocations above PAGE_SIZE are quite
> likely to fail after boot.

Why is that? The kernel has support for paging, so it could easily free up some
contiguous pages just by swapping them out if they are in use.

> > > Nope. When writing/reading vram via MMIO, you go through the swappers
> > > on the PCI->VRAM path which are set for the bpp of the front buffer. If
> > > the picture you are up/downloading has a different bpp, you need to
> > > change the swapper during the access. Same problem with the fb*
> > > fallbacks in EXA, which is why I'm adding the Prepare/Finish hooks.
> >
> > Hmmm... but fbComposite can access up to three pixmaps, all of which can
> > have different bit depth. I don't really see how you could make that work
> > (as you need different swapping behavior for all three pixmaps).
> See my post "EXA prepare/finish hooks & random X questions". On radeon,
> for example, I can configure up to 8 "surfaces" that have different swapper
> setups on the PCI -> VRAM path.
> On cards that don't have such capability, I intend to have the
> PrepareAccess() hook fail, causing EXA to DownloadFromScreen() the
> pixmap to RAM before the composite operation.

Ok, I get it now. This makes sense for operations where you only write to a 

For pixmaps that operate as a source for the fbXxx commands (and this includes 
the dest pixmap in fbComposite) it might be better to download them directly, 
as you have to read the data from the framebuffer anyway. Doing this by mmio 
will be extremely slow on most cards, so the download hook gives you at least 
the chance that it's faster.
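
To make the fallback discussed above concrete, here is a rough sketch in C of
"try PrepareAccess(), and if it fails, download the pixmap to system RAM".
Everything prefixed with sketch_ or Sketch is a hypothetical stand-in I made up
for illustration, not the real EXA or X server types, and a plain array plays
the role of video memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pixmap; not the X server's pixmap type. */
typedef struct {
    unsigned char *vram;  /* pixel data as it sits in (simulated) video memory */
    size_t         size;  /* size of the pixel data in bytes */
    int            bpp;   /* bits per pixel */
} SketchPixmap;

/* Hypothetical driver hooks, named after the hooks mentioned in the thread. */
typedef struct {
    bool (*prepare_access)(SketchPixmap *pix);  /* may fail, e.g. no swapper surface */
    void (*finish_access)(SketchPixmap *pix);
    bool (*download_from_screen)(SketchPixmap *pix, unsigned char *dst);
} SketchDriver;

/* What the software fallback should read from, and whether it is a copy. */
typedef struct {
    unsigned char *data;
    bool           is_copy;
} SketchAccess;

/* Example hook: a card that cannot set up direct access for this pixmap. */
static bool sketch_prepare_always_fails(SketchPixmap *pix)
{
    (void)pix;
    return false;
}

/* Example hook: the simplest possible download, a plain memcpy. */
static bool sketch_download_memcpy(SketchPixmap *pix, unsigned char *dst)
{
    memcpy(dst, pix->vram, pix->size);
    return true;
}

/* Try direct access first; fall back to a copy in system RAM. */
static bool sketch_begin_access(const SketchDriver *drv, SketchPixmap *pix,
                                SketchAccess *out)
{
    unsigned char *copy;

    assert(drv && pix && out);
    if (drv->prepare_access && drv->prepare_access(pix)) {
        out->data = pix->vram;
        out->is_copy = false;
        return true;
    }
    copy = malloc(pix->size);
    if (!copy)
        return false;
    if (!drv->download_from_screen || !drv->download_from_screen(pix, copy)) {
        free(copy);
        return false;
    }
    out->data = copy;
    out->is_copy = true;
    return true;
}

/* Release whatever sketch_begin_access() handed out. */
static void sketch_end_access(const SketchDriver *drv, SketchPixmap *pix,
                              SketchAccess *acc)
{
    if (acc->is_copy)
        free(acc->data);
    else if (drv->finish_access)
        drv->finish_access(pix);
    acc->data = NULL;
}
```

This only covers pixmaps that are read. A destination pixmap accessed through
a copy would also need to be uploaded again afterwards, which the sketch
leaves out.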

When you have to read in something in 16-bit color depth, even a simple
memcpy-based download implementation will be faster than mmio, as you at least
copy the data 32 bits at a time from vidmem and not 16 bits at a time.
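
As a small illustration of the read-width argument, the sketch below copies
16bpp pixels two at a time using 32-bit loads and counts the reads it issues,
next to a pixel-by-pixel copy. The function names are made up for this example,
and ordinary arrays stand in for video memory; a real driver would be reading
from a mapped aperture instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy `count` 16-bit pixels, fetching two pixels per 32-bit read.
 * Returns the number of reads issued against `src`.
 * The fixed-size memcpy calls are the portable way to express a 32-bit
 * load and store without aliasing or alignment problems.
 */
static size_t sketch_copy_16bpp_wide(uint16_t *dst, const uint16_t *src,
                                     size_t count)
{
    size_t reads = 0;
    size_t i = 0;

    assert(dst && src);
    for (; i + 2 <= count; i += 2) {
        uint32_t word;
        memcpy(&word, &src[i], sizeof word);  /* one read, two pixels */
        memcpy(&dst[i], &word, sizeof word);
        reads++;
    }
    if (i < count) {                          /* odd pixel left over */
        dst[i] = src[i];
        reads++;
    }
    return reads;
}

/* Copy one pixel per read, the way a 16-bit-at-a-time access would. */
static size_t sketch_copy_16bpp_narrow(uint16_t *dst, const uint16_t *src,
                                       size_t count)
{
    size_t i;

    assert(dst && src);
    for (i = 0; i < count; i++)
        dst[i] = src[i];
    return count;
}
```

For five pixels the wide version issues three reads where the narrow one
issues five, and both leave identical data in the destination.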


> > No idea. I am no expert here. As far as I can see in the code there is
> > some byteswapping happening in some places, but not in others. So the exa
> > code I have might very well be broken on big-endian machines (though I
> > can't test).
> It is, I verified it :) That's why I'm adding these hooks.
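
For anyone wondering why the swapper has to match the bpp of the pixmap being
accessed, here is a minimal sketch in C. sketch_swap_bytes() is a made-up
helper for illustration, not anything from the X server or the radeon driver;
it reverses the byte order inside each pixel-sized unit of a buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Reverse the bytes within each bpp-sized unit of `buf`, in place.
 * Returns false for an unsupported bpp or a length that is not a
 * whole number of units.
 */
static bool sketch_swap_bytes(unsigned char *buf, size_t len, int bpp)
{
    size_t unit, off, lo, hi;

    assert(buf);
    if (bpp != 8 && bpp != 16 && bpp != 32)
        return false;
    unit = (size_t)bpp / 8;
    if (len % unit != 0)
        return false;
    for (off = 0; off < len; off += unit) {
        for (lo = 0, hi = unit - 1; lo < hi; lo++, hi--) {
            unsigned char tmp = buf[off + lo];
            buf[off + lo] = buf[off + hi];
            buf[off + hi] = tmp;
        }
    }
    return true;
}
```

Take two 16bpp pixels stored as the bytes 12 34 56 78. A 16-bit swap gives
34 12 78 56, which is what you want. A 32-bit swap gives 78 56 34 12, so the
two pixels also trade places. That is the kind of corruption you get when the
swapper is set up for a 32bpp front buffer and the pixmap is 16bpp.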

More information about the xorg mailing list