EXA for radeon experimental patch

Benjamin Herrenschmidt benh at kernel.crashing.org
Thu Sep 1 02:06:38 PDT 2005

On Thu, 2005-09-01 at 09:16 +0200, Lars Knoll wrote:
> On Thursday 01 September 2005 00:41, Benjamin Herrenschmidt wrote:
> > > Another thing I saw is that compositing onto the framebuffer is still
> > > always slow. It might be a good idea if EXA always used
> > > DownloadFromScreen (if it exists) to copy all pixmaps for a composite
> > > call into main memory before attempting to use fbComposite.
> >
> > DownloadFromScreen will be dead slow in many cases. Especially you can't
> > really rely on DMA to AGP memory here as a lot of chipsets have non
> > working write from GPU to AGP :(
> What about writes from GPU to PCI? Maybe these exist.

They do, though they require some kernel support to get to the physical
address of pages and need some proper scatter/gather support on the card.

> If you can't provide an implementation that is significantly faster than just 
> a series of memcpy commands it's probably best just to not implement the 
> hook, as it won't do anything else than the fallback handling from EXA.

Nah, it has to play with the swappers on BE architectures.

> Whether the hook is implemented or not?

It has to be implemented anyway for big endian because of the swapper.

> > > Now this is not true for shared memory architectures as the i810, so we
> > > would probably need some way to find out how slow framebuffer reads are
> > > (and how fast DownloadFromScreen is) and decide the strategy to use based
> > > on this information.
> >
> > BTW. Another issue I'm tackling at the moment is endianness & swappers.
> > When falling back, composite will end up drawing directly into pixmaps
> > in vram which have a different bit depth than the front buffer.
> As long as the Picture has the correct format (ie. the one that is in fact in 
> VRAM) it should all just work.

Nope. When writing/reading vram via MMIO, you go through the swappers on
the PCI->VRAM path, which are set for the bpp of the front buffer. If the
picture you are up/downloading has a different bpp, you need to change
the swapper during the access. Same problem with the fb* fallbacks in
EXA, which is why I'm adding the Prepare/Finish hooks.
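
To make the swapper problem concrete, here is a rough sketch in C. It is
only a model, not the actual radeon or EXA code: the names swap_mode,
aperture_swap, prepare_access, finish_access and apply_swap are all made
up for illustration, and the real swapper lives in a hardware register
rather than a variable. It shows that the same four bytes come out
differently under a 16 bit and a 32 bit swap, and how a Prepare/Finish
pair could switch the swapper around a CPU access and then restore it.

```c
#include <stdint.h>

/* Hypothetical swapper modes; real hardware encodes these in a register. */
enum swap_mode { SWAP_NONE, SWAP_16, SWAP_32 };

/* Stand-in for the aperture swapper control (an assumption, not a real
 * register name). It normally matches the bpp of the front buffer. */
static enum swap_mode aperture_swap = SWAP_NONE;

/* Pick the swap mode that matches a pixel size. */
static enum swap_mode swap_mode_for_bpp(int bpp)
{
    switch (bpp) {
    case 16: return SWAP_16;
    case 32: return SWAP_32;
    default: return SWAP_NONE;   /* 8 bpp data needs no byte swapping */
    }
}

/* Prepare: select the swapper for the pixmap about to be touched by the
 * CPU and hand back the previous setting so it can be restored. */
static enum swap_mode prepare_access(int pixmap_bpp)
{
    enum swap_mode saved = aperture_swap;
    aperture_swap = swap_mode_for_bpp(pixmap_bpp);
    return saved;
}

/* Finish: put the front buffer setting back. */
static void finish_access(enum swap_mode saved)
{
    aperture_swap = saved;
}

/* Model of what the aperture does to one 32 bit word in a given mode. */
static uint32_t apply_swap(uint32_t v, enum swap_mode m)
{
    switch (m) {
    case SWAP_16:   /* swap the two bytes inside each 16 bit half */
        return ((v & 0x00ff00ffu) << 8) | ((v & 0xff00ff00u) >> 8);
    case SWAP_32:   /* reverse all four bytes */
        return (v << 24) | ((v & 0x0000ff00u) << 8) |
               ((v & 0x00ff0000u) >> 8) | (v >> 24);
    default:
        return v;
    }
}
```

With a 32 bpp front buffer, a fallback that touches a 16 bpp pixmap
would call prepare_access(16) first, do its reads and writes, and then
call finish_access() with the saved value. Without that, the 16 bpp data
would go through the 32 bit swap and end up with its pixels scrambled.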

I don't think XAA ever accessed the VRAM with a different bpp than the
front buffer.

> How did this work in XAA? XAA did also fall back to fbComposite operating 
> directly on VRAM. The only change now is that you have more freedom in
> what kind of format you store in VRAM.
> > What kind of mechanism does nvidia have for dealing with that issue?
> Nvidia has an endianness flag you can set in various places which tells the HW 
> about the endianness of pixmaps etc. They are always set to host endianness.

What about the PCI->vram front swappers ?

