[Mesa-dev] Gallium pixel formats on big-endian

Mon Feb 4 11:17:18 PST 2013

On Thu, 2013-01-31 at 07:18 -0800, Jose Fonseca wrote:
> ----- Original Message -----
> > On Thu, 2013-01-31 at 02:14 -0800, Jose Fonseca wrote:
> > > ----- Original Message -----
> > > > On Wed, 2013-01-30 at 08:35 -0800, Jose Fonseca wrote:
> > > > > ----- Original Message -----
> > > 
> > > Is this feasible, or are there APIs that need (i.e., require) to handle
> > > both LE/BE formats?
> > 
> > Not sure, but my impression has been that APIs tend to prefer the CPU
> > native byte order. Anything else makes little sense from an application
> > POV. Still, I wouldn't be surprised if there were exceptions, e.g. with
> > image/video APIs related to fixed file formats.
> 
> I'm not an expert on video, but if video formats are exceptional
> we could treat them exceptionally (e.g., they are always defined in
> LE). I also suspect that certain compressed formats like DXTn are
> always defined in LE terms.

Quite possibly. 

> > > (Or hardware only capable of LE formats?)
> > 
> > Unfortunately, our Southern Islands GPUs no longer have facilities for
> > byte-swapping vertex / texture data on the fly.
> 
> Hmm.. forcing such drivers to byte-swap internally is a tall order (it
> would be easier if state trackers could do it for all such hardware),
> so this kills the gallium == native idea...

I don't think it's quite that bad, as Alex pointed out.

> > > If not, would it be feasible to byte-swap at state tracker level?
> > 
> > That should certainly be feasible for texture data, as that generally
> > involves at least one copy anyway. However, it might hurt for streaming
> > vertex data. Also, we might have to be careful not to require double
> > byte-swapping in cases where simple copies would work or no copies would
> > be necessary in the first place.
> 
> We could consider a PIPE_CAP_BYTE_ORDER, with two values: NATIVE, or LE. 
> 
> Drivers for hardware with LE/BE would advertise NATIVE and implement
> it trivially.
> 
> For drivers with LE hardware, the state tracker would be responsible
> for byteswapping when the platform is BE (or just inverting swizzles for
> RG8/RGBA8 formats), with the exception of any video/compressed
> formats which are only defined in a given byte order. State trackers
> would also disable user pointers and always create VBOs/IBOs.
> 
> Byteswapping can be done in a fairly generic way independent of the
> format -- one just needs to know the swap unit (1,2,4, or 8 bytes) for
> the format, and strides. That is, we could have a helper
> util_copy_and_swap_rect.
> 
> It's a bit hackish, but it seems a good compromise.

I'm afraid I don't like it very much. Non-exhaustive list of reasons:

      * The cap is a separate information channel, so the formats aren't
        fully self-descriptive. I fear this would be prone to some code
        not taking the cap into account properly. 
      * The cap is all or nothing, but drivers may be able to take care
        of byte swapping basically for free for some formats but not for
        others.

> I agree that having _LE/_BE format variants would be cleaner, but also
> a lot more work: update helpers to handle both le/be variants, update
> the state trackers to use all le/be variants _and_ byte swap when
> API/HW features mismatch...

Right. After thinking about this a bit more, how about something like
this:

Define the packing to be in the host byte order. However, do not define
array formats as packed values (which makes little sense e.g. for
*32*32*32*32 anyway) but really just as arrays. 16- or 32-bit components
of arrays are again packed in host byte order. Array components of 4, 2
or 1 bits occupy bits [0,n-1] of byte 0, then bits [n,2n-1] of byte
0, ..., bits [0,n-1] of byte 1, and so on.
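
To illustrate the sub-byte layout described above, here is a minimal
sketch in C. The function name is hypothetical (it is not an existing
Mesa helper), and it assumes that bit 0 means the least significant bit
of each byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not an existing Mesa function.
 * Fetch component `index` from an array of `bits`-wide components,
 * where `bits` is 1, 2 or 4. Assumption: bit 0 is the least
 * significant bit of a byte. */
static unsigned
sketch_fetch_subbyte_component(const uint8_t *data, unsigned index,
                               unsigned bits)
{
   assert(bits == 1 || bits == 2 || bits == 4);

   const unsigned per_byte = 8 / bits;               /* components per byte */
   const unsigned byte = index / per_byte;           /* byte holding the component */
   const unsigned shift = (index % per_byte) * bits; /* lowest bit of the component */
   const unsigned mask = (1u << bits) - 1;

   return (data[byte] >> shift) & mask;
}
```

With the bytes 0x21 0x43 and 4-bit components, indices 0 to 3 yield
1, 2, 3 and 4, independent of the host byte order.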

To make the distinction between packed and array formats clear, they
should probably use different naming schemes. The current scheme makes
more sense for packed formats I think, so I'd propose separating array
components e.g. by an additional underscore, and ordering components
from lower memory address left to higher address right. As an example,
PIPE_FORMAT_R8G8_UNORM would become PIPE_FORMAT_R8_G8_UNORM. It would
probably make sense to define packed naming aliases for array formats
where possible, e.g. PIPE_FORMAT_R8_G8_UNORM would alias to
PIPE_FORMAT_R8G8_UNORM (which would be identical to MESA_FORMAT_GR88) on
little endian and PIPE_FORMAT_G8R8_UNORM (== MESA_FORMAT_RG88) on big
endian hosts. (That's using the current component ordering for packed
formats; IMHO the reverse order would actually make more sense for
those, but that might require more invasive changes...)
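
As a rough sketch of how such an alias could be resolved, the C
fragment below picks the packed equivalent of the R8_G8 array format
from the host byte order. All names prefixed with sketch_ are
hypothetical stand-ins rather than real Mesa identifiers, and the
runtime endianness probe is only there to keep the example
self-contained; real code would more likely decide this at compile
time.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the packed format names, using the current
 * component ordering (first-named component in the lowest bits). */
enum sketch_format {
   SKETCH_FORMAT_R8G8_UNORM, /* packed: R in bits 0-7, G in bits 8-15 */
   SKETCH_FORMAT_G8R8_UNORM  /* packed: G in bits 0-7, R in bits 8-15 */
};

/* Runtime probe: is the first byte of a 16-bit 1 the low byte? */
static int
sketch_host_is_little_endian(void)
{
   const uint16_t probe = 1;
   uint8_t first;
   memcpy(&first, &probe, 1);
   return first == 1;
}

/* Packed alias of the array format "R8 at byte 0, G8 at byte 1". */
static enum sketch_format
sketch_r8_g8_packed_alias(void)
{
   return sketch_host_is_little_endian() ? SKETCH_FORMAT_R8G8_UNORM
                                         : SKETCH_FORMAT_G8R8_UNORM;
}

/* Read two bytes as a host-order 16-bit value and extract red. */
static uint8_t
sketch_packed_red(enum sketch_format format, const uint8_t bytes[2])
{
   uint16_t value;
   memcpy(&value, bytes, sizeof value);
   return format == SKETCH_FORMAT_R8G8_UNORM ? (uint8_t)(value & 0xff)
                                             : (uint8_t)(value >> 8);
}

/* Same for green, which sits in the other half of the packed value. */
static uint8_t
sketch_packed_green(enum sketch_format format, const uint8_t bytes[2])
{
   uint16_t value;
   memcpy(&value, bytes, sizeof value);
   return format == SKETCH_FORMAT_R8G8_UNORM ? (uint8_t)(value >> 8)
                                             : (uint8_t)(value & 0xff);
}
```

Reading through the alias returns the byte at the lower address as red
on either kind of host, which is the property the array definition is
meant to guarantee.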

Overall, this implies that e.g. the *8*8*8*8 and *8*8 formats mostly
remain the same as now. At the same time, it should make the format
definitions a better match for frontend APIs and software rendering
drivers on big endian hosts. Also, I think it should allow reconciling
the official definition of formats with how some of the current code
already assumes some of them are defined with relatively few changes.

For drivers which can't use hardware facilities for byte-swapping as
necessary, there could be utility helpers.
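
One possible shape for such a helper, following the swap-unit idea
quoted above, is sketched below in C. The name and signature are
assumptions for illustration only, not an existing Mesa function, and
the sketch assumes that source and destination do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy a rectangle of `height` rows, each
 * `width_bytes` long, reversing the bytes within every `swap_unit`
 * sized group. Strides are in bytes. A swap unit of 1 degenerates to
 * a plain copy. Assumption: src and dst do not overlap. */
static void
sketch_copy_and_swap_rect(uint8_t *dst, size_t dst_stride,
                          const uint8_t *src, size_t src_stride,
                          size_t width_bytes, size_t height,
                          size_t swap_unit)
{
   assert(swap_unit == 1 || swap_unit == 2 ||
          swap_unit == 4 || swap_unit == 8);
   assert(width_bytes % swap_unit == 0);

   for (size_t y = 0; y < height; y++) {
      const uint8_t *src_row = src + y * src_stride;
      uint8_t *dst_row = dst + y * dst_stride;

      for (size_t x = 0; x < width_bytes; x += swap_unit) {
         /* Reverse the byte order within one swap unit. */
         for (size_t i = 0; i < swap_unit; i++)
            dst_row[x + i] = src_row[x + swap_unit - 1 - i];
      }
   }
}
```

The helper only needs the swap unit and the strides, not the format
itself, so a caller would derive the swap unit from the format (e.g. 2
for 16-bit components) and leave pure byte arrays untouched.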

Does this make any sense? :)

> But I'm happy to let those that will do this work decide.

Likewise.

-- 
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer