[Mesa-dev] Gallium pixel formats on big-endian

Jose Fonseca jfonseca at vmware.com
Mon Feb 11 05:20:02 PST 2013



----- Original Message -----
> On Thu, 2013-01-31 at 07:18 -0800, Jose Fonseca wrote:
> > ----- Original Message -----
> > > On Thu, 2013-01-31 at 02:14 -0800, Jose Fonseca wrote:
> > > > ----- Original Message -----
> > > > > On Wed, 2013-01-30 at 08:35 -0800, Jose Fonseca wrote:
> > > > > > ----- Original Message -----
> > > > 
> > > > Is this feasible, or are there APIs that need (i.e., require) to
> > > > handle both LE/BE formats?
> > > 
> > > Not sure, but my impression has been that APIs tend to prefer the CPU
> > > native byte order. Anything else makes little sense from an
> > > application
> > > POV. Still, I wouldn't be surprised if there were exceptions, e.g.
> > > with
> > > image/video APIs related to fixed file formats.
> > 
> > I'm not an expert on video, but if video formats are exceptional
> > we could treat them exceptionally (e.g., they are always defined in
> > LE). I also suspect that certain compressed formats like DXTn are
> > always defined in LE terms.
> 
> Quite possibly.
>  
> > > > (Or hardware only capable of LE formats?)
> > > 
> > > Unfortunately, our Southern Islands GPUs no longer have facilities
> > > for
> > > byte-swapping vertex / texture data on the fly.
> > 
> > Hmm.. forcing such drivers to byte-swap internally is a tall order (it
> > would be easier if state trackers could do it for all such hardware),
> > so this kills the gallium == native idea...
> 
> I don't think it's quite that bad, as Alex pointed out.
> 
> 
> > > > If not, would it be feasible to byte-swap at state tracker level?
> > > 
> > > That should certainly be feasible for texture data, as that generally
> > > involves at least one copy anyway. However, it might hurt for
> > > streaming
> > > vertex data. Also, we might have to be careful not to require double
> > > byte-swapping in cases where simple copies would work or no copies
> > > would
> > > be necessary in the first place.
> > 
> > We could consider a PIPE_CAP_BYTE_ORDER, with two values: NATIVE, or LE.
> > 
> > Drivers for hardware with LE/BE would advertise NATIVE and implement
> > it trivially.
> > 
> > For drivers with LE hardware, the state tracker would be responsible
> > for byteswapping when platform is BE (or just invert swizzles for
> > RG8/RGBA8 formats). (With the exception of any video/compressed
> > formats, which are only defined in a given byte order.) State trackers
> > would also disable user pointers and always create VBOs/IBOs.
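[Editor's note: the swizzle-inversion shortcut mentioned above works because, for formats built purely of 8-bit components, byte-swapping a packed texel and reversing the component order are the same operation. A minimal sketch; the helper name is illustrative, not from the Mesa tree:]

```c
#include <stdint.h>

/* For a packed 32-bit texel made of four 8-bit components, swapping
 * the bytes is equivalent to reversing the component order, e.g.
 * R8G8B8A8 <-> A8B8G8R8. Illustrative helper, not real Mesa code. */
static uint32_t
swap_bytes_32(uint32_t v)
{
   return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
          ((v << 8) & 0x00ff0000u) | (v << 24);
}
```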
> > 
> > Byteswapping can be done in a fairly generic way independent of the
> > format -- one just needs to know the swap unit (1, 2, 4, or 8 bytes)
> > for the format, and the strides. That is, we could have a helper
> > util_copy_and_swap_rect.
> > 
> > It's a bit hackish, but it seems a good compromise.
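[Editor's note: such a helper could look roughly like this. The name util_copy_and_swap_rect comes from the mail, but the signature and implementation here are assumptions, not the eventual Mesa code:]

```c
#include <stdint.h>

/* Copy a rect while swapping bytes within each swap_unit-sized word.
 * swap_unit is 1, 2, 4, or 8 bytes; width_bytes must be a multiple of
 * swap_unit; strides are in bytes. Sketch only -- the signature and
 * behaviour are assumptions based on the description in the mail. */
static void
util_copy_and_swap_rect(uint8_t *dst, unsigned dst_stride,
                        const uint8_t *src, unsigned src_stride,
                        unsigned width_bytes, unsigned height,
                        unsigned swap_unit)
{
   for (unsigned y = 0; y < height; y++) {
      const uint8_t *s = src + y * src_stride;
      uint8_t *d = dst + y * dst_stride;
      for (unsigned x = 0; x < width_bytes; x += swap_unit)
         for (unsigned b = 0; b < swap_unit; b++)
            d[x + b] = s[x + swap_unit - 1 - b];
   }
}
```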
> 
> I'm afraid I don't like it very much. Non-exhaustive list of reasons:
> 
>       * The cap is a separate information channel, so the formats aren't
>         fully self-descriptive. I fear this would be prone to some code
>         not taking the cap into account properly.

Good point.

>       * The cap is all or nothing, but drivers may be able to take care
>         of byte swapping basically for free for some formats but not for
>         others.
> 
> > I agree that having _LE/_BE format variants would be cleaner, but also
> > a lot more work: update helpers to handle both le/be variants, update
> > the state trackers to use all le/be variants _and_ byte swap when
> > API/HW features mismatch...
> 
> Right. After thinking about this a bit more, how about something like
> this:
> 
> Define the packing to be in the host byte order. However, do not define
> array formats as packed values (which makes little sense e.g. for
> *32*32*32*32 anyway) but really just as arrays. 16- or 32-bit components
> of arrays are again packed in host byte order. Array components of 4, 2
> or 1 bits occupy bits [0,n-1] of byte 0, then bits [n,2n-1] of byte
> 0, ..., bits [0,n-1] of byte 1, and so on.

Sounds great to me.
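[Editor's note: under that packing rule, fetching component i of an n-bit array format is byte-order independent. A hypothetical accessor, for illustration only, might read:]

```c
#include <stdint.h>

/* Per the rule above: component i of an n-bit array format
 * (n = 1, 2, or 4) occupies bits [(i*n) % 8, (i*n) % 8 + n - 1] of
 * byte (i*n) / 8, regardless of host byte order. Hypothetical
 * accessor, not a real Mesa function. */
static unsigned
get_subbyte_component(const uint8_t *buf, unsigned i, unsigned n)
{
   unsigned bit = i * n;
   return (buf[bit / 8] >> (bit % 8)) & ((1u << n) - 1);
}
```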

> To make the distinction between packed and array formats clear, they
> should probably use different naming schemes. The current scheme makes
> more sense for packed formats I think, so I'd propose separating array
> components e.g. by an additional underscore, and ordering components
> from lower memory address left to higher address right. As an example,
> PIPE_FORMAT_R8G8_UNORM would become PIPE_FORMAT_R8_G8_UNORM. It would
> probably make sense to define packed naming aliases for array formats
> where possible, e.g. PIPE_FORMAT_R8_G8_UNORM would alias to
> PIPE_FORMAT_R8G8_UNORM (which would be identical to MESA_FORMAT_GR88) on
> little endian and PIPE_FORMAT_G8R8_UNORM (== MESA_FORMAT_RG88) on big
> endian hosts. (That's using the current component ordering for packed
> formats; IMHO the reverse order would actually make more sense for
> those, but that might require more invasive changes...)

I prefer your latter suggestion.

> Overall, this implies that e.g. the *8*8*8*8 and *8*8 formats mostly
> remain the same as now. At the same time, it should make the format
> definitions a better match for frontend APIs and software rendering
> drivers on big endian hosts. Also, I think it should allow reconciling,
> with relatively few changes, the official definition of formats with how
> some of the current code already assumes they are defined.
> 
> For drivers which can't use hardware facilities for byte-swapping as
> necessary, there could be utility helpers.
> 
> Does this make any sense? :)

Yes, I think it is very sensible.

Jose
