[Mesa-dev] Gallium pixel formats on big-endian

Thu Jan 31 07:18:36 PST 2013

----- Original Message -----
> On Don, 2013-01-31 at 02:14 -0800, Jose Fonseca wrote:
> > ----- Original Message -----
> > > On Mit, 2013-01-30 at 08:35 -0800, Jose Fonseca wrote:
> > > > 
> > > > ----- Original Message -----
> > > > > For another example (which I suspect is more relevant for
> > > > > this
> > > > > thread),
> > > > > wouldn't it be nice if the software rendering drivers could
> > > > > directly
> > > > > represent the window system renderbuffer format as a Gallium
> > > > > format
> > > > > in
> > > > > all cases?
> > > > 
> > > > I'm missing your point, could you give an example of where
> > > > that's
> > > > currently not possible?
> > > 
> > > E.g. an XImage of depth 16, where the pixels are generally packed
> > > in
> > > big
> > > endian if the X server runs on a big endian machine. It's
> > > impossible
> > > to
> > > represent that with PIPE_FORMAT_*5*6*5_UNORM packed in little
> > > endian.
> > 
> > I see.
> > 
> > Is this something that could be worked around?
> 
> Basically anything can be worked around somehow, right? :)
> 
> But in this example, it seems like it would require some kind of
> sideband information to specify that PIPE_FORMAT_*5*6*5_UNORM
> actually
> has the reversed byte order now, and some layer of the stack to use
> that
> and swap the bytes accordingly. So, extra copies and an extra
> information channel (and possibly a layering violation).
> 
> 
> > > > > I can't help feeling it would be better to treat endianness
> > > > > explicitly
> > > > > rather than implicitly in the format description, so drivers
> > > > > and
> > > > > state
> > > > > trackers could choose to use little/big/native/foreign endian
> > > > > formats
> > > > > as
> > > > > appropriate for the hardware and APIs they're dealing with.
> > > > 
> > > > What you mean by explicitly vs implicitly? Do you mean
> > > > r5g6b5_be,
> > > > r5g6b5_le, r32g32b32a32_unorm_le, r32g32b32a32_unorm_be, etc?
> > > 
> > > Yeah, something like that, with the byte order only applying
> > > within
> > > each
> > > component for array formats.
> > 
> > I don't oppose that. But it does seem a lot of work.
> 
> I'm afraid so.
> 
> > How would hardware drivers handle this? Specially those that have a
> > single LE/BE bit to choose?
> 
> I guess drivers would advertise the formats they can and want to
> support
> given the hardware capabilities and target platforms. For drivers
> which
> only have to worry about little endian environments, basically
> nothing
> should change except for the format names and maybe other similar
> details.
> 
> 
> > (BTW, I do believe we should unify Mesa format handling and
> > Gallium's
> > u_format module into a shared external helper library for formats
> > before we venture into that though as the effort of doing that
> > would
> > pretty much double.
> 
> That might be a good idea. The Mesa format code seems to have grown
> some
> warts of its own anyway.
> 
> 
> > I think it is also worth considering the other extreme: all formats
> > are expected to be LE on LE platforms, BE on BE platforms.
> 
> Right. I think that might be preferable over LE always, if we decide
> not
> to support both LE/BE explicitly.
> 
> > Is this feasible, or are there APIs that need (i.e, require) to
> > handle
> > both LE/BE formats?
> 
> Not sure, but my impression has been that APIs tend to prefer the CPU
> native byte order. Anything else makes little sense from an
> application
> POV. Still, I wouldn't be surprised if there were exceptions, e.g.
> with
> image/video APIs related to fixed file formats.

I'm not an expert on video, but if video formats are exceptional we could treat them exceptionally (e.g., they are always defined in LE). I also suspect that certain compressed formats like DXTn are always defined in LE terms.

> > (Or hardware only capable of LE formats?)
> 
> Unfortunately, our Southern Islands GPUs no longer have facilities
> for
> byte-swapping vertex / texture data on the fly.

Hmm... forcing such drivers to byte-swap internally is a tall order (it would be easier if state trackers could do it for all such hardware), so this kills the gallium == native idea...

> > If not, would it be feasible to byte-swap at state tracker level?
> 
> That should certainly be feasible for texture data, as that generally
> involves at least one copy anyway. However, it might hurt for
> streaming
> vertex data. Also, we might have to be careful not to require double
> byte-swapping in cases where simple copies would work or no copies
> would
> be necessary in the first place.

We could consider a PIPE_CAP_BYTE_ORDER, with two values: NATIVE, or LE. 

Drivers for hardware that can handle both LE and BE would advertise NATIVE and implement it trivially.

For drivers with LE-only hardware, the state tracker would be responsible for byte-swapping when the platform is BE (or just inverting swizzles for RG8/RGBA8 formats), with the exception of any video/compressed formats that are only defined in a given byte order. State trackers would also disable user pointers and always create VBOs/IBOs.
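
To make the proposal concrete, here is a rough sketch of the decision a state tracker would take. None of these names exist in Gallium today: the enum stands in for the two proposed PIPE_CAP_BYTE_ORDER values, and the st_* helpers are purely hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for the proposed PIPE_CAP_BYTE_ORDER query. */
enum byte_order_cap {
   BYTE_ORDER_CAP_NATIVE, /* hardware follows the CPU byte order */
   BYTE_ORDER_CAP_LE      /* hardware only understands little-endian data */
};

/* Detect the CPU byte order at run time by inspecting the first byte. */
static bool
cpu_is_big_endian(void)
{
   const uint16_t probe = 0x0102;
   return *(const uint8_t *)&probe == 0x01;
}

/* The state tracker only has to swap for LE-only hardware on a BE CPU. */
static bool
st_needs_byteswap(enum byte_order_cap cap, bool big_endian_cpu)
{
   return cap == BYTE_ORDER_CAP_LE && big_endian_cpu;
}

/* User pointers cannot be passed through when the data must be swapped,
 * so in that case the state tracker would always create VBOs/IBOs. */
static bool
st_allows_user_pointers(enum byte_order_cap cap, bool big_endian_cpu)
{
   return !st_needs_byteswap(cap, big_endian_cpu);
}
```

On LE platforms both caps behave identically, so nothing would change there.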

Byte-swapping can be done in a fairly generic way independent of the format -- one just needs to know the swap unit (1, 2, 4, or 8 bytes) for the format, and the strides. That is, we could have a helper util_copy_and_swap_rect.
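
A minimal sketch of what such a helper might look like follows. The signature is my assumption (no such function exists in u_format yet); it takes sizes in bytes and leaves it to the caller to look up the swap unit for the format.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy `height` rows of `row_bytes` bytes each,
 * reversing the bytes inside every `swap_unit`-sized group.
 * Returns 0 on success, -1 if the arguments are inconsistent.
 */
static int
util_copy_and_swap_rect(uint8_t *dst, size_t dst_stride,
                        const uint8_t *src, size_t src_stride,
                        size_t row_bytes, size_t height,
                        size_t swap_unit)
{
   if (swap_unit != 1 && swap_unit != 2 && swap_unit != 4 && swap_unit != 8)
      return -1;
   if (row_bytes % swap_unit != 0)
      return -1;

   for (size_t y = 0; y < height; y++) {
      const uint8_t *s = src + y * src_stride;
      uint8_t *d = dst + y * dst_stride;

      /* Walk the row one swap unit at a time and mirror its bytes. */
      for (size_t x = 0; x < row_bytes; x += swap_unit) {
         for (size_t b = 0; b < swap_unit; b++)
            d[x + b] = s[x + swap_unit - 1 - b];
      }
   }
   return 0;
}
```

With this, a packed 16-bit format such as R5G6B5 would use a swap unit of 2, an array format with 32-bit components would use 4, and 8-bit-per-component formats would use 1 (a plain copy). The source and destination must not overlap.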

It's a bit hackish, but it seems a good compromise.

I agree that having _LE/_BE format variants would be cleaner, but also a lot more work: update helpers to handle both le/be variants, update the state trackers to use all le/be variants _and_ byte swap when API/HW features mismatch...

But I'm happy to let those that will do this work decide.

Jose