[Mesa-dev] [PATCH v2 2/2] mesa: _mesa_format_convert should be endian agnostic

Jason Ekstrand jason at jlekstrand.net
Sat Aug 8 13:26:23 PDT 2015


On Sat, Aug 8, 2015 at 1:01 PM, Oded Gabbay <oded.gabbay at gmail.com> wrote:
> On Sat, Aug 8, 2015 at 7:34 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> On Fri, Aug 7, 2015 at 10:38 PM, Oded Gabbay <oded.gabbay at gmail.com> wrote:
>>> On Sat, Aug 8, 2015 at 3:11 AM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>>>> On Fri, Aug 7, 2015 at 12:24 PM, Oded Gabbay <oded.gabbay at gmail.com> wrote:
>>>>> This patch fixes a bug that is manifested in the read path of mesa when
>>>>> running on big-endian machines. The effects can be seen when running
>>>>> piglit sanity test and/or taking a screen capture.
>>>>
>>>> piglit sanity isn't all that convincing.  It's quite possible that
>>>> there are two byte-swapping bugs that just happen to cancel out.  If
>>>> it fixes huge numbers of piglit tests, that's more convincing.
>>>>
>>>
>>> Well, without it about 65% of piglit tests fails. With this fix, about
>>> 30% fails. I would say that accounts for something...
>>
>> Yes, that counts for far more than just sanity.  It would be
>> instructive to see the piglit diff between 2ec8718d, 8ec6534b2, and
>> 5038d839b.  Those three patches are sequential and the one in the
>> middle, uses _mesa_format_convert for texture upload.
>>
>>>>> The bug is caused when _mesa_format_convert receives src_format as
>>>>> mesa_format, which it thens changes to mesa_array_format. During this
>>>>> change, it checks for endianness and swaps the bytes accordingly.
>>>>> However, because the bytes are _already_ swapped in the memory itself
>>>>> (being written there by llvmpipe/softpipe/sw rast), and src_format
>>>>> value matches the _actual_ contents of the memory, the result of the
>>>>> read is wrong.
>>>>>
>>>>> Therefore, because other layers/functions, such as llvm or
>>>>> read_rgba_pixels() takes care whether we are dealing with big-endian or
>>>>> little-endian, _mesa_format_convert should be endian agnostic to avoid
>>>>> duplicate swapping of bytes.
>>>>
>>>> As far as I know, the core mesa format conversion code has handled
>>>> byte-swapping for a very long time.  it is *not* simply punted to the
>>>> driver.  I'm 100% ready to believe that there is a problem but it's
>>>> probably far more subtle than just removing support for byte-swapping
>>>> in _mesa_format_to_array_format.
>>>
>>> First of all, this bug has been in a very long time as well (I tracked
>>> it to between 10.4 to 10.5). It has shown up since mesa started to use
>>> _mesa_format_convert
>>> From my debugging, I see that llvmpipe (and other) write the data to
>>> memory in BE format. e.g if the format is RGBA, then the data is
>>> actually written in AGBR format.
>>> Now, in _mesa_format_convert you get the src_format as AGBR (already
>>> swapped). However, because it arrives as format and not array_format,
>>> _mesa_format_convert needs to build the array_format in
>>> _mesa_format_to_array_format()
>>>
>>> In _mesa_format_to_array_format, after the array_format is created,
>>> you swap it if it is BE, so you now have RGBA in array_format
>>> representing the source. And that is the bug!
>>
>> Not quite... MESA_FORMAT formats are specified one of two ways: array
>> and packed.  Packed formats are specified in terms of low bits and
>> high bits in the word while array formats are specified in terms of an
>> array of elements of a certain type.  For the most part, packed is
>> used for things like 565 which obviously aren't array formats.
>> However, packed is also used for the normal 8-bit RGBA formats.  Since
>> they are specified in terms of low vs. high bits in a byte, they *do*
>> need to be swapped when going to an array format.  There was, indeed a
>> bug in there: we were swapping even if the MESA_FORMAT was already an
>> array format.  I just sent out a patch to fix that bug.
>>
>> It would also be instructive to know what the actual enums looked like
>> and what the data that you're refering to looked like.  The enum
>> values are described in mesa/main/formats.h.
>>
>> It's quite possible that, in all the BE cases, we aren't actually
>> treating the MESA_FORMATs as described above.  If that is the case,
>> then we should figure out what they do with the formats and either
>> change the BE cases or redefine the MESA_FORMATs in such a way that it
>> makes more sense on BE hardware.  However, as it currently stands
>> _mesa_format_to_array_format (as patched by the one I just sent)
>> matches the definitions.
>>
>>> of course I checked that _mesa_format_to_array_format() is only called
>>> from _mesa_format_convert so I can safely modify it according to
>>> _mesa_format_convert usage.
>>>
>>> So I don't think it is "far more subtle" than removing this
>>> byte-swapping. If you still think so, please elaborate.
>>
>> There are many places in the driver where we look at endianness and do
>> something with formats.  Obviously, something changed with 5038d839b;
>> we were taking something into account and aren't now or we weren't
>> then and are now.  What that something was, I'm not sure.
>>
>> Where all could the bug be?  Lots of places.  It could be in generated
>> pack/unpack code but I find that unlikely.  It could be where your
>> patch indicates but that seems to violate the definition of the
>> MESA_FORMATs.  It's possible that piglit is setting LSB_FIRST and
>> that's not getting respected.  If you're looking at gallium swrast or
>> llvmpipe, it could be in st/mesa where we convert from PIPE_FORMAT_*
>> to MESA_FORMAT_*.
>>
>> If you want to try and fix mesa for BE hardware, please do!  As ilia
>> said, most of the core mesa devs don't have BE hardware available.
>> However, that's probably going to be a long road (but possibly quite
>> rewarding).  I'm sure there are a *lot* of places where we get it
>> wrong and readpixels is just the start.
>>
>> --Jason
>>
> From my comparison of software rendering paths between ppc64, ppc64le
> and x86_64 using piglit gpu.py (software rendering only) as reference,
> ppc64 (BE) is definitely quite broken. Around 2000 tests are failing,
> vs. around 500 on ppc64le and 100 on x86_64
>
> I think my patch is a small step in a long road, but a step in the
> right direction nonetheless. I'm more focused on ppc64le but if I will
> be able to fix things as well for ppc64, I will certainly do it.
>
> Regarding the technical side, I looked at this problem from different
> angles. There maybe some mismatch here between mesa core and
> llvmpipe/softpipe that cause this problem (and others possibly), but
> in my opinion, the fix should be in mesa core for this specific
> problem, because:
> 1. llvmpipe/softpipe write the bytes *correctly* to the memory (so
> fixing there might break that)

If it's claiming the format is BGRA and writting as ARGB then that's
not correct.

> 2. _mesa_format_convert, according to its own documentation, should
> not handle byte swapping

That's referring to the GL_SWAP_BYTES parameter in glPixelStorei.  It
*does*, however, properly handle going from one mesa format or array
format (as defined in formats.h) to another regardless of endianness.
Modulo the patch I sent this morning, it does exactly what it claims
to do.

> Regarding the enums, _mesa_format_convert gets
> MESA_FORMAT_A8B8G8R8_UNORM, but the "real" format is actually RGBA.
> However, because its BE, the data is written in memory in format of
> ABGR. The entity that changes RGBA to ABGR is llvmpipe in st_format.c,
> when converting pipe to mesa format. notice the distinction between BE
> and LE in p_format.h

The way that gallium defines its formats is a bit strange.  Basically
it comes down to "follow C structure packing rules".  Doing a direct
PIPE_FORMAT to MESA_FORMAT conversion as is done in st_formats.c is
wrong and probably always has been wrong.  (The way mesa defines
formats didn't change in the format conversion series.)  The only
reason why no one has noticed up until this point is that gallium
hooks and does it's own thing for a lot of the texture upload stuff
and so those may not touch core mesa.  Apparently, gallium drivers
still use core mesa for readpixels.  It sounds to me like readpixels
was actually fixed by the patch in question but, because gallium was
passing mesa the wrong formats, it works before and doesn't now.

My recommendation would actually be to change st_formats.c to use the
A8R8G8B8 forms instead of the ARGB8888 forms for formats that mesa
considers "packed".  The former actually matches the definition in
mesa because it's specified in terms of low bits to high bits like
mesa does.

--Jason

>     Oded
>
>
>
>>> Thanks,
>>>          Oded
>>>
>>>>
>>>>> btw, it is also mentioned in the comment of the function that it doesn't
>>>>> handle byte-swapping, so the original code contradicts the documentation.
>>>>>
>>>>> Signed-off-by: Oded Gabbay <oded.gabbay at gmail.com>
>>>>> CC: "10.5 10.6" <mesa-stable at lists.freedesktop.org>
>>>>> Signed-off-by: Oded Gabbay <oded.gabbay at gmail.com>
>>>>> ---
>>>>>  src/mesa/main/formats.c | 6 ++----
>>>>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/src/mesa/main/formats.c b/src/mesa/main/formats.c
>>>>> index baeb1bf..7a41cb0 100644
>>>>> --- a/src/mesa/main/formats.c
>>>>> +++ b/src/mesa/main/formats.c
>>>>> @@ -372,10 +372,8 @@ uint32_t
>>>>>  _mesa_format_to_array_format(mesa_format format)
>>>>>  {
>>>>>     const struct gl_format_info *info = _mesa_get_format_info(format);
>>>>> -   if (_mesa_little_endian())
>>>>> -      return info->ArrayFormat;
>>>>> -   else
>>>>> -      return _mesa_array_format_flip_channels(info->ArrayFormat);
>>>>> +
>>>>> +   return info->ArrayFormat;
>>>>>  }
>>>>>
>>>>>  static struct hash_table *format_array_format_table;
>>>>> --
>>>>> 2.4.3
>>>>>
>>>>> _______________________________________________
>>>>> mesa-dev mailing list
>>>>> mesa-dev at lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list