[Mesa-dev] [PATCH v2 04/11] gallium: add endian_format field to struct pipe_resource

Thu Apr 14 15:08:02 UTC 2016

On Thu, Apr 14, 2016 at 5:47 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> On Thu, Apr 14, 2016 at 10:34 AM, Oded Gabbay <oded.gabbay at gmail.com> wrote:
>> On Thu, Apr 14, 2016 at 5:01 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>> This feels like an incredibly confused interface to me...
>> I agree it's not the prettiest design I made, but considering
>> big-endian is a class F citizen, then that's the best I got right now.
>> I'm open to suggestions, but please guys don't tell me "that's not
>> good but we have no idea how to solve it otherwise". It's not
>> constructive.
>
> Definitely didn't want to do that. Just want to create something
> that'll allow handling of all the various cases.
>
I know you mean well, I just vented some frustration :( Sorry.

>>
>>>> you talk
>>> about CPU <-> GPU, but I think that's a misnomer. What happens if you
>>> use up too much VRAM and something gets evicted to GART? Does it get
>>> byteswapped to the CPU-endianness?
>> Good point, I need to think about that. If that happens, it may
>> require a complementary fix in kernel driver. However, it's not
>> mutually exclusive to this patch-set.
>>
>>> Are you treating everything as
>>> 32-bit packed integers implicitly? What if you have, e.g., a R8 format
>>> (without the GBA bits)? Or even worse, a R16 format, where cpu bytes 0
>>> 1 2 3 would have to come out as 1 0 3 2.
>>
>> I don't think I have the problem you describe. I'm not treating
>> anything in an implicit matter. Each format get's its own handling.
>> Please remember that r600 AMD GPUs *do support* big-endian. So, I
>> configure them to know the color format (R8, R16, L4A4, ...), and I
>> configure endian (8 in 16 swap, 8 in 32 swap, ...) and also I can
>> configure swizzling. Then, the GPU handles all the swapping.
>> Therefore, if it supports a certain format, it knows how to swap it
>> correctly.
>
> Right. And NVIDIA has some of that too. And that's where a LOT of the
> confusion comes from. (At least for me.) Lacking docs doesn't help
> either =/
>
>>
>>>
>>> This seems like it might be enough for GL 1.3, but this sort of
>>> interface has to account for everything, IMO.
>>>
>> Well, for GL 3.3 with r600g, the piglit run on big-endian get's stuck
>> every time, after ~300 tests, even without my patches, so for sure its
>> broken. I just followed Marek's advise on how to solve it, and I
>> believe I reached a certain point where it is worth merging it.
>>
>>> I was also struggling with such issues when getting the nv30 driver to
>>> work on big endian. This was compounded by the issue that the NVIDIA
>>> GPUs have a kinda-sorta-but-not-really big-endian mode which swaps
>>> some things but not others. Basically you need to intelligently be
>>> able to say what "endian" the data is in where, and in order to do
>>> that, you need to know what format it's in.
>> That's what I'm trying to do here.
>>
>>  And that's where we hit
>>> some fails, because all the transfer_map/unmap stuff is formatless.
>>> And so there's no way to know whether you should be swapping or not.
>>
>> I tried to find a common ground for different formats, according to
>> how mesa/gallium treats them, and I believe I managed to do it to some
>> degree of success. I'm not saying its simple. There are different
>> paths in the code for the same action, and they behave differently.
>> e.g. in st_ReadPixels, you can read it using H/W blit or through
>> map+memcpy+unpack (fallback mode). So, for the same texture, it
>> depends which path you go through. In one path, I need to configure LE
>> and in the other path I need to configure BE.
>
> Wouldn't it make more sense to handle such issues in transfer_map?
> (i.e. create a staging memory area, and decode into it)? This assumes
> that the transfer_map() call has enough information to "do the right
> thing". I don't think it does today, but perhaps it could be taught?
It doesn't have all the info today, that's for sure. I imagine though
we can add parameters to it.

> That way everything that's in a pipe_resource is in some
> tightly-controlled format, and we specify the LE <-> BE parameters
> when converting between CPU-read/written and GPU-read/written data. I
> believe this is a better match for what's really happening, too. What
> do you think?
>
>   -ilia

Unless I'm missing something, I think, at the end of the day, it will
be the same issues as in my solution - per code path per format is a
different case. That's because you will still need to "teach"
transfer_map, per each transfer per format what to do. So one will
need to go and debug every single code path there is in mesa for
drawing/copying/reading/textures/etc., like what I did in the last 1.5
months. It's a great learning experience but it won't give anything
generic.

Again, for example, in st_ReadPixels, I imagine you will need to give
"different orders" to transfer_map for the two different scenarios -
H/W blit and fallback. So what's the gain here ?

If I'm missing something, please tell me.

    Oded