[Mesa-dev] [PATCH v2 04/11] gallium: add endian_format field to struct pipe_resource

Mon Apr 18 14:47:01 UTC 2016

On Thu, Apr 14, 2016 at 6:44 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> On Thu, Apr 14, 2016 at 11:08 AM, Oded Gabbay <oded.gabbay at gmail.com> wrote:
>>> Wouldn't it make more sense to handle such issues in transfer_map?
>>> (i.e. create a staging memory area, and decode into it)? This assumes
>>> that the transfer_map() call has enough information to "do the right
>>> thing". I don't think it does today, but perhaps it could be taught?
>> It doesn't have all the info today, that's for sure. I imagine though
>> we can add parameters to it.
>>
>>> That way everything that's in a pipe_resource is in some
>>> tightly-controlled format, and we specify the LE <-> BE parameters
>>> when converting between CPU-read/written and GPU-read/written data. I
>>> believe this is a better match for what's really happening, too. What
>>> do you think?
>>>
>>>   -ilia
>>
>> Unless I'm missing something, I think, at the end of the day, it will
>> be the same issues as in my solution - per code path per format is a
>> different case. That's because you will still need to "teach"
>> transfer_map, per each transfer per format what to do. So one will
>> need to go and debug every single code path there is in mesa for
>> drawing/copying/reading/textures/etc., like what I did in the last 1.5
>> months. It's a great learning experience but it won't give anything
>> generic.
>>
>> Again, for example, in st_ReadPixels, I imagine you will need to give
>> "different orders" to transfer_map for the two different scenarios -
>> H/W blit and fallback. So what's the gain here ?
>>
>> If I'm missing something, please tell me.
>
> One of us is... let's figure out which one :)
>
> Here's my proposal:
>
> All data stored inside of resources is stored in a driver-happy
> format. The driver ensures that it's stored in proper endianness, etc.
> (Much like it does today wrt proper stride.)
>
> Blitting(/copying) between resources doesn't require any additional
> information, since you have the format(s) of the respective resources,
> and it's all inside the driver, so the driver does whatever it needs
> to do to make it all "work".
>
> *Accessing and modifying* resources (directly) from the CPU is what
> becomes tricky. The state tracker may have incorrect expectations of
> the actual backing data. There are a few different ways to resolve
> this. The one I'm proposing is that you only ever return a pointer to
> the directly underlying data if it matches the CPU's expectations
> (which will only be the case for byte-oriented array formats like
> PIPE_FORMAT_R8G8B8A8_* & co). Everything else, like e.g.
> PIPE_FORMAT_R5G6B5_UNORM and countless others, will have to go through
> a bounce buffer.
>
> At transfer map time, you convert the data from GPU-style to
> CPU-style, and copy back the relevant bits at unmap/flush time.
>
> This presents a nice clean boundary for this stuff. Instead of the
> state tracker trying to guess what the driver will do and feeding it
> endiannesses that it can't possibly guess properly, the tracking logic
> is relegated to the driver, and we extend the interfaces to allow the
> state tracker to access the data in a proper way.
>
> I believe the advantage of this scheme is that beyond adding format
> parameters to pipe_transfer_map() calls, there will not need to be any
> adjustments to the state trackers.
>
> One yet-to-be-resolved issue is what to do about glMapBuffer* - it
> maps a buffer, it's formatless (at map time), and yet the GPU will be
> required to interpret it correctly. We could decree that PIPE_BUFFER
> is just *always* an array of R8_UNORM and thus never needs any type of
> swapping. The driver needs to adjust accordingly to deal with accesses
> that don't fit that pattern (and where parameters can't be fed to the
> GPU to interpret it properly).
>
> I think something like the above will work. And I think it presents a
> cleaner barrier than your proposal, because none of the "this GPU can
> kinda-sorta understand BE, but not everywhere" details are ever
> exposed to the state tracker.
>
> Thoughts?
>
>   -ilia

Ilia,

To make the GPU do a conversion during blitting, I need to configure
registers. This is done in a couple of functions in the r600g driver
(r600_translate_texformat, r600_colorformat_endian_swap,
r600_translate_colorformat and r600_translate_colorswap).

The problem is that transfer_map/unmap don't call directly to those
functions. They call other functions which eventually call those 4
functions. Among those "other" functions, there are several function
calls which are *not* in the r600g driver. i.e. we go back to generic
util functions. For example:

#0  r600_translate_colorformat
#1  evergreen_init_color_surface
#2  evergreen_set_framebuffer_state
#3  util_blitter_custom_depth_stencil
#4  r600_blit_decompress_depth
#5  r600_texture_transfer_map

Am I allowed to now pass information from transfer_map/unmap all the
way down to the 4 functions I mentioned through all these layers as
additional parameters ? I preferred to put it in pipe_resource as that
information goes all the way down to those functions, but if I can't
use that, then what's an acceptable alternative ?

This time, I would like to get an agreement *before* I implement it.

Thanks,

    Oded