[Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

Thu Feb 22 21:16:52 UTC 2018

On Thu, Feb 22, 2018 at 1:49 PM, Bas Nieuwenhuizen
<bas at basnieuwenhuizen.nl> wrote:
> On Thu, Feb 22, 2018 at 7:04 PM, Kristian Høgsberg <hoegsberg at gmail.com> wrote:
>> On Wed, Feb 21, 2018 at 4:00 PM Alex Deucher <alexdeucher at gmail.com> wrote:
>>
>>> On Wed, Feb 21, 2018 at 1:14 AM, Chad Versace <chadversary at chromium.org> wrote:
>>> > On Thu 21 Dec 2017, Daniel Vetter wrote:
>>> >> On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen <hoegsberg at google.com> wrote:
>>> >>> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico <mvicomoya at nvidia.com> wrote:
>>> >>>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg <hoegsberg at gmail.com> wrote:
>>> >>>>> I'd like to see concrete examples of actual display controllers
>>> >>>>> supporting more format layouts than what can be specified with a 64
>>> >>>>> bit modifier.
>>> >>>>
>>> >>>> The main problem is our tiling and other metadata parameters can't
>>> >>>> generally fit in a modifier, so we find passing a blob of metadata a
>>> >>>> more suitable mechanism.
>>> >>>
>>> >>> I understand that you may have n knobs with a total of more than
>>> >>> 56 bits that configure your tiling/swizzling for color buffers. What
>>> >>> I don't buy is that you need all those combinations when passing
>>> >>> buffers around between codecs, cameras and display controllers. Even
>>> >>> if you're sharing between the same 3D drivers in different processes,
>>> >>> I expect just locking down, say, 64 different combinations (you can
>>> >>> add more over time) and assigning each a modifier would be
>>> >>> sufficient. I doubt you'd extract meaningful performance gains from
>>> >>> going all the way to a blob.
>>> >
>>> > I agree with Kristian above. In my opinion, choosing to encode in
>>> > modifiers a precise description of every possible tiling/compression
>>> > layout is not technically incorrect, but I believe it misses the point.
>>> > The intention behind modifiers is not to exhaustively describe all
>>> > possibilities.
>>> >
>>> > I summarized this opinion in VK_EXT_image_drm_format_modifier,
>>> > where I wrote an "introduction to modifiers" section. Here's an excerpt:
>>> >
>>> >     One goal of modifiers in the Linux ecosystem is to enumerate for
>>> >     each vendor a reasonably sized set of tiling formats that are
>>> >     appropriate for images shared across processes, APIs, and/or
>>> >     devices, where each participating component may possibly be from
>>> >     different vendors. A non-goal is to enumerate all tiling formats
>>> >     supported by all vendors. Some tiling formats used internally by
>>> >     vendors are inappropriate for sharing; no modifiers should be
>>> >     assigned to such tiling formats.
>>
>>> Where it gets tricky is how to select that subset?  Our tiling modes
>>> are defined more by the asic specific constraints than the tiling mode
>>> itself.  At a high level we have basically 3 tiling modes (out of 16
>>> possible) that would be the minimum we'd want to expose for gfx6-8.
>>> gfx9 uses a completely new scheme.
>>> 1. Linear (per asic stride requirements, not usable by many hw blocks)
>>> 2. 1D Thin (5 layouts, displayable, depth, thin, rotated, thick)
>>> 3. 2D Thin (1D tiling constraints, plus pipe config (18 possible),
>>> tile split (7 possible), sample split (4 possible), num banks (4
>>> possible), bank width (4 possible), bank height (4 possible), macro
>>> tile aspect (4 possible) all of which are asic config specific)
>>
>>> I guess we could do something like:
>>> AMD_GFX6_LINEAR_ALIGNED_64B
>>> AMD_GFX6_LINEAR_ALIGNED_256B
>>> AMD_GFX6_LINEAR_ALIGNED_512B
>>> AMD_GFX6_1D_THIN_DISPLAY
>>> AMD_GFX6_1D_THIN_DEPTH
>>> AMD_GFX6_1D_THIN_ROTATED
>>> AMD_GFX6_1D_THIN_THIN
>>> AMD_GFX6_1D_THIN_THICK
>>
>> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>> etc.
>>
>>> We probably only need 40 bits to encode all of the tiling parameters,
>>> so we could do family plus tiling encoding, but that still seems
>>> unwieldy to deal with from an application perspective.  All of the
>>> parameters affect the alignment requirements.
>>
>> We discussed this earlier in the thread, here's what I said:
>>
>> Another point here is that the modifier doesn't need to encode all the
>> things you have to communicate to the HW. For a given width, height, format,
>> compression type and maybe a few other high-level parameters, I'm skeptical
>> that the remaining tile parameters aren't just mechanically derivable using
>> a fixed table or formula. So instead of thinking of the modifiers as
>> something you can just memcpy into a state packet, it identifies a family
>> of configurations - enough information to deterministically derive the full
>> exact configuration. The formula may change, for example for different
>> hardware or if it's determined to not be optimal, and in that case, we can
>> use a new modifier to represent the new formula.
>
> I think this is not so much about being able to dump it in a state
> packet, but about sharing between different GPUs of AMD. We have
> basically only a few interesting tiling modes if you look at a single
> GPU, but checking if those are equal depends on the other bits, which
> may or may not be different per chip for the same conceptual tiling
> mode. We could just put a chip identifier in, but that would preclude
> any sharing while I think we can do some.

Right.  And the 2D ones, while they are the most complicated, are also
the most interesting from a performance perspective so ideally you'd
find a match on one of those.  If you don't expose the 2D modes,
there's not much point in supporting modifiers at all.
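
For a rough sense of scale, here is a minimal sketch (Python, purely
illustrative) that counts the 2D parameter combinations listed earlier in
the thread and packs them into the vendor-defined low 56 bits of a modifier.
The field names, their order, and their widths are hypothetical and do not
correspond to any existing AMD modifier layout; only the convention of a
vendor ID in the top 8 bits and the AMD vendor value 0x02 are taken from
drm_fourcc.h.

```python
from math import prod

# Number of choices per 2D tiling parameter, as listed earlier in the thread.
# Field names, order, and widths are hypothetical; this is not a real layout.
FIELD_CHOICES = {
    "micro_tile": 5,         # display, depth, rotated, thin, thick
    "pipe_config": 18,
    "tile_split": 7,
    "sample_split": 4,
    "num_banks": 4,
    "bank_width": 4,
    "bank_height": 4,
    "macro_tile_aspect": 4,
}

DRM_FORMAT_MOD_VENDOR_AMD = 0x02  # vendor ID from drm_fourcc.h
PAYLOAD_BITS = 56                 # low 56 bits are vendor-defined


def bits_for(choices):
    """Smallest bit width that can hold the values 0..choices-1."""
    return (choices - 1).bit_length()


TOTAL_BITS = sum(bits_for(n) for n in FIELD_CHOICES.values())
TOTAL_COMBINATIONS = prod(FIELD_CHOICES.values())


def pack_modifier(params):
    """Pack one index per field into a 64-bit modifier value."""
    payload, shift = 0, 0
    for name, choices in FIELD_CHOICES.items():
        value = params[name]
        if not 0 <= value < choices:
            raise ValueError(f"{name}={value} out of range 0..{choices - 1}")
        payload |= value << shift
        shift += bits_for(choices)
    return (DRM_FORMAT_MOD_VENDOR_AMD << PAYLOAD_BITS) | payload


def unpack_modifier(modifier):
    """Recover the per-field indices from a packed modifier."""
    params, shift = {}, 0
    for name, choices in FIELD_CHOICES.items():
        width = bits_for(choices)
        params[name] = (modifier >> shift) & ((1 << width) - 1)
        shift += width
    return params
```

With these assumed fields the index space is 645,120 combinations in 21
bits, so the listed parameters alone fit easily; the 40-bit estimate
presumably also covers things not modelled here, such as the asic family.
Two modifiers packed this way compare equal only if every field matches,
which is the property needed for cross-GPU sharing checks.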

Alex