[RFC] drm/amdgpu: Add macros and documentation for format modifiers.

Tue Sep 4 10:04:43 UTC 2018

On Tue, Sep 4, 2018 at 3:00 AM, Bas Nieuwenhuizen
<bas at basnieuwenhuizen.nl> wrote:
> This is an initial proposal for format modifiers for AMD hardware.
>
> It uses 48 bits including a chip generation, leaving 8 bits for
> a format version number.
>
> On gfx6-gfx8 we have all the fields influencing sample locations
> in memory.
>
> Tile split bytes are optional for single sample buffers as no
> hardware reaches the split size with 1 sample and hence the actual
> size does not matter.
>
> The macrotile fields are duplicated for images with multiple planes.
> If the planes have different bitdepth they need different macro
> tile fields and different tile split bytes if multisample.
>
> I could not fit multiple copies in for tile split bytes, but
> multisample & multiplane images are very rare. Overall, I think
> we should punt on multisample for a later format version since
> they are generally not shared on any modifier aware windowing
> system, and we have more issues like fmask & cmask.
>
> We need these copies because the drm modifier of all planes in an
> image needs to be equal, so we need to fit these together.
>
> This adds fields for compression support, using metadata that is
> compatible with AMDVLK and for which radv and radeonsi can
> reasonably be extended.
>
> The big open question for compression is between which generations
> the format changed to see if we can share more.
>
> This explicitly does not try to solve the linear stride alignment
> issue, thoguh we could internally just use the tiling modes for
> the linear modes to indicate linear images with the stride for the
> given chip.
>
> Signed-off-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
> CC: Chad Versace <chadversary at chromium.org>
> CC: Dave Airlie <airlied at redhat.com>
> CC: Marek Olšák <marek.olsak at amd.com>
> CC: Nicolai Hähnle <nicolai.haehnle at amd.com>
> CC: Alex Deucher <alexander.deucher at amd.com>
> CC: Daniel Vetter <daniel.vetter at ffwll.ch>
> ---
>  include/uapi/drm/amdgpu_drm.h | 130 ++++++++++++++++++++++++++++++++++
>  1 file changed, 130 insertions(+)
>
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index 94444eeba55b..4e1452161dbf 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -990,6 +990,136 @@ struct drm_amdgpu_info_vce_clock_table {
>  #define AMDGPU_FAMILY_AI                       141 /* Vega10 */
>  #define AMDGPU_FAMILY_RV                       142 /* Raven */
>
> +#define AMDGPU_CHIP_TAHITI     0
> +#define AMDGPU_CHIP_PITCAIRN   1
> +#define AMDGPU_CHIP_VERDE      2
> +#define AMDGPU_CHIP_OLAND      3
> +#define AMDGPU_CHIP_HAINAN     4
> +#define AMDGPU_CHIP_BONAIRE    5
> +#define AMDGPU_CHIP_KAVERI     6
> +#define AMDGPU_CHIP_KABINI     7
> +#define AMDGPU_CHIP_HAWAII     8
> +#define AMDGPU_CHIP_MULLINS    9
> +#define AMDGPU_CHIP_TOPAZ      10
> +#define AMDGPU_CHIP_TONGA      11
> +#define AMDGPU_CHIP_FIJI       12
> +#define AMDGPU_CHIP_CARRIZO    13
> +#define AMDGPU_CHIP_STONEY     14
> +#define AMDGPU_CHIP_POLARIS10  15
> +#define AMDGPU_CHIP_POLARIS11  16
> +#define AMDGPU_CHIP_POLARIS12  17
> +#define AMDGPU_CHIP_VEGAM      18
> +#define AMDGPU_CHIP_VEGA10     19
> +#define AMDGPU_CHIP_VEGA12     20
> +#define AMDGPU_CHIP_VEGA20     21
> +#define AMDGPU_CHIP_RAVEN      22
> +
> +/* Format for GFX6-GFX8 DRM format modifiers. These are intentionally the same
> + * as AMDGPU_TILING_*. However, the the rules as to when to set them are
> + * different.
> + *
> + * Do not use linear ARRAY_MODEs or SWIZZLE_MODEs. Use DRM_FORMAT_MOD_LINEAR
> + * instead.
> + *
> + * If ARRAY_MODE is 1D, only the micro tile mode and the pipe config should be
> + * set.
> + *
> + * For other ARRAY_MODEs:
> + *  - Only set TILE_SPLIT if the image is multisample.
> + *
> + * We have 1 extra bit for the micro tile mode, as GFX6 and GFX7+ have 1
> + * different value there. The values are
> + *   - depth           : 0
> + *   - displayable     : 1
> + *   - thin            : 2
> + *   - thick (GFX6)    : 3
> + *   - rotated (GFX7+) : 4
> + *
> + * TODO: What to do with multisample multi plane images? More tile split
> + * fields don't fit if we want to keep a few bits for a format version.
> + */
> +#define AMDGPU_MODIFIER_GFX8_ARRAY_MODE_SHIFT          0
> +#define AMDGPU_MODIFIER_GFX8_ARRAY_MODE_MASK           0xf
> +#define AMDGPU_MODIFIER_GFX8_PIPE_CONFIG_SHIFT         4
> +#define AMDGPU_MODIFIER_GFX8_PIPE_CONFIG_MASK          0x1f
> +#define AMDGPU_MODIFIER_GFX8_TILE_SPLIT_SHIFT          9
> +#define AMDGPU_MODIFIER_GFX8_TILE_SPLIT_MASK           0x7
> +#define AMDGPU_MODIFIER_GFX8_MICRO_TILE_MODE_SHIFT     12
> +#define AMDGPU_MODIFIER_GFX8_MICRO_TILE_MODE_MASK      0x7
> +#define AMDGPU_MODIFIER_GFX8_BANK_WIDTH_SHIFT          15
> +#define AMDGPU_MODIFIER_GFX8_BANK_WIDTH_MASK           0x3
> +#define AMDGPU_MODIFIER_GFX8_BANK_HEIGHT_SHIFT         17
> +#define AMDGPU_MODIFIER_GFX8_BANK_HEIGHT_MASK          0x3
> +#define AMDGPU_MODIFIER_GFX8_MACRO_TILE_ASPECT_SHIFT   19
> +#define AMDGPU_MODIFIER_GFX8_MACRO_TILE_ASPECT_MASK    0x3
> +#define AMDGPU_MODIFIER_GFX8_NUM_BANKS_SHIFT           21
> +#define AMDGPU_MODIFIER_GFX8_NUM_BANKS_MASK            0x3
> +
> +/* Macrotile parameters for a second plane if existing */
> +#define AMDGPU_MODIFIER_GFX8_BANK_WIDTH_1_SHIFT                23
> +#define AMDGPU_MODIFIER_GFX8_BANK_WIDTH_1_MASK         0x3
> +#define AMDGPU_MODIFIER_GFX8_BANK_HEIGHT_1_SHIFT       25
> +#define AMDGPU_MODIFIER_GFX8_BANK_HEIGHT_1_MASK                0x3
> +#define AMDGPU_MODIFIER_GFX8_MACRO_TILE_ASPECT_1_SHIFT 27
> +#define AMDGPU_MODIFIER_GFX8_MACRO_TILE_ASPECT_1_MASK  0x3
> +#define AMDGPU_MODIFIER_GFX8_NUM_BANKS_1_SHIFT         29
> +#define AMDGPU_MODIFIER_GFX8_NUM_BANKS_1_MASK          0x3
> +
> +/* Macrotile parameters for a third plane if existing */
> +#define AMDGPU_MODIFIER_GFX8_BANK_WIDTH_2_SHIFT                31
> +#define AMDGPU_MODIFIER_GFX8_BANK_WIDTH_2_MASK         0x3
> +#define AMDGPU_MODIFIER_GFX8_BANK_HEIGHT_2_SHIFT       33
> +#define AMDGPU_MODIFIER_GFX8_BANK_HEIGHT_2_MASK                0x3
> +#define AMDGPU_MODIFIER_GFX8_MACRO_TILE_ASPECT_2_SHIFT 35
> +#define AMDGPU_MODIFIER_GFX8_MACRO_TILE_ASPECT_2_MASK  0x3
> +#define AMDGPU_MODIFIER_GFX8_NUM_BANKS_2_SHIFT         37
> +#define AMDGPU_MODIFIER_GFX8_NUM_BANKS_2_MASK          0x3
> +
> +#define AMDGPU_MODIFIER_GFX9_SWIZZLE_MODE_SHIFT                0
> +#define AMDGPU_MODIFIER_GFX9_SWIZZLE_MODE_MASK         0x1f
> +
> +/* Whether to enable DCC compression.
> + *
> + * If enabled, exporting the surface results in three
> + * planes:
> + *   - color data
> + *   - DCC data
> + *   - a 64-byte block with
> + *     - a 16 byte 0/1 bool as to whether the surface is currently DCC compressed.
> + *     - a 16-byte 0/1 bool as to whether the surface has fastclear data
> + *     - a 8-byte chunk with the current fastclear colors
> + *
> + * To ensure we do not keep compressing and decompressing the surface, once it
> + * has been decompressed no party may recompress again.
> + *
> + * Applications should not hand over images with fastclear data as not
> + * all users can support it, however, to help both Vulkan implementations
> + * with the allocation we keep it in the 64-byte block.
> + *
> + * TODO: Can scanout really not support fastclear data?
> + * TODO: What to do with multiplane images?
> + */
> +#define AMDGPU_MODIFIER_COMPRESSION_SHIFT              39
> +#define AMDGPU_MODIFIER_COMPRESSION_MASK               0x1

Summary of the irc discussion with my 2 recommendations:

- Make it clear in the spec here that all consumers must follow the 16
byte boolean to control actual DCC usage (using a conditional register
write in the command buffer). On intel we've put that into the
modifier itself, along the lines of "I don't like compressed anymore,
pls give me something else". Not that runtime renogiation works well
(or well, at all).

- If you have consumers which can't handle fastclear, make it a
separate modifier.

> +
> +/* The chip this is compatible with.
> + *
> + * If compression is disabled, use
> + *   - AMDGPU_CHIP_TAHITI for GFX6-GFX8
> + *   - AMDGPU_CHIP_VEGA10 for GFX9+
> + *
> + * With compression enabled please use the exact chip.
> + *
> + * TODO: Do some generations share DCC format?
> + */
> +#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT                 40
> +#define AMDGPU_MODIFIER_CHIP_GEN_MASK                  0xff

Do you really need all the combinations here of DCC + gpu gen + tiling
details? When we had the entire discussion with nvidia folks they
eventually agreed that they don't need the massive pile with every
possible combination. Do you really plan to share all these different
things?

Note that e.g. on i915 we spec some of the tiling depending upon
buffer size and buffer format (because that's how the hw works), not
using explicit modifier flags for everything.
-Daniel

> +
> +#define AMDGPU_MODIFIER_SET(field, value) \
> +       (((__u64)(value) & AMDGPU_MODIFIER_##field##_MASK) << AMDGPU_MODIFIER_##field##_SHIFT)
> +#define AMDGPU_MODIFIER_GET(value, field) \
> +       (((__u64)(value) >> AMDGPU_MODIFIER_##field##_SHIFT) & AMDGPU_MODIFIER_##field##_MASK)
> +
>  /*
>   * Definition of free sync enter and exit signals
>   * We may have more options in the future
> --
> 2.18.0
>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch