[RFC PATCH 0/3] Introducing I915_FORMAT_MOD_4_TILED_XE2_CCS Modifier for Xe2
Thomas Hellström
thomas.hellstrom at linux.intel.com
Tue May 14 16:51:09 UTC 2024
On Tue, 2024-05-14 at 12:25 +0300, Joonas Lahtinen wrote:
> Quoting Kenneth Graunke (2024-05-11 03:58:34)
> > On Tuesday, May 7, 2024 3:56:57 PM PDT Matt Roper wrote:
> > > On Mon, May 06, 2024 at 09:52:35PM +0300, Juha-Pekka Heikkila wrote:
> > > > These patches introduce I915_FORMAT_MOD_4_TILED_XE2_CCS
> > > > modifier, which, from the kernel's perspective, behaves
> > > > similarly to `I915_FORMAT_MOD_4_TILED`. This new modifier is
> > > > primarily intended for user space to effectively monitor
> > > > compression status, especially when dealing with a mix of
> > > > compressed and uncompressed buffers.
> > > >
> > > > The addition of this modifier facilitates user space in
> > > > managing compression status, particularly when utilizing both
> > > > compressed and uncompressed buffers concurrently. To leverage
> > > > compression for these buffers, user space applications must
> > > > configure the appropriate Page Attribute Table (PAT) index.
> > > > Display engine will treat all Tile4 as if it were compressed
> > > > under all circumstances on Xe2 architecture.
> > >
> > > I may have missed some discussion about this, but I thought the
> > > previous
> > > consensus was that we didn't want/need new modifiers for
> > > compression on
> > > Xe2? If a userspace client (or the display hardware) receives a
> > > buffer
> > > of unknown origin and unknown compression status, it's always
> > > fine to
> > > select a compressed PAT when binding the buffer to read since
> > > even for
> > > uncompressed buffers the CCS metadata will accurately reflect the
> > > compression status. Unlike Xe1, where generating content without
> > > compression enabled would leave random garbage in the FlatCCS
> > > area, Xe2
> > > will set the corresponding FlatCCS to '0x0' for each block,
> > > indicating
> > > uncompressed data.
> > >
> > > Can you explain more what the benefit of handling these modifiers
> > > explicitly is?
> > >
> > >
> > > Matt
> >
> > Thanks, Matt! I'm a bit late in getting up to speed with the Xe2
> > compression
> > changes; this is really good information.
> >
> > As I understand it...all blocks on the GPU behave in the way you
> > mentioned,
> > where generating uncompressed data via the GPU will set FlatCCS =
> > 0, so you
> > can assume a compressed PAT entry and everything works.
> >
> > One snag is...I've heard that CPU access doesn't work that way.
> > So, if you
> > mmap a buffer on the CPU, and write data with the CPU, then I think
> > we're back
> > to the "FlatCCS contains uninitialized garbage" case, where it's
> > unsafe to
> > assume a compressed PAT. And... we don't really know when sharing
> > buffers
> > whether the other side is going to want to do CPU access.
>
> I think the previous discussion has specifically happened in the
> context of
> dma-buf, so not only CPU but other GPUs/accelerators/decoders/devices
> in the
> system are also relevant.
>
> It boils down to the fact that when exporting a dma-buf, one can't
> know it will
> be consumed only by the same GPU (or other device for that matter)
> unless there
> is an explicit negotiation between exporter and importers.
>
> > It would be really nice to assume compression by default, though,
> > which got me
> > thinking: if we mmap a buffer via DRM_XE_GEM_MMAP_OFFSET, could
> > xe.ko disable
> > compression for us? So, resolve any outstanding CCS data, and then
> > switch any
> > PAT entries to uncompressed. Mapping would block until that
> > resolve is done.
> > It could leave compression off forever (once you CPU map a buffer,
> > it's never
> > compressed again). Or it could turn CCS back on when map count
> > reaches 0 (but
> > frankly I'm not sure that's terribly important, and sounds more
> > complex).
>
> This would only really work for a single device but the dma-buf is
> specifically targeting more generic sharing. There's no built-in
> mechanism
> to limit the sharing to subset of devices without explicit
> negotiation
> between the exporter and importers.
>
> So I think the "by default" mode needs to be interoperable, and the
> explicit negotiation can then use less compatible formats given those
> FD
> are never passed to importers that were not part of the negotiation.
>
> > As I understand it, at least on discrete GPUs, the kernel already
> > has to do
> > something similar for eviction, when migrating BOs to system memory
> > (which
> > doesn't support compression). So this would be doing basically the
> > same
> > "resolve and disable CCS" step the kernel can presumably already
> > do, but now
> > on mmap as well.
>
> So far the eviction strategy has been to copy both the backing store
> and
> compression bits in raw form. With Xe2 it would indeed be possible to
> do
> an implicit resolve IFF the buffer has not been shared to someone who
> doesn't
> understand compression and might have left garbage in the CCS bits.
>
> When evicting in raw form, kernel doesn't have to know if the CCS
> bits
> are garbage or not on any given moment.
>
> Regards, Joonas

Just a follow-up comment (TBC): IF we're going for the
everything-is-compressed approach, I think there are some
considerations to be made:

* dma-buf exports to foreign devices need to resolve at map_attachment
  time. Foreign devices are all devices that can't interpret the
  compressed content.
* dma-buf mmaps need to resolve. IIRC this could be implemented in the
  DMA_BUF_IOCTL_SYNC callbacks that wrap CPU access.
* dma-buf imports from foreign devices must never use a compressed PAT
  for writing. Should KMD enforce this? Implicitly? Explicitly? I don't
  see how UMD could know whether the imported dma-buf is from a foreign
  device.
* For mmaps of buffer objects of the local device a resolve is needed.
  Having KMD do that on a page-fault basis is definitely possible, but
  will most likely be terribly inefficient. Better to leave that to UMD?

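To make the four cases above easier to compare, here is a rough sketch
of them as a lookup in plain C. It is only a toy model: every name in
it (access_kind, action, rule_for and the hook strings) is made up for
illustration and does not exist in xe.ko or in the dma-buf core, and
the "hook" column only records where this mail suggests the work could
happen, open questions included.

```c
/* Hypothetical model of the cases listed above; not xe.ko code. */

enum access_kind {
	EXPORT_TO_FOREIGN_DEVICE,   /* importer can't interpret compressed data */
	DMABUF_CPU_MMAP,            /* CPU access through a dma-buf mmap */
	IMPORT_FROM_FOREIGN_DEVICE, /* exporter never maintains CCS state */
	LOCAL_BO_CPU_MMAP,          /* mmap of a buffer object on the local device */
};

enum action {
	ACT_RESOLVE,                /* decompress contents before the access */
	ACT_NO_COMPRESSED_WRITES,   /* never bind a compressed PAT for writing */
};

struct rule {
	enum action action;
	const char *hook;           /* where the mail suggests doing it */
};

static struct rule rule_for(enum access_kind kind)
{
	switch (kind) {
	case EXPORT_TO_FOREIGN_DEVICE:
		return (struct rule){ ACT_RESOLVE, "map_attachment" };
	case DMABUF_CPU_MMAP:
		return (struct rule){ ACT_RESOLVE, "DMA_BUF_IOCTL_SYNC" };
	case IMPORT_FROM_FOREIGN_DEVICE:
		/* Open question in the mail: KMD or UMD, implicit or explicit. */
		return (struct rule){ ACT_NO_COMPRESSED_WRITES, "bind time (open)" };
	case LOCAL_BO_CPU_MMAP:
		/* Probably UMD, since per-page-fault resolves look inefficient. */
		return (struct rule){ ACT_RESOLVE, "before mmap (UMD?)" };
	}
	/* Unknown case: assume a resolve is needed, the conservative choice. */
	return (struct rule){ ACT_RESOLVE, "unknown" };
}
```

Three of the four cases come down to "resolve before someone who
doesn't understand CCS looks at the data"; only the foreign import is
about restricting our own writes.
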
/Thomas
>
> >
> > What do you think? Viable? Crazy? Have I missed something?
> >
> > --Ken