[Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation

Kenneth Graunke kenneth at whitecape.org
Tue Apr 4 22:15:02 UTC 2023


On Monday, April 3, 2023 9:48:40 AM PDT Ville Syrjälä wrote:
> On Mon, Apr 03, 2023 at 09:35:32AM -0700, Matt Roper wrote:
> > On Mon, Apr 03, 2023 at 07:02:08PM +0300, Ville Syrjälä wrote:
> > > On Fri, Mar 31, 2023 at 11:38:30PM -0700, fei.yang at intel.com wrote:
> > > > From: Fei Yang <fei.yang at intel.com>
> > > > 
> > > > To comply with the design that buffer objects shall have immutable
> > > > cache setting through out its life cycle, {set, get}_caching ioctl's
> > > > are no longer supported from MTL onward. With that change caching
> > > > policy can only be set at object creation time. The current code
> > > > applies a default (platform dependent) cache setting for all objects.
> > > > However this is not optimal for performance tuning. The patch extends
> > > > the existing gem_create uAPI to let user set PAT index for the object
> > > > at creation time.
> > > 
> > > This is missing the whole justification for the new uapi.
> > > Why is MOCS not sufficient?
> > 
> > PAT and MOCS are somewhat related, but they're not the same thing.  The
> > general direction of the hardware architecture recently has been to
> > slowly dumb down MOCS and move more of the important memory/cache
> > control over to the PAT instead.  On current platforms there is some
> > overlap (and MOCS has an "ignore PAT" setting that makes the MOCS "win"
> > for the specific fields that both can control), but MOCS doesn't have a
> > way to express things like snoop/coherency mode (on MTL), or class of
> > service (on PVC).  And if you check some of the future platforms, the
> > hardware design starts packing even more stuff into the PAT (not just
> > cache behavior) which will never be handled by MOCS.
> 
> Sigh. So the hardware designers screwed up MOCS yet again and
> instead of getting that fixed we are adding a new uapi to work
> around it?
> 
> The IMO sane approach (which IIRC was the situation for a few
> platform generations at least) is that you just shove the PAT
> index into MOCS (or tell it to go look it up from the PTE).
> Why the heck did they not just stick with that?

There are actually some use cases in newer APIs where MOCS doesn't
work well.  For example, VK_KHR_buffer_device_address in Vulkan 1.2:

https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_KHR_buffer_device_address.html

It essentially adds "pointers to buffer memory in shaders", where apps
can just get a 64-bit pointer, and use it with an address.  On our EUs,
that turns into A64 data port messages which refer directly to memory.
Notably, there's no descriptor (i.e. SURFACE_STATE) where you could
stuff a MOCS value.  So, you get one single MOCS entry for all such
buffers...which is specified in STATE_BASE_ADDRESS.  Hope you wanted
all of them to have the same cache & coherency settings!

With PAT/PTE, we can at least specify settings for each buffer, rather
than one global setting.

Compression has also been moving towards virtual address-based solutions
and handling in the caches and memory controller, rather than in e.g.
the sampler reading SURFACE_STATE.  (It started evolving that way with
Tigerlake, really, but continues.)

> > Also keep in mind that MOCS generally applies at the GPU instruction
> > level; although a lot of instructions have a field to provide a MOCS
> > index, or can use a MOCS already associated with a surface state, there
> > are still some that don't. PAT is the source of memory access
> > characteristics for anything that can't provide a MOCS directly.
> 
> So what are the things that don't have MOCS and where we need
> some custom cache behaviour, and we already know all that at
> buffer creation time?

For Meteorlake...we have MOCS for cache settings.  We only need to use
PAT for coherency settings; I believe we can get away with deciding that
up-front at buffer creation time.  If we were doing full cacheability,
I'd be very nervous about deciding performance tuning at creation time.

--Ken
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20230404/8b5d324d/attachment.sig>


More information about the dri-devel mailing list