<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof ContentPasted0 ContentPasted1 ContentPasted2 ContentPasted3 ContentPasted4 ContentPasted5">
>> On 31/05/2023 18:10, fei.yang@intel.com wrote:
<div class="ContentPasted0">>>> From: Fei Yang <fei.yang@intel.com></div>
<div class="ContentPasted0">>>></div>
<div class="ContentPasted0">>>> To comply with the design that buffer objects shall have immutable</div>
<div class="ContentPasted0">>>> cache setting through out their life cycle, {set, get}_caching ioctl's</div>
<div class="ContentPasted0">>>> are no longer supported from MTL onward. With that change caching</div>
<div class="ContentPasted0">>>> policy can only be set at object creation time. The current code</div>
<div class="ContentPasted0">>>> applies a default (platform dependent) cache setting for all objects.</div>
<div class="ContentPasted0">>>> However this is not optimal for performance tuning. The patch extends</div>
<div class="ContentPasted0">>>> the existing gem_create uAPI to let user set PAT index for the object</div>
<div class="ContentPasted0">>>> at creation time.</div>
<div class="ContentPasted0">>>> The new extension is platform independent, so UMD's can switch to using</div>
<div class="ContentPasted0">>>> this extension for older platforms as well, while {set, get}_caching are</div>
<div class="ContentPasted0">>>> still supported on these legacy paltforms for compatibility reason.</div>
<div class="ContentPasted0">>>></div>
<div class="ContentPasted0">>>> Note: The detailed description of PAT index is missing in current PRM</div>
<div class="ContentPasted0">>>> even for older hardware and will be added by the next PRM update under</div>
<div class="ContentPasted0">>>> chapter name "Memory Views".</div>
<div class="ContentPasted0">>>></div>
<div class="ContentPasted0">>>> BSpec: 45101</div>
<div class="ContentPasted0">>>></div>
<div class="ContentPasted0">>>> Mesa support has been submitted in this merge request:</div>
<div class="ContentPasted0">>>> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22878</div>
<div class="ContentPasted0">>>></div>
<div class="ContentPasted0">>>> The media driver is supported by the following commits:</div>
<div class="ContentPasted0">>>> https://github.com/intel/media-driver/commit/92c00a857433ebb34ec575e9834f473c6fcb6341</div>
<div class="ContentPasted0">>>> https://github.com/intel/media-driver/commit/fd375cf2c5e1f6bf6b43258ff797b3134aadc9fd</div>
<div class="ContentPasted0">>>> https://github.com/intel/media-driver/commit/08dd244b22484770a33464c2c8ae85430e548000</div>
<div class="ContentPasted0">>></div>
<div class="ContentPasted0">>> On which platforms will media-driver use the uapi? I couldn't easily</div>
<div class="ContentPasted0">>> figure out myself from the links above and also in the master branch I</div>
<div class="ContentPasted0">>> couldn't find the implementation of CachePolicyGetPATIndex.</div>
<div class="ContentPasted0">></div>
<div class="ContentPasted0">> These commits look like platform independent. Carl, could you chime in here?</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Confirmed with Carl and Lihao offline that the media driver is calling set_pat</div>
<div class="ContentPasted0">extension in common code path, so the use of set_pat extension is platform</div>
<div class="ContentPasted0">independent. The only problem right now is that the gmm library is not returning</div>
<div class="ContentPasted0">correct PAT index for all hardware platforms, so on some platforms the call would</div>
<div class="ContentPasted0">be bypassed and fall back to the old way.</div>
<div class="ContentPasted0">I think this is the correct implementation. It should be platform independent as</div>
long as the application knows what PAT index to set. Updating the gmm library to
<div class="ContentPasted1">understand PAT index for each hardware platform is a separate issue.</div>
<div><br class="ContentPasted1">
</div>
<div class="ContentPasted1">>> Now that PRMs for Tigerlake have been published and Meteorlake situation</div>
<div class="ContentPasted1">>> is documented indirectly in Mesa code, my only remaining concern is with</div>
<div class="ContentPasted1">>> the older platforms. So if there is no particular reason to have the</div>
<div class="ContentPasted1">>> extension working on those, I would strongly suggest we disable there.</div>
<div class="ContentPasted1">></div>
<div class="ContentPasted1">> What's the concern? There is no change required for older platforms, existing</div>
<div class="ContentPasted1">> user space code should continue to work. And this extension should be made</div>
<div class="ContentPasted1">> available for any new development because the cache settings for BO's need</div>
<div class="ContentPasted1">> to be immutable. And that is platform independent.</div>
<div class="ContentPasted1">></div>
<div class="ContentPasted1">>> For a precedent see I915_CONTEXT_PARAM_SSEU and how it allows the</div>
<div class="ContentPasted1">>> extension only on Gen11 and only for a very specific usecase (see</div>
<div class="ContentPasted1">>> restrictions in set_sseu() and i915_gem_user_to_context_sseu()).</div>
<div class="ContentPasted1">>></div>
<div class="ContentPasted1">>> Regards,</div>
<div class="ContentPasted1">>></div>
<div class="ContentPasted1">>> Tvrtko</div>
<div class="ContentPasted1">>></div>
<div class="ContentPasted1">>>></div>
<div class="ContentPasted1">>>> The IGT test related to this change is</div>
<div class="ContentPasted1">>>> igt@gem_create@create-ext-set-pat</div>
<div class="ContentPasted1">>>></div>
<div class="ContentPasted1">>>> Signed-off-by: Fei Yang <fei.yang@intel.com></div>
<div class="ContentPasted1">>>> Cc: Chris Wilson <chris.p.wilson@linux.intel.com></div>
<div class="ContentPasted1">>>> Cc: Matt Roper <matthew.d.roper@intel.com></div>
<div class="ContentPasted1">>>> Cc: Andi Shyti <andi.shyti@linux.intel.com></div>
<div class="ContentPasted1">>>> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com></div>
<div class="ContentPasted1">>>> Acked-by: Jordan Justen <jordan.l.justen@intel.com></div>
<div class="ContentPasted1">>>> Tested-by: Jordan Justen <jordan.l.justen@intel.com></div>
<div class="ContentPasted1">>>> Acked-by: Carl Zhang <carl.zhang@intel.com></div>
<div class="ContentPasted1">>>> Tested-by: Lihao Gu <lihao.gu@intel.com></div>
<div class="ContentPasted1">>>> ---</div>
<div class="ContentPasted1">>>> drivers/gpu/drm/i915/gem/i915_gem_create.c | 36 +++++++++++++++++++</div>
<div class="ContentPasted1">>>> drivers/gpu/drm/i915/gem/i915_gem_object.c | 6 ++++</div>
<div class="ContentPasted1">>>> include/uapi/drm/i915_drm.h | 41 ++++++++++++++++++++++</div>
<div class="ContentPasted1">>>> 3 files changed, 83 insertions(+)</div>
<div class="ContentPasted1">>>></div>
<div class="ContentPasted1">>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c</div>
>>> index bfe1dbda4cb7..644a936248ad 100644
<div class="ContentPasted2">>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c</div>
<div class="ContentPasted2">>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c</div>
<div class="ContentPasted2">>>> @@ -245,6 +245,7 @@ struct create_ext {</div>
<div class="ContentPasted2">>>> unsigned int n_placements;</div>
<div class="ContentPasted2">>>> unsigned int placement_mask;</div>
<div class="ContentPasted2">>>> unsigned long flags;</div>
<div class="ContentPasted2">>>> + unsigned int pat_index;</div>
<div class="ContentPasted2">>>> };</div>
<div class="ContentPasted2">>>></div>
<div class="ContentPasted2">>>> static void repr_placements(char *buf, size_t size,</div>
<div class="ContentPasted2">>>> @@ -394,11 +395,39 @@ static int ext_set_protected(struct i915_user_extension __user *base, void *data</div>
<div class="ContentPasted2">>>> return 0;</div>
<div class="ContentPasted2">>>> }</div>
<div class="ContentPasted2">>>></div>
<div class="ContentPasted2">>>> +static int ext_set_pat(struct i915_user_extension __user *base, void *data)</div>
<div class="ContentPasted2">>>> +{</div>
<div class="ContentPasted2">>>> + struct create_ext *ext_data = data;</div>
<div class="ContentPasted2">>>> + struct drm_i915_private *i915 = ext_data->i915;</div>
<div class="ContentPasted2">>>> + struct drm_i915_gem_create_ext_set_pat ext;</div>
<div class="ContentPasted2">>>> + unsigned int max_pat_index;</div>
<div class="ContentPasted2">>>> +</div>
<div class="ContentPasted2">>>> + BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=</div>
<div class="ContentPasted2">>>> + offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));</div>
<div class="ContentPasted2">>>> +</div>
<div class="ContentPasted2">>>> + if (copy_from_user(&ext, base, sizeof(ext)))</div>
<div class="ContentPasted2">>>> + return -EFAULT;</div>
<div class="ContentPasted2">>>> +</div>
<div class="ContentPasted2">>>> + max_pat_index = INTEL_INFO(i915)->max_pat_index;</div>
<div class="ContentPasted2">>>> +</div>
<div class="ContentPasted2">>>> + if (ext.pat_index > max_pat_index) {</div>
<div class="ContentPasted2">>>> + drm_dbg(&i915->drm, "PAT index is invalid: %u\n",</div>
<div class="ContentPasted2">>>> + ext.pat_index);</div>
<div class="ContentPasted2">>>> + return -EINVAL;</div>
<div class="ContentPasted2">>>> + }</div>
<div class="ContentPasted2">>>> +</div>
<div class="ContentPasted2">>>> + ext_data->pat_index = ext.pat_index;</div>
<div class="ContentPasted2">>>> +</div>
<div class="ContentPasted2">>>> + return 0;</div>
<div class="ContentPasted2">>>> +}</div>
<div class="ContentPasted2">>>> +</div>
>>> static const i915_user_extension_fn create_extensions[] = {
<div class="ContentPasted3">>>> [I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,</div>
<div class="ContentPasted3">>>> [I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,</div>
<div class="ContentPasted3">>>> + [I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,</div>
<div class="ContentPasted3">>>> };</div>
<div class="ContentPasted3">>>></div>
<div class="ContentPasted3">>>> +#define PAT_INDEX_NOT_SET 0xffff</div>
<div class="ContentPasted3">>>> /**</div>
<div class="ContentPasted3">>>> * i915_gem_create_ext_ioctl - Creates a new mm object and returns a handle to it.</div>
<div class="ContentPasted3">>>> * @dev: drm device pointer</div>
<div class="ContentPasted3">>>> @@ -418,6 +447,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,</div>
<div class="ContentPasted3">>>> if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)</div>
<div class="ContentPasted3">>>> return -EINVAL;</div>
<div class="ContentPasted3">>>></div>
<div class="ContentPasted3">>>> + ext_data.pat_index = PAT_INDEX_NOT_SET;</div>
<div class="ContentPasted3">>>> ret = i915_user_extensions(u64_to_user_ptr(args->extensions),</div>
<div class="ContentPasted3">>>> create_extensions,</div>
<div class="ContentPasted3">>>> ARRAY_SIZE(create_extensions),</div>
<div class="ContentPasted3">>>> @@ -454,5 +484,11 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,</div>
<div class="ContentPasted3">>>> if (IS_ERR(obj))</div>
<div class="ContentPasted3">>>> return PTR_ERR(obj);</div>
<div class="ContentPasted3">>>></div>
<div class="ContentPasted3">>>> + if (ext_data.pat_index != PAT_INDEX_NOT_SET) {</div>
<div class="ContentPasted3">>>> + i915_gem_object_set_pat_index(obj, ext_data.pat_index);</div>
<div class="ContentPasted3">>>> + /* Mark pat_index is set by UMD */</div>
<div class="ContentPasted3">>>> + obj->pat_set_by_user = true;</div>
<div class="ContentPasted3">>>> + }</div>
<div class="ContentPasted3">>>> +</div>
<div class="ContentPasted3">>>> return i915_gem_publish(obj, file, &args->size, &args->handle);</div>
<div class="ContentPasted3">>>> }</div>
<div class="ContentPasted3">>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c</div>
<div class="ContentPasted3">>>> index 46a19b099ec8..97ac6fb37958 100644</div>
<div class="ContentPasted3">>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c</div>
<div class="ContentPasted3">>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c</div>
<div class="ContentPasted3">>>> @@ -208,6 +208,12 @@ bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted3">>>> if (!(obj->flags & I915_BO_ALLOC_USER))</div>
<div class="ContentPasted3">>>> return false;</div>
<div class="ContentPasted3">>>></div>
<div class="ContentPasted3">>>> + /*</div>
<div class="ContentPasted3">>>> + * Always flush cache for UMD objects at creation time.</div>
<div class="ContentPasted3">>>> + */</div>
>>> + if (obj->pat_set_by_user)
<div class="ContentPasted4">>>> + return true;</div>
<div class="ContentPasted4">>>> +</div>
<div class="ContentPasted4">>>> /*</div>
<div class="ContentPasted4">>>> * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it</div>
<div class="ContentPasted4">>>> * possible for userspace to bypass the GTT caching bits set by the</div>
<div class="ContentPasted4">>>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h</div>
<div class="ContentPasted4">>>> index f31dfacde601..4083a23e0614 100644</div>
<div class="ContentPasted4">>>> --- a/include/uapi/drm/i915_drm.h</div>
<div class="ContentPasted4">>>> +++ b/include/uapi/drm/i915_drm.h</div>
<div class="ContentPasted4">>>> @@ -3679,9 +3679,13 @@ struct drm_i915_gem_create_ext {</div>
<div class="ContentPasted4">>>> *</div>
<div class="ContentPasted4">>>> * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see</div>
<div class="ContentPasted4">>>> * struct drm_i915_gem_create_ext_protected_content.</div>
<div class="ContentPasted4">>>> + *</div>
<div class="ContentPasted4">>>> + * For I915_GEM_CREATE_EXT_SET_PAT usage see</div>
<div class="ContentPasted4">>>> + * struct drm_i915_gem_create_ext_set_pat.</div>
<div class="ContentPasted4">>>> */</div>
<div class="ContentPasted4">>>> #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0</div>
<div class="ContentPasted4">>>> #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1</div>
<div class="ContentPasted4">>>> +#define I915_GEM_CREATE_EXT_SET_PAT 2</div>
<div class="ContentPasted4">>>> __u64 extensions;</div>
<div class="ContentPasted4">>>> };</div>
<div class="ContentPasted4">>>></div>
<div class="ContentPasted4">>>> @@ -3796,6 +3800,43 @@ struct drm_i915_gem_create_ext_protected_content {</div>
<div class="ContentPasted4">>>> __u32 flags;</div>
<div class="ContentPasted4">>>> };</div>
<div class="ContentPasted4">>>></div>
<div class="ContentPasted4">>>> +/**</div>
<div class="ContentPasted4">>>> + * struct drm_i915_gem_create_ext_set_pat - The</div>
<div class="ContentPasted4">>>> + * I915_GEM_CREATE_EXT_SET_PAT extension.</div>
<div class="ContentPasted4">>>> + *</div>
<div class="ContentPasted4">>>> + * If this extension is provided, the specified caching policy (PAT index) is</div>
<div class="ContentPasted4">>>> + * applied to the buffer object.</div>
<div class="ContentPasted4">>>> + *</div>
<div class="ContentPasted4">>>> + * Below is an example on how to create an object with specific caching policy:</div>
<div class="ContentPasted4">>>> + *</div>
<div class="ContentPasted4">>>> + * .. code-block:: C</div>
<div class="ContentPasted4">>>> + *</div>
<div class="ContentPasted4">>>> + * struct drm_i915_gem_create_ext_set_pat set_pat_ext = {</div>
<div class="ContentPasted4">>>> + * .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },</div>
>>> + * .pat_index = 0,
<div class="ContentPasted5">>>> + * };</div>
<div class="ContentPasted5">>>> + * struct drm_i915_gem_create_ext create_ext = {</div>
<div class="ContentPasted5">>>> + * .size = PAGE_SIZE,</div>
<div class="ContentPasted5">>>> + * .extensions = (uintptr_t)&set_pat_ext,</div>
<div class="ContentPasted5">>>> + * };</div>
<div class="ContentPasted5">>>> + *</div>
<div class="ContentPasted5">>>> + * int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);</div>
<div class="ContentPasted5">>>> + * if (err) ...</div>
<div class="ContentPasted5">>>> + */</div>
<div class="ContentPasted5">>>> +struct drm_i915_gem_create_ext_set_pat {</div>
<div class="ContentPasted5">>>> + /** @base: Extension link. See struct i915_user_extension. */</div>
<div class="ContentPasted5">>>> + struct i915_user_extension base;</div>
<div class="ContentPasted5">>>> + /**</div>
<div class="ContentPasted5">>>> + * @pat_index: PAT index to be set</div>
<div class="ContentPasted5">>>> + * PAT index is a bit field in Page Table Entry to control caching</div>
<div class="ContentPasted5">>>> + * behaviors for GPU accesses. The definition of PAT index is</div>
<div class="ContentPasted5">>>> + * platform dependent and can be found in hardware specifications,</div>
<div class="ContentPasted5">>>> + */</div>
<div class="ContentPasted5">>>> + __u32 pat_index;</div>
<div class="ContentPasted5">>>> + /** @rsvd: reserved for future use */</div>
<div class="ContentPasted5">>>> + __u32 rsvd;</div>
<div class="ContentPasted5">>>> +};</div>
<div class="ContentPasted5">>>> +</div>
<div class="ContentPasted5">>>> /* ID of the protected content session managed by i915 when PXP is active */</div>
<div class="ContentPasted5">>>> #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf</div>
<div class="ContentPasted5">>>></div>
<br>
</div>
</body>
</html>