[Intel-gfx] [PATCH 7/8] drm/i915: use pat_index instead of cache_level

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Apr 24 09:22:59 UTC 2023


On 23/04/2023 07:52, Yang, Fei wrote:
>> On 20/04/2023 00:00, fei.yang at intel.com wrote:
>>> From: Fei Yang <fei.yang at intel.com>
>>>
>>> Currently the KMD is using enum i915_cache_level to set caching policy for
>>> buffer objects. This is flaky because the PAT index which really controls
>>> the caching behavior in PTE has far more levels than what's defined in the
>>> enum. In addition, the PAT index is platform dependent, having to translate
>>> between i915_cache_level and PAT index is not reliable, and makes the code
>>> more complicated.
>>>
>>> >From UMD's perspective there is also a necessity to set caching policy for
>>> performance fine tuning. It's much easier for the UMD to directly use PAT
>>> index because the behavior of each PAT index is clearly defined in Bspec.
>>> Having the abstracted i915_cache_level sitting in between would only cause
>>> more ambiguity.
>>>
>>> For these reasons this patch replaces i915_cache_level with PAT index. Also
>>> note, the cache_level is not completely removed yet, because the KMD still
>>> has the need of creating buffer objects with simple cache settings such as
>>> cached, uncached, or writethrough. For such simple cases, using cache_level
>>> would help simplify the code.
>>>
>>> Cc: Chris Wilson <chris.p.wilson at linux.intel.com>
>>> Cc: Matt Roper <matthew.d.roper at intel.com>
>>> Signed-off-by: Fei Yang <fei.yang at intel.com>
>>> Reviewed-by: Andi Shyti <andi.shyti at linux.intel.com>
>>
>> I think have some ideas no how to perhaps make this simpler, please bear
>> with me.
>>
>> In my mind get/set caching ioctls need to be failing once explicit pat
>> index has been set by userspace. Or at least not return false information.
> 
> By design we are ending the support for set caching ioctl. The patch is included
> in this series, "drm/i915/mtl: end support for set caching ioctl"
> 
> +       if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> +               return -EOPNOTSUPP;
> +
> 
>> And I don't like i915_gem_object_has_cache_level and
>> i915_gem_get_pat_index as a refactoring step.
>>
>> It also seems that the driver has a need to query the caching mode set
>> regardless of the route (of setting).
> 
> Only for the objects created by the KMD. For UMD created objects with PAT
> index set KMD should never touch the setting.
> 
>> So how about this.
>>
>> Three callers which query the caching mode: use_cpu_reloc, vm_fault_gtt,
>> gpu_write_needs_clflush.
>>
>> We convert them to be like:
>>
>> i915_gem_object_has_caching_mode(obj, PAT_UC / PAT_WT / ...);
> 
> PAT_UC/WT/WB are platform dependent (https://gfxspecs.intel.com/Predator/Home/Index/45101),
> performing this check you would have to do something like,
> 
> if (MTL)
>          ...
> else if (PVC)
>          ...
> else if (GEN12)
>          ...
> else
>          ...

No, it would be doable with a table as I suggested below. And that table 
could be re-used for debugfs pretty-printing simplifying that code too.

>> Then apart from the per platform tables for mapping between cache level
>> to pat index, you add tables which map pat index to caching modes
>> (PAT_UC, etc, naming TBD, just enums or bitmasks also TBD, I haven't
>> looked at the bspec to see how exactly it works).
>>
>> You would use that table in the i915_gem_object_has_caching_mode helper,
>> called from the above three functions instead of obj->cache_level direct
>> comparison.
>>
>> I am assuming at least for instance cache_level != I915_CACHE_NONE would
>> be equivalent to i915_gem_object_has_caching_mode(obj, PAT_UC), etc.
> 
> So far kernel only needs 4 cache levels defined in enum i915_cache_level,
> kernel doesn't need to understand all PAT indices. By desgin if the userspace
> is setting PAT index directly, kernel only needs to pass the setting to PTE.
> 
> For objects created by kernel (including objects created by userspace without
> specifying pat index), there are only 4 options (defined in the cachelevel_to_pat).
> 
> For objects created by userspace with PAT index set (GEM_CREATE + set_pat extension),
> kernel should not touch the setting, just pass it to the PAT index bits in PTE.
> 
> That's why I was only checking cache_level. Handling PAT index is much more
> complicated because of its platform dependent nature and even the number of
> PAT indices varies from platform to platform. Fortunately kernel doesn't need
> to understand that.

Yeah but I think you maybe missed the spirit of my proposal - which is 
to simplify the internal code paths by not having the duality of 
cache_level-vs-pat almost all the way down. But instead cut it at the 
top API level.

You have this:

+	.cachelevel_to_pat = { \
+		[I915_CACHE_NONE]   = 0, \
+		[I915_CACHE_LLC]    = 1, \
+		[I915_CACHE_L3_LLC] = 2, \
+		[I915_CACHE_WT]     = 3, \
+	}

I propose to add something like:

.legacy_platform_pat = { /* pat index to driver logical flags */
     [0] = PAT_UC,
     [1] = PAT_WB | PAT_LLC, /* Just illustrating the principle */
};

i915->platform_pat = &legacy_platform_pat

i915_gem_object_has_caching_mode(obj, PAT_UC)
...
	return i915->platform_pat & PAT_UC == PAT_UC /* give or take */


get/set_caching_ioctl
{
...
	if (obj->has_pat_index) /* set in the extension only */
		return -EINVAL;

debugfs:

i915_show_pat_flags(i915->platform_pat[obj->pat_index]); /* generic! */

etc...

Set obj->pat_index in the extension or set_cache_level _only_. No 
internal code paths then need to use anything but it. No sprinkling of 
conversion helpers needed or dubious has_cache_level query.

Regards,

Tvrtko

>> Same mapping table could also be used in debugfs (i915_cache_level_str)
>> to universally describe any obj->pat_index, with no need to have
>> anything platform dependend there.
>>
>> In set caching set you always set obj->pat_index and so low level code
>> can always just use that.
>>
>> Unless I am missing something (possible) I think like that we end up
>> with no i915_gem_get_pat_index sprinkled around and also no confusing
>> i915_gem_object_has_cache_level.
>>
>> Obj->pat_index would be a single point of truth, while obj->cache_level
>> is just a legacy field for get/set_caching ioctl - not used in the
>> internal driver flows.
>>
>> We would need an additional field for storing the boolean of whether
>> userspace had overriden the PAT.
>>
>> Regards,
>>
>> Tvrtko


More information about the Intel-gfx mailing list