<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof ContentPasted0">
> From: Tvrtko Ursulin &lt;tvrtko.ursulin@intel.com&gt;
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> Informal commit message for now.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> I got a bit impatient and curious to see if the idea we discussed would</div>
<div class="ContentPasted0">> work so sketched something out. I think it is what I was describing back</div>
<div class="ContentPasted0">> then..</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Oops, you beat me to this, shame on me.</div>
<div class="ContentPasted0"> </div>
<div class="ContentPasted0">> So high level idea is to teach the driver what caching modes are hidden</div>
<div class="ContentPasted0">> behind PAT indices. Given you already had that in static tables, if we</div>
<div class="ContentPasted0">> just turn the tables a bit around and add a driver abstraction of caching</div>
<div class="ContentPasted0">> modes this is what happens:</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> * We can lose the ugly runtime i915_gem_get_pat_index.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> * We can have a smarter i915_gem_object_has_cache_level, which now can</div>
<div class="ContentPasted0">> use the above mentioned table to understand the caching modes and so</div>
<div class="ContentPasted0">> does not have to pessimistically return true for _any_ input when user</div>
<div class="ContentPasted0">> has set the PAT index. This may improve things even for MTL.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> * We can simplify the debugfs printout to be platform agnostic.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> * We are perhaps opening the door to un-regress the dodgy addition</div>
<div class="ContentPasted0">> made to i915_gem_object_can_bypass_llc? See QQQ/FIXME in the patch.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> I hope I did not forget anything, but anyway, please have a read and see</div>
<div class="ContentPasted0">> what you think. I think it has potential.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> Proper commit message can come later.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@intel.com&gt;</div>
<div class="ContentPasted0">> Cc: Fei Yang &lt;fei.yang@intel.com&gt;</div>
<div class="ContentPasted0">> ---</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/Makefile | 1 +</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gem/i915_gem_domain.c | 34 ++---</div>
<div class="ContentPasted0">> .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 13 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gem/i915_gem_mman.c | 10 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gem/i915_gem_object.c | 78 ++++-------</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gem/i915_gem_object.h | 18 ++-</div>
<div class="ContentPasted0">> .../gpu/drm/i915/gem/i915_gem_object_types.h | 99 +-------------</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 7 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c | 26 ++--</div>
<div class="ContentPasted0">> .../gpu/drm/i915/gem/selftests/huge_pages.c | 2 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 4 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 13 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gt/intel_ggtt.c | 9 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gt/intel_migrate.c | 11 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gt/selftest_migrate.c | 9 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gt/selftest_reset.c | 14 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gt/selftest_timeline.c | 5 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gt/selftest_tlb.c | 5 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 8 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/i915_cache.c | 59 ++++++++</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/i915_cache.h | 129 ++++++++++++++++++</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/i915_debugfs.c | 83 ++++++-----</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/i915_driver.c | 3 +</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/i915_drv.h | 3 +</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/i915_gem.c | 21 +--</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/i915_gpu_error.c | 7 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/i915_pci.c | 83 +++++------</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/intel_device_info.h | 6 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/selftests/i915_gem.c | 5 +-</div>
<div class="ContentPasted0">> .../gpu/drm/i915/selftests/i915_gem_evict.c | 4 +-</div>
<div class="ContentPasted0">> drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 13 +-</div>
<div class="ContentPasted0">> .../drm/i915/selftests/intel_memory_region.c | 4 +-</div>
<div class="ContentPasted0">> .../gpu/drm/i915/selftests/mock_gem_device.c | 10 +-</div>
<div class="ContentPasted0">> 33 files changed, 415 insertions(+), 381 deletions(-)</div>
<div class="ContentPasted0">> create mode 100644 drivers/gpu/drm/i915/i915_cache.c</div>
<div class="ContentPasted0">> create mode 100644 drivers/gpu/drm/i915/i915_cache.h</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile</div>
<div class="ContentPasted0">> index 2be9dd960540..2c3da8f0c78e 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/Makefile</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/Makefile</div>
<div class="ContentPasted0">> @@ -30,6 +30,7 @@ subdir-ccflags-y += -I$(srctree)/$(src)</div>
<div class="ContentPasted0">> # core driver code</div>
<div class="ContentPasted0">> i915-y += i915_driver.o \</div>
<div class="ContentPasted0">> i915_drm_client.o \</div>
<div class="ContentPasted0">> + i915_cache.o \</div>
<div class="ContentPasted0">> i915_config.o \</div>
<div class="ContentPasted0">> i915_getparam.o \</div>
<div class="ContentPasted0">> i915_ioctl.o \</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c</div>
<div class="ContentPasted0">> index dfaaa8b66ac3..49bfae45390f 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c</div>
<div class="ContentPasted0">> @@ -8,6 +8,7 @@</div>
<div class="ContentPasted0">> #include "display/intel_frontbuffer.h"</div>
<div class="ContentPasted0">> #include "gt/intel_gt.h"</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">> #include "i915_drv.h"</div>
<div class="ContentPasted0">> #include "i915_gem_clflush.h"</div>
<div class="ContentPasted0">> #include "i915_gem_domain.h"</div>
<div class="ContentPasted0">> @@ -27,15 +28,8 @@ static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> if (IS_DGFX(i915))</div>
<div class="ContentPasted0">> return false;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - /*</div>
<div class="ContentPasted0">> - * For objects created by userspace through GEM_CREATE with pat_index</div>
<div class="ContentPasted0">> - * set by set_pat extension, i915_gem_object_has_cache_level() will</div>
<div class="ContentPasted0">> - * always return true, because the coherency of such object is managed</div>
<div class="ContentPasted0">> - * by userspace. Othereise the call here would fall back to checking</div>
<div class="ContentPasted0">> - * whether the object is un-cached or write-through.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - return !(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||</div>
<div class="ContentPasted0">> - i915_gem_object_has_cache_level(obj, I915_CACHE_WT));</div>
<div class="ContentPasted0">> + return i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC) != 1 &&</div>
<div class="ContentPasted0">> + i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WT) != 1;</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Why is it necessary to define the I915_CACHE_MODE_* values when enum i915_cache_level</div>
<div class="ContentPasted0">already exists? I thought we wanted to get rid of such abstractions rather than add more.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">This patch also introduces INTEL_INFO(i915)->cache_modes. Why don't we add the</div>
<div class="ContentPasted0">platform-specific PAT table there directly? For example, for MTL:</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">INTEL_INFO(i915)->pat[] = {</div>
<div class="ContentPasted0"> [0] = MTL_PPAT_L4_0_WB,</div>
<div class="ContentPasted0"> [1] = MTL_PPAT_L4_1_WT,</div>
<div class="ContentPasted0"> [2] = MTL_PPAT_L4_3_UC,</div>
<div class="ContentPasted0"> [3] = MTL_PPAT_L4_0_WB | MTL_2_COH_1W,</div>
<div class="ContentPasted0"> [4] = MTL_PPAT_L4_0_WB | MTL_3_COH_2W,</div>
<div class="ContentPasted0">};</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">All of these macros are already defined, so there is no need to introduce new ones.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">The same table could also be used to program the PAT index registers, as is done in</div>
<div class="ContentPasted0">xelpg_setup_private_ppat() and xelpmp_setup_private_ppat().</div>
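<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">To make the idea concrete, here is a rough standalone sketch, not driver code. The macro values are stand-ins assumed from the macro names and from the cache policy living in bits [3:2]; putting the coherency mode in bits [1:0] is also an assumption. mtl_pat[], MTL_NUM_PAT and sketch_setup_private_ppat() are hypothetical names, and the regs array only stands in for the real PAT index registers.</div>
<pre class="ContentPasted0">
```c
/* Stand-in values (assumed); the real macros live in the i915 headers. */
#define MTL_PPAT_L4_0_WB	(0u << 2)	/* L4 policy: write-back */
#define MTL_PPAT_L4_1_WT	(1u << 2)	/* L4 policy: write-through */
#define MTL_PPAT_L4_3_UC	(3u << 2)	/* L4 policy: uncached */
#define MTL_2_COH_1W		(2u << 0)	/* coherency mode (assumed bits [1:0]) */
#define MTL_3_COH_2W		(3u << 0)

#define MTL_NUM_PAT		5		/* hypothetical table size */

/* Hypothetical per-platform table, indexed by PAT index. */
static const unsigned int mtl_pat[MTL_NUM_PAT] = {
	[0] = MTL_PPAT_L4_0_WB,
	[1] = MTL_PPAT_L4_1_WT,
	[2] = MTL_PPAT_L4_3_UC,
	[3] = MTL_PPAT_L4_0_WB | MTL_2_COH_1W,
	[4] = MTL_PPAT_L4_0_WB | MTL_3_COH_2W,
};

/*
 * Hypothetical helper: walk the table and write each entry to the slot
 * standing in for the matching PAT index register.
 */
static void sketch_setup_private_ppat(unsigned int *regs,
				      const unsigned int *pat,
				      unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count; i++)
		regs[i] = pat[i];
}
```
</pre>
<div class="ContentPasted0">The point is only that one table could serve as both the lookup data and the register initialisation data.</div>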
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> @@ -272,15 +266,18 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)</div>
<div class="ContentPasted0">> int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> enum i915_cache_level cache_level)</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">s/enum i915_cache_level cache_level/unsigned int pat_index/</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">This function is for KMD objects only, so I don't think we even need to keep</div>
<div class="ContentPasted0">i915_cache_level. Simply passing in INTEL_INFO(i915)->pat_uc/wb/wt is</div>
<div class="ContentPasted0">good enough.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> {</div>
<div class="ContentPasted0">> + struct drm_i915_private *i915 = to_i915(obj->base.dev);</div>
<div class="ContentPasted0">> + i915_cache_t mode;</div>
<div class="ContentPasted0">> int ret;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - /*</div>
<div class="ContentPasted0">> - * For objects created by userspace through GEM_CREATE with pat_index</div>
<div class="ContentPasted0">> - * set by set_pat extension, simply return 0 here without touching</div>
<div class="ContentPasted0">> - * the cache setting, because such objects should have an immutable</div>
<div class="ContentPasted0">> - * cache setting by desgin and always managed by userspace.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - if (i915_gem_object_has_cache_level(obj, cache_level))</div>
<div class="ContentPasted0">> + if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">> + return -EINVAL;</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">I don't think this condition can ever be true, but it is okay to keep it.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> + ret = i915_cache_level_to_pat_and_mode(i915, cache_level, &mode);</div>
<div class="ContentPasted0">> + if (ret < 0)</div>
<div class="ContentPasted0">> + return -EINVAL;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + if (mode == obj->cache_mode)</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">The lines above can be reduced to a single check:</div>
<div class="ContentPasted0"> if (pat_index == obj->pat_index)</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> return 0;</div>
<div class="ContentPasted0">></div>
<div class="ContentPasted0">> ret = i915_gem_object_wait(obj,</div>
<div class="ContentPasted0">> @@ -326,10 +323,9 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,</div>
<div class="ContentPasted0">> goto out;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - if (i915_gem_object_has_cache_level(obj, I915_CACHE_LLC) ||</div>
<div class="ContentPasted0">> - i915_gem_object_has_cache_level(obj, I915_CACHE_L3_LLC))</div>
<div class="ContentPasted0">> + if (i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WB))</div>
<div class="ContentPasted0">> args->caching = I915_CACHING_CACHED;</div>
<div class="ContentPasted0">> - else if (i915_gem_object_has_cache_level(obj, I915_CACHE_WT))</div>
<div class="ContentPasted0">> + else if (i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WT))</div>
<div class="ContentPasted0">> args->caching = I915_CACHING_DISPLAY;</div>
<div class="ContentPasted0">> else</div>
<div class="ContentPasted0">> args->caching = I915_CACHING_NONE;</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c</div>
<div class="ContentPasted0">> index d3208a325614..ee85221fa6eb 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c</div>
<div class="ContentPasted0">> @@ -640,15 +640,9 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,</div>
<div class="ContentPasted0">> if (DBG_FORCE_RELOC == FORCE_GTT_RELOC)</div>
<div class="ContentPasted0">> return false;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - /*</div>
<div class="ContentPasted0">> - * For objects created by userspace through GEM_CREATE with pat_index</div>
<div class="ContentPasted0">> - * set by set_pat extension, i915_gem_object_has_cache_level() always</div>
<div class="ContentPasted0">> - * return true, otherwise the call would fall back to checking whether</div>
<div class="ContentPasted0">> - * the object is un-cached.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> return (cache->has_llc ||</div>
<div class="ContentPasted0">> obj->cache_dirty ||</div>
<div class="ContentPasted0">> - !i915_gem_object_has_cache_level(obj, I915_CACHE_NONE));</div>
<div class="ContentPasted0">> + i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC) != 1);</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> static int eb_reserve_vma(struct i915_execbuffer *eb,</div>
<div class="ContentPasted0">> @@ -1329,10 +1323,7 @@ static void *reloc_iomap(struct i915_vma *batch,</div>
<div class="ContentPasted0">> if (drm_mm_node_allocated(&cache->node)) {</div>
<div class="ContentPasted0">> ggtt->vm.insert_page(&ggtt->vm,</div>
<div class="ContentPasted0">> i915_gem_object_get_dma_address(obj, page),</div>
<div class="ContentPasted0">> - offset,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(ggtt->vm.i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - 0);</div>
<div class="ContentPasted0">> + offset, eb->i915->pat_uc, 0);</div>
<div class="ContentPasted0">> } else {</div>
<div class="ContentPasted0">> offset += page << PAGE_SHIFT;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c</div>
<div class="ContentPasted0">> index aa4d842d4c5a..5e21aedb02d2 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c</div>
<div class="ContentPasted0">> @@ -386,13 +386,11 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)</div>
<div class="ContentPasted0">> /*</div>
<div class="ContentPasted0">> * For objects created by userspace through GEM_CREATE with pat_index</div>
<div class="ContentPasted0">> * set by set_pat extension, coherency is managed by userspace, make</div>
<div class="ContentPasted0">> - * sure we don't fail handling the vm fault by calling</div>
<div class="ContentPasted0">> - * i915_gem_object_has_cache_level() which always return true for such</div>
<div class="ContentPasted0">> - * objects. Otherwise this helper function would fall back to checking</div>
<div class="ContentPasted0">> - * whether the object is un-cached.</div>
<div class="ContentPasted0">> + * sure we don't fail handling the vm fault by making sure that we</div>
<div class="ContentPasted0">> + * know the object is uncached or that we have LLC.</div>
<div class="ContentPasted0">> */</div>
<div class="ContentPasted0">> - if (!(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||</div>
<div class="ContentPasted0">> - HAS_LLC(i915))) {</div>
<div class="ContentPasted0">> + if (i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC) != 1 &&</div>
<div class="ContentPasted0">> + !HAS_LLC(i915)) {</div>
<div class="ContentPasted0">> ret = -EFAULT;</div>
<div class="ContentPasted0">> goto err_unpin;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c</div>
<div class="ContentPasted0">> index 0004d5fa7cc2..52c6c5f09bdd 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c</div>
<div class="ContentPasted0">> @@ -45,33 +45,6 @@ static struct kmem_cache *slab_objects;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> static const struct drm_gem_object_funcs i915_gem_object_funcs;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> -unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,</div>
<div class="ContentPasted0">> - enum i915_cache_level level)</div>
<div class="ContentPasted0">> -{</div>
<div class="ContentPasted0">> - if (drm_WARN_ON(&i915->drm, level >= I915_MAX_CACHE_LEVEL))</div>
<div class="ContentPasted0">> - return 0;</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> - return INTEL_INFO(i915)->cachelevel_to_pat[level];</div>
<div class="ContentPasted0">> -}</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Yes, this can be removed. INTEL_INFO(i915)->pat_uc/wb/wt should be sufficient.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> -bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> - enum i915_cache_level lvl)</div>
<div class="ContentPasted0">> -{</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">If we had INTEL_INFO(i915)->pat[] set up, it would be easier to just keep this</div>
<div class="ContentPasted0">function, because we could simply check the cache policy bit field in</div>
<div class="ContentPasted0">INTEL_INFO(i915)->pat[obj->pat_index] to see whether it is cached, uncached,</div>
<div class="ContentPasted0">or write-through.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">See Bspec https://gfxspecs.intel.com/Predator/Home/Index/44235</div>
<div class="ContentPasted0">For MTL, check bits [3:2]; for other gen12 platforms, check bits [1:0].</div>
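<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Roughly what I have in mind for the MTL case, as a standalone sketch rather than driver code. The field encoding (0 = WB, 1 = WT, 3 = UC) is assumed from the MTL_PPAT_L4_* macro names; the enum, the mask/shift names and both helpers are hypothetical. Other gen12 platforms would need the same kind of decode on bits [1:0] with their own encoding, which is not shown here.</div>
<pre class="ContentPasted0">
```c
/* Hypothetical decoded cache policy. */
enum sketch_cache_policy {
	SKETCH_POLICY_WB,
	SKETCH_POLICY_WT,
	SKETCH_POLICY_UC,
	SKETCH_POLICY_UNKNOWN,
};

/* MTL keeps the L4 cache policy in bits [3:2] of the PAT entry. */
#define SKETCH_MTL_L4_POLICY_SHIFT	2
#define SKETCH_MTL_L4_POLICY_MASK	(3u << SKETCH_MTL_L4_POLICY_SHIFT)

/* Decode the policy field of one PAT entry (encoding assumed, see above). */
static enum sketch_cache_policy sketch_mtl_pat_to_policy(unsigned int entry)
{
	switch ((entry & SKETCH_MTL_L4_POLICY_MASK) >> SKETCH_MTL_L4_POLICY_SHIFT) {
	case 0:
		return SKETCH_POLICY_WB;
	case 1:
		return SKETCH_POLICY_WT;
	case 3:
		return SKETCH_POLICY_UC;
	default:
		return SKETCH_POLICY_UNKNOWN;
	}
}

/*
 * Hypothetical stand-in for i915_gem_object_has_cache_level(): look up
 * the object's PAT index in the platform table and compare policies.
 * Coherency bits in the entry are ignored by the mask.
 */
static int sketch_has_cache_policy(const unsigned int *pat,
				   unsigned int pat_index,
				   enum sketch_cache_policy want)
{
	return sketch_mtl_pat_to_policy(pat[pat_index]) == want;
}
```
</pre>
<div class="ContentPasted0">Because the lookup works on any valid PAT index, it would not matter whether the index was chosen by the kernel or by userspace.</div>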
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> - /*</div>
<div class="ContentPasted0">> - * In case the pat_index is set by user space, this kernel mode</div>
<div class="ContentPasted0">> - * driver should leave the coherency to be managed by user space,</div>
<div class="ContentPasted0">> - * simply return true here.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">> - return true;</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> - /*</div>
<div class="ContentPasted0">> - * Otherwise the pat_index should have been converted from cache_level</div>
<div class="ContentPasted0">> - * so that the following comparison is valid.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - return obj->pat_index == i915_gem_get_pat_index(obj_to_i915(obj), lvl);</div>
<div class="ContentPasted0">> -}</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> struct drm_i915_gem_object *i915_gem_object_alloc(void)</div>
<div class="ContentPasted0">> {</div>
<div class="ContentPasted0">> struct drm_i915_gem_object *obj;</div>
<div class="ContentPasted0">> @@ -144,6 +117,24 @@ void __i915_gem_object_fini(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> dma_resv_fini(&obj->base._resv);</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> +void __i915_gem_object_update_coherency(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> + struct drm_i915_private *i915 = to_i915(obj->base.dev);</div>
<div class="ContentPasted0">> + const unsigned int mode = I915_CACHE_MODE(obj->cache_mode);</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">obj->cache_mode seems to be redundant if we have INTEL_INFO(i915)->pat[] and</div>
<div class="ContentPasted0">obj->pat_index.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + if (!(mode == I915_CACHE_MODE_UNKNOWN || mode == I915_CACHE_MODE_UC))</div>
<div class="ContentPasted0">> + obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |</div>
<div class="ContentPasted0">> + I915_BO_CACHE_COHERENT_FOR_WRITE);</div>
<div class="ContentPasted0">> + else if (mode != I915_CACHE_MODE_UNKNOWN && HAS_LLC(i915))</div>
<div class="ContentPasted0">> + obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;</div>
<div class="ContentPasted0">> + else</div>
<div class="ContentPasted0">> + obj->cache_coherent = 0;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + obj->cache_dirty =</div>
<div class="ContentPasted0">> + !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&</div>
<div class="ContentPasted0">> + !IS_DGFX(i915);</div>
<div class="ContentPasted0">> +}</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> /**</div>
<div class="ContentPasted0">> * i915_gem_object_set_cache_coherency - Mark up the object's coherency levels</div>
<div class="ContentPasted0">> * for a given cache_level</div>
<div class="ContentPasted0">> @@ -154,20 +145,15 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> unsigned int cache_level)</div>
<div class="ContentPasted0">> {</div>
<div class="ContentPasted0">> struct drm_i915_private *i915 = to_i915(obj->base.dev);</div>
<div class="ContentPasted0">> + i915_cache_t mode;</div>
<div class="ContentPasted0">> + int found;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - obj->pat_index = i915_gem_get_pat_index(i915, cache_level);</div>
<div class="ContentPasted0">> + found = i915_cache_level_to_pat_and_mode(i915, cache_level, &mode);</div>
<div class="ContentPasted0">> + GEM_WARN_ON(found < 0);</div>
<div class="ContentPasted0">> + obj->pat_index = found;</div>
<div class="ContentPasted0">> + obj->cache_mode = mode;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - if (cache_level != I915_CACHE_NONE)</div>
<div class="ContentPasted0">> - obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |</div>
<div class="ContentPasted0">> - I915_BO_CACHE_COHERENT_FOR_WRITE);</div>
<div class="ContentPasted0">> - else if (HAS_LLC(i915))</div>
<div class="ContentPasted0">> - obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;</div>
<div class="ContentPasted0">> - else</div>
<div class="ContentPasted0">> - obj->cache_coherent = 0;</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> - obj->cache_dirty =</div>
<div class="ContentPasted0">> - !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&</div>
<div class="ContentPasted0">> - !IS_DGFX(i915);</div>
<div class="ContentPasted0">> + __i915_gem_object_update_coherency(obj);</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> /**</div>
<div class="ContentPasted0">> @@ -187,18 +173,9 @@ void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> return;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> obj->pat_index = pat_index;</div>
<div class="ContentPasted0">> + obj->cache_mode = INTEL_INFO(i915)->cache_modes[pat_index];</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - if (pat_index != i915_gem_get_pat_index(i915, I915_CACHE_NONE))</div>
<div class="ContentPasted0">> - obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |</div>
<div class="ContentPasted0">> - I915_BO_CACHE_COHERENT_FOR_WRITE);</div>
<div class="ContentPasted0">> - else if (HAS_LLC(i915))</div>
<div class="ContentPasted0">> - obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;</div>
<div class="ContentPasted0">> - else</div>
<div class="ContentPasted0">> - obj->cache_coherent = 0;</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> - obj->cache_dirty =</div>
<div class="ContentPasted0">> - !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&</div>
<div class="ContentPasted0">> - !IS_DGFX(i915);</div>
<div class="ContentPasted0">> + __i915_gem_object_update_coherency(obj);</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> @@ -215,6 +192,7 @@ bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> /*</div>
<div class="ContentPasted0">> * Always flush cache for UMD objects at creation time.</div>
<div class="ContentPasted0">> */</div>
<div class="ContentPasted0">> + /* QQQ/FIXME why? avoidable performance penalty? */</div>
<div class="ContentPasted0">> if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">> return true;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h</div>
<div class="ContentPasted0">> index 884a17275b3a..f84f41e9f81f 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h</div>
<div class="ContentPasted0">> @@ -13,6 +13,7 @@</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #include "display/intel_frontbuffer.h"</div>
<div class="ContentPasted0">> #include "intel_memory_region.h"</div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">> #include "i915_gem_object_types.h"</div>
<div class="ContentPasted0">> #include "i915_gem_gtt.h"</div>
<div class="ContentPasted0">> #include "i915_gem_ww.h"</div>
<div class="ContentPasted0">> @@ -32,10 +33,18 @@ static inline bool i915_gem_object_size_2big(u64 size)</div>
<div class="ContentPasted0">> return false;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> -unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,</div>
<div class="ContentPasted0">> - enum i915_cache_level level);</div>
<div class="ContentPasted0">> -bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> - enum i915_cache_level lvl);</div>
<div class="ContentPasted0">> +static inline int</div>
<div class="ContentPasted0">> +i915_gem_object_has_cache_mode(const struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> + unsigned int mode)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> + if (I915_CACHE_MODE(obj->cache_mode) == mode)</div>
<div class="ContentPasted0">> + return 1;</div>
<div class="ContentPasted0">> + else if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">> + return -1; /* Unknown, callers should assume no. */</div>
<div class="ContentPasted0">> + else</div>
<div class="ContentPasted0">> + return 0;</div>
<div class="ContentPasted0">> +}</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> void i915_gem_init__objects(struct drm_i915_private *i915);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> void i915_objects_module_exit(void);</div>
<div class="ContentPasted0">> @@ -764,6 +773,7 @@ int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> bool intr);</div>
<div class="ContentPasted0">> bool i915_gem_object_has_unknown_state(struct drm_i915_gem_object *obj);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> +void __i915_gem_object_update_coherency(struct drm_i915_gem_object *obj);</div>
<div class="ContentPasted0">> void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> unsigned int cache_level);</div>
<div class="ContentPasted0">> void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h</div>
<div class="ContentPasted0">> index 8de2b91b3edf..1f9fa28d07df 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h</div>
<div class="ContentPasted0">> @@ -14,6 +14,7 @@</div>
<div class="ContentPasted0">> #include &lt;uapi/drm/i915_drm.h&gt;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #include "i915_active.h"</div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">> #include "i915_selftest.h"</div>
<div class="ContentPasted0">> #include "i915_vma_resource.h"</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> @@ -116,93 +117,6 @@ struct drm_i915_gem_object_ops {</div>
<div class="ContentPasted0">> const char *name; /* friendly name for debug, e.g. lockdep classes */</div>
<div class="ContentPasted0">> };</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> -/**</div>
<div class="ContentPasted0">> - * enum i915_cache_level - The supported GTT caching values for system memory</div>
<div class="ContentPasted0">> - * pages.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * These translate to some special GTT PTE bits when binding pages into some</div>
<div class="ContentPasted0">> - * address space. It also determines whether an object, or rather its pages are</div>
<div class="ContentPasted0">> - * coherent with the GPU, when also reading or writing through the CPU cache</div>
<div class="ContentPasted0">> - * with those pages.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * Userspace can also control this through struct drm_i915_gem_caching.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> -enum i915_cache_level {</div>
<div class="ContentPasted0">> - /**</div>
<div class="ContentPasted0">> - * @I915_CACHE_NONE:</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * GPU access is not coherent with the CPU cache. If the cache is dirty</div>
<div class="ContentPasted0">> - * and we need the underlying pages to be coherent with some later GPU</div>
<div class="ContentPasted0">> - * access then we need to manually flush the pages.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * On shared LLC platforms reads and writes through the CPU cache are</div>
<div class="ContentPasted0">> - * still coherent even with this setting. See also</div>
<div class="ContentPasted0">> - * &drm_i915_gem_object.cache_coherent for more details. Due to this we</div>
<div class="ContentPasted0">> - * should only ever use uncached for scanout surfaces, otherwise we end</div>
<div class="ContentPasted0">> - * up over-flushing in some places.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * This is the default on non-LLC platforms.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - I915_CACHE_NONE = 0,</div>
<div class="ContentPasted0">> - /**</div>
<div class="ContentPasted0">> - * @I915_CACHE_LLC:</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * GPU access is coherent with the CPU cache. If the cache is dirty,</div>
<div class="ContentPasted0">> - * then the GPU will ensure that access remains coherent, when both</div>
<div class="ContentPasted0">> - * reading and writing through the CPU cache. GPU writes can dirty the</div>
<div class="ContentPasted0">> - * CPU cache.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * Not used for scanout surfaces.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * Applies to both platforms with shared LLC(HAS_LLC), and snooping</div>
<div class="ContentPasted0">> - * based platforms(HAS_SNOOP).</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * This is the default on shared LLC platforms. The only exception is</div>
<div class="ContentPasted0">> - * scanout objects, where the display engine is not coherent with the</div>
<div class="ContentPasted0">> - * CPU cache. For such objects I915_CACHE_NONE or I915_CACHE_WT is</div>
<div class="ContentPasted0">> - * automatically applied by the kernel in pin_for_display, if userspace</div>
<div class="ContentPasted0">> - * has not done so already.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - I915_CACHE_LLC,</div>
<div class="ContentPasted0">> - /**</div>
<div class="ContentPasted0">> - * @I915_CACHE_L3_LLC:</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * Explicitly enable the Gfx L3 cache, with coherent LLC.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * The Gfx L3 sits between the domain specific caches, e.g</div>
<div class="ContentPasted0">> - * sampler/render caches, and the larger LLC. LLC is coherent with the</div>
<div class="ContentPasted0">> - * GPU, but L3 is only visible to the GPU, so likely needs to be flushed</div>
<div class="ContentPasted0">> - * when the workload completes.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * Not used for scanout surfaces.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * Only exposed on some gen7 + GGTT. More recent hardware has dropped</div>
<div class="ContentPasted0">> - * this explicit setting, where it should now be enabled by default.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - I915_CACHE_L3_LLC,</div>
<div class="ContentPasted0">> - /**</div>
<div class="ContentPasted0">> - * @I915_CACHE_WT:</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * Write-through. Used for scanout surfaces.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * The GPU can utilise the caches, while still having the display engine</div>
<div class="ContentPasted0">> - * be coherent with GPU writes, as a result we don't need to flush the</div>
<div class="ContentPasted0">> - * CPU caches when moving out of the render domain. This is the default</div>
<div class="ContentPasted0">> - * setting chosen by the kernel, if supported by the HW, otherwise we</div>
<div class="ContentPasted0">> - * fallback to I915_CACHE_NONE. On the CPU side writes through the CPU</div>
<div class="ContentPasted0">> - * cache still need to be flushed, to remain coherent with the display</div>
<div class="ContentPasted0">> - * engine.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - I915_CACHE_WT,</div>
<div class="ContentPasted0">> - /**</div>
<div class="ContentPasted0">> - * @I915_MAX_CACHE_LEVEL:</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * Mark the last entry in the enum. Used for defining cachelevel_to_pat</div>
<div class="ContentPasted0">> - * array for cache_level to pat translation table.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - I915_MAX_CACHE_LEVEL,</div>
<div class="ContentPasted0">> -};</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> enum i915_map_type {</div>
<div class="ContentPasted0">> I915_MAP_WB = 0,</div>
<div class="ContentPasted0">> I915_MAP_WC,</div>
<div class="ContentPasted0">> @@ -375,6 +289,9 @@ struct drm_i915_gem_object {</div>
<div class="ContentPasted0">> unsigned int mem_flags;</div>
<div class="ContentPasted0">> #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */</div>
<div class="ContentPasted0">> #define I915_BO_FLAG_IOMEM BIT(1) /* Object backed by IO memory */</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + i915_cache_t cache_mode;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> /**</div>
<div class="ContentPasted0">> * @pat_index: The desired PAT index.</div>
<div class="ContentPasted0">> *</div>
<div class="ContentPasted0">> @@ -409,9 +326,7 @@ struct drm_i915_gem_object {</div>
<div class="ContentPasted0">> * Check for @pat_set_by_user to find out if an object has pat index set</div>
<div class="ContentPasted0">> * by userspace. The ioctl's to change cache settings have also been</div>
<div class="ContentPasted0">> * disabled for the objects with pat index set by userspace. Please don't</div>
<div class="ContentPasted0">> - * assume @cache_coherent having the flags set as describe here. A helper</div>
<div class="ContentPasted0">> - * function i915_gem_object_has_cache_level() provides one way to bypass</div>
<div class="ContentPasted0">> - * the use of this field.</div>
<div class="ContentPasted0">> + * assume @cache_coherent having the flags set as described here.</div>
<div class="ContentPasted0">> *</div>
<div class="ContentPasted0">> * Track whether the pages are coherent with the GPU if reading or</div>
<div class="ContentPasted0">> * writing through the CPU caches. The largely depends on the</div>
<div class="ContentPasted0">> @@ -492,9 +407,7 @@ struct drm_i915_gem_object {</div>
<div class="ContentPasted0">> * Check for @pat_set_by_user to find out if an object has pat index set</div>
<div class="ContentPasted0">> * by userspace. The ioctl's to change cache settings have also been</div>
<div class="ContentPasted0">> * disabled for the objects with pat_index set by userspace. Please don't</div>
<div class="ContentPasted0">> - * assume @cache_dirty is set as describe here. Also see helper function</div>
<div class="ContentPasted0">> - * i915_gem_object_has_cache_level() for possible ways to bypass the use</div>
<div class="ContentPasted0">> - * of this field.</div>
<div class="ContentPasted0">> + * assume @cache_dirty is set as described here.</div>
<div class="ContentPasted0">> *</div>
<div class="ContentPasted0">> * Track if we are we dirty with writes through the CPU cache for this</div>
<div class="ContentPasted0">> * object. As a result reading directly from main memory might yield</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c</div>
<div class="ContentPasted0">> index 3b094d36a0b0..a7012f1a9c70 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c</div>
<div class="ContentPasted0">> @@ -563,11 +563,8 @@ static void dbg_poison(struct i915_ggtt *ggtt,</div>
<div class="ContentPasted0">> while (size) {</div>
<div class="ContentPasted0">> void __iomem *s;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - ggtt->vm.insert_page(&ggtt->vm, addr,</div>
<div class="ContentPasted0">> - ggtt->error_capture.start,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(ggtt->vm.i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - 0);</div>
<div class="ContentPasted0">> + ggtt->vm.insert_page(&ggtt->vm, addr, ggtt->error_capture.start,</div>
<div class="ContentPasted0">> + ggtt->vm.i915->pat_uc, 0);</div>
<div class="ContentPasted0">> mb();</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> s = io_mapping_map_wc(&ggtt->iomap,</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c</div>
<div class="ContentPasted0">> index 7078af2f8f79..e794bd2a7ccb 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c</div>
<div class="ContentPasted0">> @@ -58,6 +58,16 @@ i915_ttm_cache_level(struct drm_i915_private *i915, struct ttm_resource *res,</div>
<div class="ContentPasted0">> I915_CACHE_NONE;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> +static unsigned int</div>
<div class="ContentPasted0">> +i915_ttm_cache_pat(struct drm_i915_private *i915, struct ttm_resource *res,</div>
<div class="ContentPasted0">> + struct ttm_tt *ttm)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> + return ((HAS_LLC(i915) || HAS_SNOOP(i915)) &&</div>
<div class="ContentPasted0">> + !i915_ttm_gtt_binds_lmem(res) &&</div>
<div class="ContentPasted0">> + ttm->caching == ttm_cached) ? i915->pat_wb :</div>
<div class="ContentPasted0">> + i915->pat_uc;</div>
<div class="ContentPasted0">> +}</div>
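<div class="ContentPasted0"> </div>
<div class="ContentPasted0">To check my reading of i915_ttm_cache_pat(), here is the selection rule restated as a standalone model. This is only a sketch: every demo_* name is hypothetical, the booleans stand in for HAS_LLC()/HAS_SNOOP(), i915_ttm_gtt_binds_lmem() and ttm->caching == ttm_cached, and the PAT index values are made up.</div>

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the platform and resource queries. */
struct demo_move_state {
	bool has_llc;     /* models HAS_LLC(i915) */
	bool has_snoop;   /* models HAS_SNOOP(i915) */
	bool binds_lmem;  /* models i915_ttm_gtt_binds_lmem(res) */
	bool ttm_cached;  /* models ttm->caching == ttm_cached */
};

/* Hypothetical stand-in for the cached i915->pat_uc / i915->pat_wb fields. */
struct demo_pat {
	unsigned int pat_uc;
	unsigned int pat_wb;
};

/* Write-back only for CPU-coherent, system-memory, cached pages; else uncached. */
static unsigned int demo_ttm_cache_pat(const struct demo_pat *pat,
				       const struct demo_move_state *s)
{
	bool coherent = s->has_llc || s->has_snoop;

	if (coherent && !s->binds_lmem && s->ttm_cached)
		return pat->pat_wb;

	return pat->pat_uc;
}

/* Small self-check covering one WB case and the three UC fallbacks. */
static void demo_ttm_cache_pat_selfcheck(void)
{
	const struct demo_pat pat = { .pat_uc = 3, .pat_wb = 0 };
	struct demo_move_state s = { true, false, false, true };

	assert(demo_ttm_cache_pat(&pat, &s) == pat.pat_wb);

	s.binds_lmem = true;            /* local memory: never WB */
	assert(demo_ttm_cache_pat(&pat, &s) == pat.pat_uc);

	s.binds_lmem = false;
	s.ttm_cached = false;           /* uncached TTM pages */
	assert(demo_ttm_cache_pat(&pat, &s) == pat.pat_uc);

	s.ttm_cached = true;
	s.has_llc = false;              /* neither LLC nor snoop */
	assert(demo_ttm_cache_pat(&pat, &s) == pat.pat_uc);
}
```

<div class="ContentPasted0">Calling demo_ttm_cache_pat_selfcheck() from a test main() exercises all four branches. The point of the model is that the result is picked directly from two precomputed indices, with no cache level to PAT translation at the call site.</div>
<div class="ContentPasted0"> </div>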
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> static struct intel_memory_region *</div>
<div class="ContentPasted0">> i915_ttm_region(struct ttm_device *bdev, int ttm_mem_type)</div>
<div class="ContentPasted0">> {</div>
<div class="ContentPasted0">> @@ -196,7 +206,7 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,</div>
<div class="ContentPasted0">> struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);</div>
<div class="ContentPasted0">> struct i915_request *rq;</div>
<div class="ContentPasted0">> struct ttm_tt *src_ttm = bo->ttm;</div>
<div class="ContentPasted0">> - enum i915_cache_level src_level, dst_level;</div>
<div class="ContentPasted0">> + unsigned int src_pat, dst_pat;</div>
<div class="ContentPasted0">> int ret;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> if (!to_gt(i915)->migrate.context || intel_gt_is_wedged(to_gt(i915)))</div>
<div class="ContentPasted0">> @@ -206,16 +216,15 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,</div>
<div class="ContentPasted0">> if (I915_SELFTEST_ONLY(fail_gpu_migration))</div>
<div class="ContentPasted0">> clear = true;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - dst_level = i915_ttm_cache_level(i915, dst_mem, dst_ttm);</div>
<div class="ContentPasted0">> + dst_pat = i915_ttm_cache_pat(i915, dst_mem, dst_ttm);</div>
<div class="ContentPasted0">> if (clear) {</div>
<div class="ContentPasted0">> if (bo->type == ttm_bo_type_kernel &&</div>
<div class="ContentPasted0">> !I915_SELFTEST_ONLY(fail_gpu_migration))</div>
<div class="ContentPasted0">> return ERR_PTR(-EINVAL);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> intel_engine_pm_get(to_gt(i915)->migrate.context->engine);</div>
<div class="ContentPasted0">> - ret = intel_context_migrate_clear(to_gt(i915)->migrate.context, deps,</div>
<div class="ContentPasted0">> - dst_st->sgl,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(i915, dst_level),</div>
<div class="ContentPasted0">> + ret = intel_context_migrate_clear(to_gt(i915)->migrate.context,</div>
<div class="ContentPasted0">> + deps, dst_st->sgl, dst_pat,</div>
<div class="ContentPasted0">> i915_ttm_gtt_binds_lmem(dst_mem),</div>
<div class="ContentPasted0">> 0, &rq);</div>
<div class="ContentPasted0">> } else {</div>
<div class="ContentPasted0">> @@ -225,14 +234,13 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,</div>
<div class="ContentPasted0">> if (IS_ERR(src_rsgt))</div>
<div class="ContentPasted0">> return ERR_CAST(src_rsgt);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - src_level = i915_ttm_cache_level(i915, bo->resource, src_ttm);</div>
<div class="ContentPasted0">> + src_pat = i915_ttm_cache_pat(i915, bo->resource, src_ttm);</div>
<div class="ContentPasted0">> intel_engine_pm_get(to_gt(i915)->migrate.context->engine);</div>
<div class="ContentPasted0">> ret = intel_context_migrate_copy(to_gt(i915)->migrate.context,</div>
<div class="ContentPasted0">> deps, src_rsgt->table.sgl,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(i915, src_level),</div>
<div class="ContentPasted0">> + src_pat,</div>
<div class="ContentPasted0">> i915_ttm_gtt_binds_lmem(bo->resource),</div>
<div class="ContentPasted0">> - dst_st->sgl,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(i915, dst_level),</div>
<div class="ContentPasted0">> + dst_st->sgl, dst_pat,</div>
<div class="ContentPasted0">> i915_ttm_gtt_binds_lmem(dst_mem),</div>
<div class="ContentPasted0">> &rq);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c</div>
<div class="ContentPasted0">> index df6c9a84252c..c8925918784e 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c</div>
<div class="ContentPasted0">> @@ -354,7 +354,7 @@ fake_huge_pages_object(struct drm_i915_private *i915, u64 size, bool single)</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> obj->write_domain = I915_GEM_DOMAIN_CPU;</div>
<div class="ContentPasted0">> obj->read_domains = I915_GEM_DOMAIN_CPU;</div>
<div class="ContentPasted0">> - obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);</div>
<div class="ContentPasted0">> + i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> return obj;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c</div>
<div class="ContentPasted0">> index c2bdc133c89a..fb69f667652a 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c</div>
<div class="ContentPasted0">> @@ -226,9 +226,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)</div>
<div class="ContentPasted0">> return ret;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> vm->scratch[0]->encode =</div>
<div class="ContentPasted0">> - vm->pte_encode(px_dma(vm->scratch[0]),</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(vm->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + vm->pte_encode(px_dma(vm->scratch[0]), vm->i915->pat_uc,</div>
<div class="ContentPasted0">> PTE_READ_ONLY);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> vm->scratch[1] = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c</div>
<div class="ContentPasted0">> index f948d33e5ec5..a6692ea1a91e 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c</div>
<div class="ContentPasted0">> @@ -40,16 +40,11 @@ static u64 gen8_pte_encode(dma_addr_t addr,</div>
<div class="ContentPasted0">> if (flags & PTE_LM)</div>
<div class="ContentPasted0">> pte |= GEN12_PPGTT_PTE_LM;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - /*</div>
<div class="ContentPasted0">> - * For pre-gen12 platforms pat_index is the same as enum</div>
<div class="ContentPasted0">> - * i915_cache_level, so the switch-case here is still valid.</div>
<div class="ContentPasted0">> - * See translation table defined by LEGACY_CACHELEVEL.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> switch (pat_index) {</div>
<div class="ContentPasted0">> - case I915_CACHE_NONE:</div>
<div class="ContentPasted0">> + case 0:</div>
<div class="ContentPasted0">> pte |= PPAT_UNCACHED;</div>
<div class="ContentPasted0">> break;</div>
<div class="ContentPasted0">> - case I915_CACHE_WT:</div>
<div class="ContentPasted0">> + case 3:</div>
<div class="ContentPasted0">> pte |= PPAT_DISPLAY_ELLC;</div>
<div class="ContentPasted0">> break;</div>
<div class="ContentPasted0">> default:</div>
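<div class="ContentPasted0"> </div>
<div class="ContentPasted0">The bare 0 and 3 match the values I915_CACHE_NONE and I915_CACHE_WT had in the enum removed above (NONE = 0, LLC = 1, L3_LLC = 2, WT = 3). A rough sketch of how named constants could keep that hint is below. The demo_* names and the bit patterns are placeholders rather than the real PPAT_* definitions, and the default arm is my assumption since it is not visible in this hunk.</div>

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit patterns, NOT the real PPAT_* values. */
#define DEMO_PPAT_UNCACHED     0x1u
#define DEMO_PPAT_DISPLAY_ELLC 0x2u
#define DEMO_PPAT_CACHED       0x4u

/* Pre-gen12 PAT indices, mirroring the values of the removed enum. */
enum demo_legacy_pat {
	DEMO_LEGACY_PAT_NONE   = 0,
	DEMO_LEGACY_PAT_LLC    = 1,
	DEMO_LEGACY_PAT_L3_LLC = 2,
	DEMO_LEGACY_PAT_WT     = 3,
};

/* Map a legacy PAT index to the (placeholder) PTE caching bits. */
static uint64_t demo_gen8_pat_bits(unsigned int pat_index)
{
	switch (pat_index) {
	case DEMO_LEGACY_PAT_NONE:
		return DEMO_PPAT_UNCACHED;
	case DEMO_LEGACY_PAT_WT:
		return DEMO_PPAT_DISPLAY_ELLC;
	default:
		/* Assumption: everything else takes the cached encoding. */
		return DEMO_PPAT_CACHED;
	}
}

/* Self-check: only indices 0 and 3 get special treatment. */
static void demo_gen8_pat_bits_selfcheck(void)
{
	assert(demo_gen8_pat_bits(0) == DEMO_PPAT_UNCACHED);
	assert(demo_gen8_pat_bits(3) == DEMO_PPAT_DISPLAY_ELLC);
	assert(demo_gen8_pat_bits(1) == DEMO_PPAT_CACHED);
	assert(demo_gen8_pat_bits(2) == DEMO_PPAT_CACHED);
}
```

<div class="ContentPasted0">Whether the constants live next to the encoder or in the new cache header is a matter of taste; the sketch only shows that the switch can stay readable without the enum.</div>
<div class="ContentPasted0"> </div>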
<div class="ContentPasted0">> @@ -853,9 +848,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)</div>
<div class="ContentPasted0">> pte_flags |= PTE_LM;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> vm->scratch[0]->encode =</div>
<div class="ContentPasted0">> - vm->pte_encode(px_dma(vm->scratch[0]),</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(vm->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + vm->pte_encode(px_dma(vm->scratch[0]), vm->i915->pat_uc,</div>
<div class="ContentPasted0">> pte_flags);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> for (i = 1; i <= vm->top; i++) {</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c</div>
<div class="ContentPasted0">> index dd0ed941441a..c97379cf8241 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c</div>
<div class="ContentPasted0">> @@ -921,9 +921,7 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)</div>
<div class="ContentPasted0">> pte_flags |= PTE_LM;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> ggtt->vm.scratch[0]->encode =</div>
<div class="ContentPasted0">> - ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]), i915->pat_uc,</div>
<div class="ContentPasted0">> pte_flags);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> return 0;</div>
<div class="ContentPasted0">> @@ -1297,10 +1295,8 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)</div>
<div class="ContentPasted0">> * ptes to be repopulated.</div>
<div class="ContentPasted0">> */</div>
<div class="ContentPasted0">> vma->resource->bound_flags = 0;</div>
<div class="ContentPasted0">> - vma->ops->bind_vma(vm, NULL, vma->resource,</div>
<div class="ContentPasted0">> - obj ? obj->pat_index :</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(vm->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + vma->ops->bind_vma(vm, NULL, vma->resource,</div>
<div class="ContentPasted0">> + obj ? obj->pat_index : vm->i915->pat_uc,</div>
<div class="ContentPasted0">> was_bound);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> if (obj) { /* only used during resume => exclusive access */</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c</div>
<div class="ContentPasted0">> index 6023288b0e2d..81f7834cc2db 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c</div>
<div class="ContentPasted0">> @@ -45,9 +45,7 @@ static void xehpsdv_toggle_pdes(struct i915_address_space *vm,</div>
<div class="ContentPasted0">> * Insert a dummy PTE into every PT that will map to LMEM to ensure</div>
<div class="ContentPasted0">> * we have a correctly setup PDE structure for later use.</div>
<div class="ContentPasted0">> */</div>
<div class="ContentPasted0">> - vm->insert_page(vm, 0, d->offset,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - PTE_LM);</div>
<div class="ContentPasted0">> + vm->insert_page(vm, 0, d->offset, vm->i915->pat_uc, PTE_LM);</div>
<div class="ContentPasted0">> GEM_BUG_ON(!pt->is_compact);</div>
<div class="ContentPasted0">> d->offset += SZ_2M;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> @@ -65,9 +63,7 @@ static void xehpsdv_insert_pte(struct i915_address_space *vm,</div>
<div class="ContentPasted0">> * alignment is 64K underneath for the pt, and we are careful</div>
<div class="ContentPasted0">> * not to access the space in the void.</div>
<div class="ContentPasted0">> */</div>
<div class="ContentPasted0">> - vm->insert_page(vm, px_dma(pt), d->offset,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - PTE_LM);</div>
<div class="ContentPasted0">> + vm->insert_page(vm, px_dma(pt), d->offset, vm->i915->pat_uc, PTE_LM);</div>
<div class="ContentPasted0">> d->offset += SZ_64K;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> @@ -77,8 +73,7 @@ static void insert_pte(struct i915_address_space *vm,</div>
<div class="ContentPasted0">> {</div>
<div class="ContentPasted0">> struct insert_pte_data *d = data;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - vm->insert_page(vm, px_dma(pt), d->offset,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + vm->insert_page(vm, px_dma(pt), d->offset, vm->i915->pat_uc,</div>
<div class="ContentPasted0">> i915_gem_object_is_lmem(pt->base) ? PTE_LM : 0);</div>
<div class="ContentPasted0">> d->offset += PAGE_SIZE;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c</div>
<div class="ContentPasted0">> index 3def5ca72dec..a67ede65d816 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/selftest_migrate.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c</div>
<div class="ContentPasted0">> @@ -904,8 +904,7 @@ static int perf_clear_blt(void *arg)</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> err = __perf_clear_blt(gt->migrate.context,</div>
<div class="ContentPasted0">> dst->mm.pages->sgl,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + gt->i915->pat_uc,</div>
<div class="ContentPasted0">> i915_gem_object_is_lmem(dst),</div>
<div class="ContentPasted0">> sizes[i]);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> @@ -995,12 +994,10 @@ static int perf_copy_blt(void *arg)</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> err = __perf_copy_blt(gt->migrate.context,</div>
<div class="ContentPasted0">> src->mm.pages->sgl,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + gt->i915->pat_uc,</div>
<div class="ContentPasted0">> i915_gem_object_is_lmem(src),</div>
<div class="ContentPasted0">> dst->mm.pages->sgl,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + gt->i915->pat_uc,</div>
<div class="ContentPasted0">> i915_gem_object_is_lmem(dst),</div>
<div class="ContentPasted0">> sz);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c</div>
<div class="ContentPasted0">> index 79aa6ac66ad2..327dc9294e0f 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/selftest_reset.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c</div>
<div class="ContentPasted0">> @@ -84,11 +84,8 @@ __igt_reset_stolen(struct intel_gt *gt,</div>
<div class="ContentPasted0">> void __iomem *s;</div>
<div class="ContentPasted0">> void *in;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - ggtt->vm.insert_page(&ggtt->vm, dma,</div>
<div class="ContentPasted0">> - ggtt->error_capture.start,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - 0);</div>
<div class="ContentPasted0">> + ggtt->vm.insert_page(&ggtt->vm, dma, ggtt->error_capture.start,</div>
<div class="ContentPasted0">> + gt->i915->pat_uc, 0);</div>
<div class="ContentPasted0">> mb();</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> s = io_mapping_map_wc(&ggtt->iomap,</div>
<div class="ContentPasted0">> @@ -127,11 +124,8 @@ __igt_reset_stolen(struct intel_gt *gt,</div>
<div class="ContentPasted0">> void *in;</div>
<div class="ContentPasted0">> u32 x;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - ggtt->vm.insert_page(&ggtt->vm, dma,</div>
<div class="ContentPasted0">> - ggtt->error_capture.start,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - 0);</div>
<div class="ContentPasted0">> + ggtt->vm.insert_page(&ggtt->vm, dma, ggtt->error_capture.start,</div>
<div class="ContentPasted0">> + gt->i915->pat_uc, 0);</div>
<div class="ContentPasted0">> mb();</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> s = io_mapping_map_wc(&ggtt->iomap,</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c</div>
<div class="ContentPasted0">> index 39c3ec12df1a..db64dc7d3fce 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/selftest_timeline.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c</div>
<div class="ContentPasted0">> @@ -836,7 +836,10 @@ static int setup_watcher(struct hwsp_watcher *w, struct intel_gt *gt,</div>
<div class="ContentPasted0">> return PTR_ERR(obj);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> /* keep the same cache settings as timeline */</div>
<div class="ContentPasted0">> - i915_gem_object_set_pat_index(obj, tl->hwsp_ggtt->obj->pat_index);</div>
<div class="ContentPasted0">> + obj->pat_index = tl->hwsp_ggtt->obj->pat_index;</div>
<div class="ContentPasted0">> + obj->cache_mode = tl->hwsp_ggtt->obj->cache_mode;</div>
<div class="ContentPasted0">> + __i915_gem_object_update_coherency(obj);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> w->map = i915_gem_object_pin_map_unlocked(obj,</div>
<div class="ContentPasted0">> page_unmask_bits(tl->hwsp_ggtt->obj->mm.mapping));</div>
<div class="ContentPasted0">> if (IS_ERR(w->map)) {</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/selftest_tlb.c b/drivers/gpu/drm/i915/gt/selftest_tlb.c</div>
<div class="ContentPasted0">> index 3bd6b540257b..6049f01be219 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/selftest_tlb.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/selftest_tlb.c</div>
<div class="ContentPasted0">> @@ -36,8 +36,6 @@ pte_tlbinv(struct intel_context *ce,</div>
<div class="ContentPasted0">> u64 length,</div>
<div class="ContentPasted0">> struct rnd_state *prng)</div>
<div class="ContentPasted0">> {</div>
<div class="ContentPasted0">> - const unsigned int pat_index =</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(ce->vm->i915, I915_CACHE_NONE);</div>
<div class="ContentPasted0">> struct drm_i915_gem_object *batch;</div>
<div class="ContentPasted0">> struct drm_mm_node vb_node;</div>
<div class="ContentPasted0">> struct i915_request *rq;</div>
<div class="ContentPasted0">> @@ -157,7 +155,8 @@ pte_tlbinv(struct intel_context *ce,</div>
<div class="ContentPasted0">> /* Flip the PTE between A and B */</div>
<div class="ContentPasted0">> if (i915_gem_object_is_lmem(vb->obj))</div>
<div class="ContentPasted0">> pte_flags |= PTE_LM;</div>
<div class="ContentPasted0">> - ce->vm->insert_entries(ce->vm, &vb_res, pat_index, pte_flags);</div>
<div class="ContentPasted0">> + ce->vm->insert_entries(ce->vm, &vb_res, ce->vm->i915->pat_uc,</div>
<div class="ContentPasted0">> + pte_flags);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> /* Flush the PTE update to concurrent HW */</div>
<div class="ContentPasted0">> tlbinv(ce->vm, addr & -length, length);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c</div>
<div class="ContentPasted0">> index d408856ae4c0..e099414d624d 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c</div>
<div class="ContentPasted0">> @@ -991,14 +991,10 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> if (ggtt->vm.raw_insert_entries)</div>
<div class="ContentPasted0">> ggtt->vm.raw_insert_entries(&ggtt->vm, vma_res,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(ggtt->vm.i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - pte_flags);</div>
<div class="ContentPasted0">> + ggtt->vm.i915->pat_uc, pte_flags);</div>
<div class="ContentPasted0">> else</div>
<div class="ContentPasted0">> ggtt->vm.insert_entries(&ggtt->vm, vma_res,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(ggtt->vm.i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - pte_flags);</div>
<div class="ContentPasted0">> + ggtt->vm.i915->pat_uc, pte_flags);</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> static void uc_fw_unbind_ggtt(struct intel_uc_fw *uc_fw)</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_cache.c b/drivers/gpu/drm/i915/i915_cache.c</div>
<div class="ContentPasted0">> new file mode 100644</div>
<div class="ContentPasted0">> index 000000000000..7a8002ebd2ec</div>
<div class="ContentPasted0">> --- /dev/null</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_cache.c</div>
<div class="ContentPasted0">> @@ -0,0 +1,59 @@</div>
<div class="ContentPasted0">> +/*</div>
<div class="ContentPasted0">> + * SPDX-License-Identifier: MIT</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Copyright © 2023 Intel Corporation</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">> +#include "i915_drv.h"</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +static int find_pat(const struct intel_device_info *info, i915_cache_t mode)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> + int i;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + for (i = 0; i < ARRAY_SIZE(info->cache_modes); i++) {</div>
<div class="ContentPasted0">> + if (info->cache_modes[i] == mode)</div>
<div class="ContentPasted0">> + return i;</div>
<div class="ContentPasted0">> + }</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + return -1;</div>
<div class="ContentPasted0">> +}</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +void i915_cache_init(struct drm_i915_private *i915)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> + int ret;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + ret = find_pat(INTEL_INFO(i915), I915_CACHE(UC));</div>
<div class="ContentPasted0">> + WARN_ON(ret < 0);</div>
<div class="ContentPasted0">> + drm_info(&i915->drm, "Using PAT index %u for uncached access\n", ret);</div>
<div class="ContentPasted0">> + i915->pat_uc = ret;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + ret = find_pat(INTEL_INFO(i915), I915_CACHE(WB));</div>
<div class="ContentPasted0">> + WARN_ON(ret < 0);</div>
<div class="ContentPasted0">> + drm_info(&i915->drm, "Using PAT index %u for write-back access\n", ret);</div>
<div class="ContentPasted0">> + i915->pat_wb = ret;</div>
<div class="ContentPasted0">> +}</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">I don't think we need the above two functions. Why don't we just hard-code</div>
<div class="ContentPasted0">pat_uc and pat_wb in device_info, plus pat_wt too? These are predetermined</div>
<div class="ContentPasted0">and used by the KMD only.</div>
<div><br class="ContentPasted0">
</div>
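<div class="ContentPasted0">To illustrate, here is a minimal standalone sketch of the hard-coded approach. The struct and the example_* names are hypothetical stand-ins, not the real intel_device_info layout; the index values simply mirror the legacy and gen12 tables in this patch.</div>
<div><br class="ContentPasted0">
</div>
<pre class="ContentPasted0">
```c
/* Hypothetical, simplified stand-in for struct intel_device_info. */
struct example_device_info {
	unsigned int pat_uc;	/* PAT index giving uncached access */
	unsigned int pat_wb;	/* PAT index giving write-back access */
	unsigned int pat_wt;	/* PAT index giving write-through access */
};

/* Indices mirror the legacy table in this patch: 0 = UC, 1 = WB, 3 = WT. */
static const struct example_device_info example_legacy_info = {
	.pat_uc = 0,
	.pat_wb = 1,
	.pat_wt = 3,
};

/* Indices mirror the gen12 table in this patch: 0 = WB, 2 = WT, 3 = UC. */
static const struct example_device_info example_gen12_info = {
	.pat_uc = 3,
	.pat_wb = 0,
	.pat_wt = 2,
};

/* Callers read the constant directly; no runtime search is involved. */
static unsigned int example_pat_uc(const struct example_device_info *info)
{
	return info->pat_uc;
}
```
</pre>
<div class="ContentPasted0">With something like this, the insert_entries/insert_page callers would read the index straight from the static per-platform data.</div>
<div><br class="ContentPasted0">
</div>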
<div class="ContentPasted0">> +int i915_cache_level_to_pat_and_mode(struct drm_i915_private *i915,</div>
<div class="ContentPasted0">> + unsigned int cache_level,</div>
<div class="ContentPasted0">> + i915_cache_t *mode)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> + const struct intel_device_info *info = INTEL_INFO(i915);</div>
<div class="ContentPasted0">> + i915_cache_t level_to_mode[] = {</div>
<div class="ContentPasted0">> + [I915_CACHE_NONE] = I915_CACHE(UC),</div>
<div class="ContentPasted0">> + [I915_CACHE_WT] = I915_CACHE(WT),</div>
<div class="ContentPasted0">> + [I915_CACHE_LLC] = I915_CACHE(WB),</div>
<div class="ContentPasted0">> + };</div>
<div class="ContentPasted0">> + int ret;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + if (GRAPHICS_VER(i915) >= 12)</div>
<div class="ContentPasted0">> + level_to_mode[I915_CACHE_L3_LLC] = I915_CACHE(WB);</div>
<div class="ContentPasted0">> + else</div>
<div class="ContentPasted0">> + level_to_mode[I915_CACHE_L3_LLC] = _I915_CACHE(WB, LLC);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + ret = find_pat(info, level_to_mode[cache_level]);</div>
<div class="ContentPasted0">> + if (ret >= 0 && mode)</div>
<div class="ContentPasted0">> + *mode = info->cache_modes[ret];</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + return ret;</div>
<div class="ContentPasted0">> +}</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_cache.h b/drivers/gpu/drm/i915/i915_cache.h</div>
<div class="ContentPasted0">> new file mode 100644</div>
<div class="ContentPasted0">> index 000000000000..0df03f1f01ef</div>
<div class="ContentPasted0">> --- /dev/null</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_cache.h</div>
<div class="ContentPasted0">> @@ -0,0 +1,129 @@</div>
<div class="ContentPasted0">> +/* SPDX-License-Identifier: MIT */</div>
<div class="ContentPasted0">> +/*</div>
<div class="ContentPasted0">> + * Copyright © 2023 Intel Corporation</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#ifndef __I915_CACHE_H__</div>
<div class="ContentPasted0">> +#define __I915_CACHE_H__</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#include <linux/types.h></div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +struct drm_i915_private;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +typedef u16 i915_cache_t;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#define I915_CACHE(mode) \</div>
<div class="ContentPasted0">> + (i915_cache_t)(I915_CACHE_MODE_##mode)</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#define _I915_CACHE(mode, flag) \</div>
<div class="ContentPasted0">> + (i915_cache_t)((I915_CACHE_MODE_##mode) | ( BIT(8 + I915_CACHE_##flag)))</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE(cache) \</div>
<div class="ContentPasted0">> + (unsigned int)(((i915_cache_t)(cache)) & 0xff)</div>
<div class="ContentPasted0">> +#define I915_CACHE_FLAGS(cache) \</div>
<div class="ContentPasted0">> + (unsigned int)((((i915_cache_t)(cache) & 0xff00)) >> 8)</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +/* Cache mode values */</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_UNKNOWN (0)</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_UC (1)</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_WB (2)</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_WT (3)</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_WC (4)</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Why do you need these CACHE_MODEs? Aren't they the same as i915_cache_level, which</div>
<div class="ContentPasted0">needs some sort of translation?</div>
<div><br class="ContentPasted0">
</div>
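<div class="ContentPasted0">For reference, this is how I read the encoding: the mode sits in the low byte and flag bit N sits at bit 8 + N, so the flag byte is recovered by shifting right by 8. A standalone sketch with hypothetical example_* names (not the driver's macros):</div>
<div><br class="ContentPasted0">
</div>
<pre class="ContentPasted0">
```c
/* Stand-in for the 16-bit i915_cache_t; all names here are hypothetical. */
typedef unsigned short example_cache_t;

#define EXAMPLE_MODE_WB		2u
#define EXAMPLE_FLAG_L3		0u	/* a flag bit number, not a mask */
#define EXAMPLE_FLAG_COH1W	1u

/* Mode goes in bits 0-7, flag bit N goes in bit 8 + N. */
static example_cache_t example_pack(unsigned int mode, unsigned int flag_bit)
{
	return (example_cache_t)(mode | (1u << (8 + flag_bit)));
}

/* Extract the mode from the low byte. */
static unsigned int example_mode(example_cache_t cache)
{
	return cache & 0xffu;
}

/* Extract the flag byte: mask the high byte, then shift it down by 8. */
static unsigned int example_flags(example_cache_t cache)
{
	return (cache & 0xff00u) >> 8;
}
```
</pre>
<div><br class="ContentPasted0">
</div>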
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +/* Mode flag bits */</div>
<div class="ContentPasted0">> +#define I915_CACHE_L3 (0)</div>
<div class="ContentPasted0">> +#define I915_CACHE_COH1W (1)</div>
<div class="ContentPasted0">> +#define I915_CACHE_COH2W (2)</div>
<div class="ContentPasted0">> +#define I915_CACHE_CLOS1 (3)</div>
<div class="ContentPasted0">> +#define I915_CACHE_CLOS2 (4)</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">These are already defined in drivers/gpu/drm/i915/gt/intel_gtt.h; why add new ones?</div>
<div class="ContentPasted0">The CLOS ones are not needed upstream unless we want to support PVC here.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +void i915_cache_init(struct drm_i915_private *i915);</div>
<div class="ContentPasted0">> +int i915_cache_level_to_pat_and_mode(struct drm_i915_private *i915,</div>
<div class="ContentPasted0">> + unsigned int cache_level,</div>
<div class="ContentPasted0">> + i915_cache_t *mode);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +/*</div>
<div class="ContentPasted0">> + * Legacy/kernel internal interface below:</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +/**</div>
<div class="ContentPasted0">> + * enum i915_cache_level - The supported GTT caching values for system memory</div>
<div class="ContentPasted0">> + * pages.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * These translate to some special GTT PTE bits when binding pages into some</div>
<div class="ContentPasted0">> + * address space. It also determines whether an object, or rather its pages are</div>
<div class="ContentPasted0">> + * coherent with the GPU, when also reading or writing through the CPU cache</div>
<div class="ContentPasted0">> + * with those pages.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Userspace can also control this through struct drm_i915_gem_caching.</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> +enum i915_cache_level {</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Shouldn't we completely get rid of this enum now? It should be replaced by</div>
<div class="ContentPasted0">INTEL_INFO(i915)->pat_uc/wb/wt.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> + /**</div>
<div class="ContentPasted0">> + * @I915_CACHE_NONE:</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * GPU access is not coherent with the CPU cache. If the cache is dirty</div>
<div class="ContentPasted0">> + * and we need the underlying pages to be coherent with some later GPU</div>
<div class="ContentPasted0">> + * access then we need to manually flush the pages.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * On shared LLC platforms reads and writes through the CPU cache are</div>
<div class="ContentPasted0">> + * still coherent even with this setting. See also</div>
<div class="ContentPasted0">> + * &drm_i915_gem_object.cache_coherent for more details. Due to this we</div>
<div class="ContentPasted0">> + * should only ever use uncached for scanout surfaces, otherwise we end</div>
<div class="ContentPasted0">> + * up over-flushing in some places.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * This is the default on non-LLC platforms.</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> + I915_CACHE_NONE = 0,</div>
<div class="ContentPasted0">> + /**</div>
<div class="ContentPasted0">> + * @I915_CACHE_LLC:</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * GPU access is coherent with the CPU cache. If the cache is dirty,</div>
<div class="ContentPasted0">> + * then the GPU will ensure that access remains coherent, when both</div>
<div class="ContentPasted0">> + * reading and writing through the CPU cache. GPU writes can dirty the</div>
<div class="ContentPasted0">> + * CPU cache.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Not used for scanout surfaces.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Applies to both platforms with shared LLC(HAS_LLC), and snooping</div>
<div class="ContentPasted0">> + * based platforms(HAS_SNOOP).</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * This is the default on shared LLC platforms. The only exception is</div>
<div class="ContentPasted0">> + * scanout objects, where the display engine is not coherent with the</div>
<div class="ContentPasted0">> + * CPU cache. For such objects I915_CACHE_NONE or I915_CACHE_WT is</div>
<div class="ContentPasted0">> + * automatically applied by the kernel in pin_for_display, if userspace</div>
<div class="ContentPasted0">> + * has not done so already.</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> + I915_CACHE_LLC,</div>
<div class="ContentPasted0">> + /**</div>
<div class="ContentPasted0">> + * @I915_CACHE_L3_LLC:</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Explicitly enable the Gfx L3 cache, with coherent LLC.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * The Gfx L3 sits between the domain specific caches, e.g</div>
<div class="ContentPasted0">> + * sampler/render caches, and the larger LLC. LLC is coherent with the</div>
<div class="ContentPasted0">> + * GPU, but L3 is only visible to the GPU, so likely needs to be flushed</div>
<div class="ContentPasted0">> + * when the workload completes.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Not used for scanout surfaces.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Only exposed on some gen7 + GGTT. More recent hardware has dropped</div>
<div class="ContentPasted0">> + * this explicit setting, where it should now be enabled by default.</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> + I915_CACHE_L3_LLC,</div>
<div class="ContentPasted0">> + /**</div>
<div class="ContentPasted0">> + * @I915_CACHE_WT:</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Write-through. Used for scanout surfaces.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * The GPU can utilise the caches, while still having the display engine</div>
<div class="ContentPasted0">> + * be coherent with GPU writes, as a result we don't need to flush the</div>
<div class="ContentPasted0">> + * CPU caches when moving out of the render domain. This is the default</div>
<div class="ContentPasted0">> + * setting chosen by the kernel, if supported by the HW, otherwise we</div>
<div class="ContentPasted0">> + * fallback to I915_CACHE_NONE. On the CPU side writes through the CPU</div>
<div class="ContentPasted0">> + * cache still need to be flushed, to remain coherent with the display</div>
<div class="ContentPasted0">> + * engine.</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> + I915_CACHE_WT,</div>
<div class="ContentPasted0">> +};</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#endif /* __I915_CACHE_H__ */</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c</div>
<div class="ContentPasted0">> index 76ccd4e03e31..e2da57397770 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_debugfs.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_debugfs.c</div>
<div class="ContentPasted0">> @@ -139,48 +139,37 @@ static const char *stringify_vma_type(const struct i915_vma *vma)</div>
<div class="ContentPasted0">> return "ppgtt";</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> -static const char *i915_cache_level_str(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> +static void obj_cache_str(struct seq_file *m, struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> {</div>
<div class="ContentPasted0">> - struct drm_i915_private *i915 = obj_to_i915(obj);</div>
<div class="ContentPasted0">> + const i915_cache_t cache = obj->cache_mode;</div>
<div class="ContentPasted0">> + const unsigned int mode = I915_CACHE_MODE(cache);</div>
<div class="ContentPasted0">> + const unsigned long flags = I915_CACHE_FLAGS(cache);</div>
<div class="ContentPasted0">> + static const char *mode_str[] = {</div>
<div class="ContentPasted0">> + [I915_CACHE_MODE_UC] = "UC",</div>
<div class="ContentPasted0">> + [I915_CACHE_MODE_WB] = "WB",</div>
<div class="ContentPasted0">> + [I915_CACHE_MODE_WT] = "WT",</div>
<div class="ContentPasted0">> + [I915_CACHE_MODE_WC] = "WC",</div>
<div class="ContentPasted0">> + };</div>
<div class="ContentPasted0">> + static const char *flag_str[] = {</div>
<div class="ContentPasted0">> + [I915_CACHE_L3] = "L3",</div>
<div class="ContentPasted0">> + [I915_CACHE_COH1W] = "1-Way-Coherent",</div>
<div class="ContentPasted0">> + [I915_CACHE_COH2W] = "2-Way-Coherent",</div>
<div class="ContentPasted0">> + [I915_CACHE_CLOS1] = "CLOS1",</div>
<div class="ContentPasted0">> + [I915_CACHE_CLOS2] = "CLOS2",</div>
<div class="ContentPasted0">> + };</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - if (IS_METEORLAKE(i915)) {</div>
<div class="ContentPasted0">> - switch (obj->pat_index) {</div>
<div class="ContentPasted0">> - case 0: return " WB";</div>
<div class="ContentPasted0">> - case 1: return " WT";</div>
<div class="ContentPasted0">> - case 2: return " UC";</div>
<div class="ContentPasted0">> - case 3: return " WB (1-Way Coh)";</div>
<div class="ContentPasted0">> - case 4: return " WB (2-Way Coh)";</div>
<div class="ContentPasted0">> - default: return " not defined";</div>
<div class="ContentPasted0">> - }</div>
<div class="ContentPasted0">> - } else if (IS_PONTEVECCHIO(i915)) {</div>
<div class="ContentPasted0">> - switch (obj->pat_index) {</div>
<div class="ContentPasted0">> - case 0: return " UC";</div>
<div class="ContentPasted0">> - case 1: return " WC";</div>
<div class="ContentPasted0">> - case 2: return " WT";</div>
<div class="ContentPasted0">> - case 3: return " WB";</div>
<div class="ContentPasted0">> - case 4: return " WT (CLOS1)";</div>
<div class="ContentPasted0">> - case 5: return " WB (CLOS1)";</div>
<div class="ContentPasted0">> - case 6: return " WT (CLOS2)";</div>
<div class="ContentPasted0">> - case 7: return " WT (CLOS2)";</div>
<div class="ContentPasted0">> - default: return " not defined";</div>
<div class="ContentPasted0">> - }</div>
<div class="ContentPasted0">> - } else if (GRAPHICS_VER(i915) >= 12) {</div>
<div class="ContentPasted0">> - switch (obj->pat_index) {</div>
<div class="ContentPasted0">> - case 0: return " WB";</div>
<div class="ContentPasted0">> - case 1: return " WC";</div>
<div class="ContentPasted0">> - case 2: return " WT";</div>
<div class="ContentPasted0">> - case 3: return " UC";</div>
<div class="ContentPasted0">> - default: return " not defined";</div>
<div class="ContentPasted0">> - }</div>
<div class="ContentPasted0">> + if (mode == I915_CACHE_MODE_UNKNOWN || mode >= ARRAY_SIZE(mode_str)) {</div>
<div class="ContentPasted0">> + if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">> + seq_printf(m, " PAT-%u", obj->pat_index);</div>
<div class="ContentPasted0">> + else</div>
<div class="ContentPasted0">> + seq_printf(m, " PAT-%u-???", obj->pat_index);</div>
<div class="ContentPasted0">> } else {</div>
<div class="ContentPasted0">> - switch (obj->pat_index) {</div>
<div class="ContentPasted0">> - case 0: return " UC";</div>
<div class="ContentPasted0">> - case 1: return HAS_LLC(i915) ?</div>
<div class="ContentPasted0">> - " LLC" : " snooped";</div>
<div class="ContentPasted0">> - case 2: return " L3+LLC";</div>
<div class="ContentPasted0">> - case 3: return " WT";</div>
<div class="ContentPasted0">> - default: return " not defined";</div>
<div class="ContentPasted0">> - }</div>
<div class="ContentPasted0">> + unsigned long bit;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + seq_printf(m, " %s", mode_str[mode]);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + for_each_set_bit(bit, &flags, ARRAY_SIZE(flag_str))</div>
<div class="ContentPasted0">> + seq_printf(m, "-%s", flag_str[bit]);</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> @@ -190,17 +179,23 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> struct i915_vma *vma;</div>
<div class="ContentPasted0">> int pin_count = 0;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - seq_printf(m, "%pK: %c%c%c %8zdKiB %02x %02x %s%s%s",</div>
<div class="ContentPasted0">> + seq_printf(m, "%pK: %c%c%c %8zdKiB %02x %02x",</div>
<div class="ContentPasted0">> &obj->base,</div>
<div class="ContentPasted0">> get_tiling_flag(obj),</div>
<div class="ContentPasted0">> get_global_flag(obj),</div>
<div class="ContentPasted0">> get_pin_mapped_flag(obj),</div>
<div class="ContentPasted0">> obj->base.size / 1024,</div>
<div class="ContentPasted0">> obj->read_domains,</div>
<div class="ContentPasted0">> - obj->write_domain,</div>
<div class="ContentPasted0">> - i915_cache_level_str(obj),</div>
<div class="ContentPasted0">> - obj->mm.dirty ? " dirty" : "",</div>
<div class="ContentPasted0">> - obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : "");</div>
<div class="ContentPasted0">> + obj->write_domain);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + obj_cache_str(m, obj);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + if (obj->mm.dirty)</div>
<div class="ContentPasted0">> + seq_puts(m, " dirty");</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> + if (obj->mm.madv == I915_MADV_DONTNEED)</div>
<div class="ContentPasted0">> + seq_puts(m, " purgeable");</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> if (obj->base.name)</div>
<div class="ContentPasted0">> seq_printf(m, " (name: %d)", obj->base.name);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c</div>
<div class="ContentPasted0">> index 222d0a1f3b55..deab26752ba4 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_driver.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_driver.c</div>
<div class="ContentPasted0">> @@ -80,6 +80,7 @@</div>
<div class="ContentPasted0">> #include "soc/intel_dram.h"</div>
<div class="ContentPasted0">> #include "soc/intel_gmch.h"</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">> #include "i915_debugfs.h"</div>
<div class="ContentPasted0">> #include "i915_driver.h"</div>
<div class="ContentPasted0">> #include "i915_drm_client.h"</div>
<div class="ContentPasted0">> @@ -267,6 +268,8 @@ static int i915_driver_early_probe(struct drm_i915_private *dev_priv)</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> intel_detect_preproduction_hw(dev_priv);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> + i915_cache_init(dev_priv);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> return 0;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> err_rootgt:</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h</div>
<div class="ContentPasted0">> index b4cf6f0f636d..cb1c0c9d98ef 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_drv.h</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_drv.h</div>
<div class="ContentPasted0">> @@ -251,6 +251,9 @@ struct drm_i915_private {</div>
<div class="ContentPasted0">> unsigned int hpll_freq;</div>
<div class="ContentPasted0">> unsigned int czclk_freq;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> + unsigned int pat_uc;</div>
<div class="ContentPasted0">> + unsigned int pat_wb;</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">How about making these part of INTEL_INFO(i915)? They are predetermined, so there is no need for them to be</div>
<div class="ContentPasted0">dynamic.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> /**</div>
<div class="ContentPasted0">> * wq - Driver workqueue for GEM.</div>
<div class="ContentPasted0">> *</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c</div>
<div class="ContentPasted0">> index 7ae42f746cc2..9aae75862e6f 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_gem.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_gem.c</div>
<div class="ContentPasted0">> @@ -422,9 +422,7 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> ggtt->vm.insert_page(&ggtt->vm,</div>
<div class="ContentPasted0">> i915_gem_object_get_dma_address(obj,</div>
<div class="ContentPasted0">> offset >> PAGE_SHIFT),</div>
<div class="ContentPasted0">> - node.start,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE), 0);</div>
<div class="ContentPasted0">> + node.start, i915->pat_uc, 0);</div>
<div class="ContentPasted0">> } else {</div>
<div class="ContentPasted0">> page_base += offset & PAGE_MASK;</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> @@ -603,9 +601,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> ggtt->vm.insert_page(&ggtt->vm,</div>
<div class="ContentPasted0">> i915_gem_object_get_dma_address(obj,</div>
<div class="ContentPasted0">> offset >> PAGE_SHIFT),</div>
<div class="ContentPasted0">> - node.start,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE), 0);</div>
<div class="ContentPasted0">> + node.start, i915->pat_uc, 0);</div>
<div class="ContentPasted0">> wmb(); /* flush modifications to the GGTT (insert_page) */</div>
<div class="ContentPasted0">> } else {</div>
<div class="ContentPasted0">> page_base += offset & PAGE_MASK;</div>
<div class="ContentPasted0">> @@ -1148,19 +1144,6 @@ int i915_gem_init(struct drm_i915_private *dev_priv)</div>
<div class="ContentPasted0">> unsigned int i;</div>
<div class="ContentPasted0">> int ret;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - /*</div>
<div class="ContentPasted0">> - * In the proccess of replacing cache_level with pat_index a tricky</div>
<div class="ContentPasted0">> - * dependency is created on the definition of the enum i915_cache_level.</div>
<div class="ContentPasted0">> - * in case this enum is changed, PTE encode would be broken.</div>
<div class="ContentPasted0">> - * Add a WARNING here. And remove when we completely quit using this</div>
<div class="ContentPasted0">> - * enum</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> - BUILD_BUG_ON(I915_CACHE_NONE != 0 ||</div>
<div class="ContentPasted0">> - I915_CACHE_LLC != 1 ||</div>
<div class="ContentPasted0">> - I915_CACHE_L3_LLC != 2 ||</div>
<div class="ContentPasted0">> - I915_CACHE_WT != 3 ||</div>
<div class="ContentPasted0">> - I915_MAX_CACHE_LEVEL != 4);</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> /* We need to fallback to 4K pages if host doesn't support huge gtt. */</div>
<div class="ContentPasted0">> if (intel_vgpu_active(dev_priv) && !intel_vgpu_has_huge_gtt(dev_priv))</div>
<div class="ContentPasted0">> RUNTIME_INFO(dev_priv)->page_sizes = I915_GTT_PAGE_SIZE_4K;</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c</div>
<div class="ContentPasted0">> index 4749f99e6320..fad336a45699 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_gpu_error.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c</div>
<div class="ContentPasted0">> @@ -1122,14 +1122,11 @@ i915_vma_coredump_create(const struct intel_gt *gt,</div>
<div class="ContentPasted0">> mutex_lock(&ggtt->error_mutex);</div>
<div class="ContentPasted0">> if (ggtt->vm.raw_insert_page)</div>
<div class="ContentPasted0">> ggtt->vm.raw_insert_page(&ggtt->vm, dma, slot,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + ggtt->vm.i915->pat_uc,</div>
<div class="ContentPasted0">> 0);</div>
<div class="ContentPasted0">> else</div>
<div class="ContentPasted0">> ggtt->vm.insert_page(&ggtt->vm, dma, slot,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - 0);</div>
<div class="ContentPasted0">> + ggtt->vm.i915->pat_uc, 0);</div>
<div class="ContentPasted0">> mb();</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c</div>
<div class="ContentPasted0">> index 3d7a5db9833b..fbdce31afeb1 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_pci.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_pci.c</div>
<div class="ContentPasted0">> @@ -32,6 +32,7 @@</div>
<div class="ContentPasted0">> #include "gt/intel_sa_media.h"</div>
<div class="ContentPasted0">> #include "gem/i915_gem_object_types.h"</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">> #include "i915_driver.h"</div>
<div class="ContentPasted0">> #include "i915_drv.h"</div>
<div class="ContentPasted0">> #include "i915_pci.h"</div>
<div class="ContentPasted0">> @@ -46,36 +47,42 @@ __diag_ignore_all("-Woverride-init", "Allow overriding inherited members");</div>
<div class="ContentPasted0">> .__runtime.graphics.ip.ver = (x), \</div>
<div class="ContentPasted0">> .__runtime.media.ip.ver = (x)</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> -#define LEGACY_CACHELEVEL \</div>
<div class="ContentPasted0">> - .cachelevel_to_pat = { \</div>
<div class="ContentPasted0">> - [I915_CACHE_NONE] = 0, \</div>
<div class="ContentPasted0">> - [I915_CACHE_LLC] = 1, \</div>
<div class="ContentPasted0">> - [I915_CACHE_L3_LLC] = 2, \</div>
<div class="ContentPasted0">> - [I915_CACHE_WT] = 3, \</div>
<div class="ContentPasted0">> +/* TODO/QQQ index 1 & 2 */</div>
<div class="ContentPasted0">> +#define LEGACY_CACHE_MODES \</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">I was thinking of just putting the PAT settings here instead of cache_modes, simply:</div>
<div class="ContentPasted0"> .pat = {\</div>
<div class="ContentPasted0"> GEN8_PPAT_WB, \</div>
<div class="ContentPasted0"> GEN8_PPAT_WC, \</div>
<div class="ContentPasted0"> GEN8_PPAT_WT, \</div>
<div class="ContentPasted0"> GEN8_PPAT_UC, \</div>
<div class="ContentPasted0"> }</div>
<div><br class="ContentPasted0">
</div>
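<div class="ContentPasted0">A rough standalone sketch of how such a table could be consulted. The enum, struct and function names are hypothetical, and the memory-type values are placeholders rather than the real GEN8_PPAT_* register encodings:</div>
<div><br class="ContentPasted0">
</div>
<pre class="ContentPasted0">
```c
/* Placeholder memory types for this sketch only. */
enum example_pat_type {
	EXAMPLE_PAT_UC,
	EXAMPLE_PAT_WC,
	EXAMPLE_PAT_WT,
	EXAMPLE_PAT_WB,
};

#define EXAMPLE_MAX_PAT 8

/* Hypothetical per-platform table: entry i describes PAT index i. */
struct example_pat_table {
	unsigned int count;
	enum example_pat_type pat[EXAMPLE_MAX_PAT];
};

/* Same order as the .pat list suggested above: WB, WC, WT, UC. */
static const struct example_pat_table example_table = {
	.count = 4,
	.pat = {
		EXAMPLE_PAT_WB,
		EXAMPLE_PAT_WC,
		EXAMPLE_PAT_WT,
		EXAMPLE_PAT_UC,
	},
};

/* Return the first PAT index with the requested type, or -1 if none. */
static int example_find_pat(const struct example_pat_table *t,
			    enum example_pat_type type)
{
	unsigned int i;

	for (i = 0; i < t->count; i++) {
		if (t->pat[i] == type)
			return (int)i;
	}

	return -1;
}
```
</pre>
<div><br class="ContentPasted0">
</div>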
<div class="ContentPasted0">> + .cache_modes = { \</div>
<div class="ContentPasted0">> + [0] = I915_CACHE(UC), \</div>
<div class="ContentPasted0">> + [1] = I915_CACHE(WB), \</div>
<div class="ContentPasted0">> + [2] = _I915_CACHE(WB, L3), \</div>
<div class="ContentPasted0">> + [3] = I915_CACHE(WT), \</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> -#define TGL_CACHELEVEL \</div>
<div class="ContentPasted0">> - .cachelevel_to_pat = { \</div>
<div class="ContentPasted0">> - [I915_CACHE_NONE] = 3, \</div>
<div class="ContentPasted0">> - [I915_CACHE_LLC] = 0, \</div>
<div class="ContentPasted0">> - [I915_CACHE_L3_LLC] = 0, \</div>
<div class="ContentPasted0">> - [I915_CACHE_WT] = 2, \</div>
<div class="ContentPasted0">> +#define GEN12_CACHE_MODES \</div>
<div class="ContentPasted0">> + .cache_modes = { \</div>
<div class="ContentPasted0">> + [0] = I915_CACHE(WB), \</div>
<div class="ContentPasted0">> + [1] = I915_CACHE(WC), \</div>
<div class="ContentPasted0">> + [2] = I915_CACHE(WT), \</div>
<div class="ContentPasted0">> + [3] = I915_CACHE(UC), \</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> -#define PVC_CACHELEVEL \</div>
<div class="ContentPasted0">> - .cachelevel_to_pat = { \</div>
<div class="ContentPasted0">> - [I915_CACHE_NONE] = 0, \</div>
<div class="ContentPasted0">> - [I915_CACHE_LLC] = 3, \</div>
<div class="ContentPasted0">> - [I915_CACHE_L3_LLC] = 3, \</div>
<div class="ContentPasted0">> - [I915_CACHE_WT] = 2, \</div>
<div class="ContentPasted0">> +#define PVC_CACHE_MODES \</div>
<div class="ContentPasted0">> + .cache_modes = { \</div>
<div class="ContentPasted0">> + [0] = I915_CACHE(UC), \</div>
<div class="ContentPasted0">> + [1] = I915_CACHE(WC), \</div>
<div class="ContentPasted0">> + [2] = I915_CACHE(WT), \</div>
<div class="ContentPasted0">> + [3] = I915_CACHE(WB), \</div>
<div class="ContentPasted0">> + [4] = _I915_CACHE(WT, CLOS1), \</div>
<div class="ContentPasted0">> + [5] = _I915_CACHE(WB, CLOS1), \</div>
<div class="ContentPasted0">> + [6] = _I915_CACHE(WT, CLOS2), \</div>
<div class="ContentPasted0">> + [7] = _I915_CACHE(WB, CLOS2), \</div>
<div class="ContentPasted0">> }</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0"> .pat = {\</div>
<div class="ContentPasted0"> GEN8_PPAT_UC, \</div>
<div class="ContentPasted0"> GEN8_PPAT_WC, \</div>
<div class="ContentPasted0"> GEN8_PPAT_WT, \</div>
<div class="ContentPasted0"> GEN8_PPAT_WB, \</div>
<div class="ContentPasted0"> GEN12_PPAT_CLOS(1) | GEN8_PPAT_WT, \</div>
<div class="ContentPasted0"> GEN12_PPAT_CLOS(1) | GEN8_PPAT_WB, \</div>
<div class="ContentPasted0"> GEN12_PPAT_CLOS(2) | GEN8_PPAT_WT, \</div>
<div class="ContentPasted0"> GEN12_PPAT_CLOS(2) | GEN8_PPAT_WB, \</div>
<div class="ContentPasted0"> }</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> -#define MTL_CACHELEVEL \</div>
<div class="ContentPasted0">> - .cachelevel_to_pat = { \</div>
<div class="ContentPasted0">> - [I915_CACHE_NONE] = 2, \</div>
<div class="ContentPasted0">> - [I915_CACHE_LLC] = 3, \</div>
<div class="ContentPasted0">> - [I915_CACHE_L3_LLC] = 3, \</div>
<div class="ContentPasted0">> - [I915_CACHE_WT] = 1, \</div>
<div class="ContentPasted0">> +#define MTL_CACHE_MODES \</div>
<div class="ContentPasted0">> + .cache_modes = { \</div>
<div class="ContentPasted0">> + [0] = I915_CACHE(WB), \</div>
<div class="ContentPasted0">> + [1] = I915_CACHE(WT), \</div>
<div class="ContentPasted0">> + [2] = I915_CACHE(UC), \</div>
<div class="ContentPasted0">> + [3] = _I915_CACHE(WB, COH1W), \</div>
<div class="ContentPasted0">> + [4] = _I915_CACHE(WB, COH2W), \</div>
<div class="ContentPasted0">> }</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0"> .pat = {\</div>
<div class="ContentPasted0"> MTL_PPAT_L4_0_WB, \</div>
<div class="ContentPasted0"> MTL_PPAT_L4_1_WT, \</div>
<div class="ContentPasted0"> MTL_PPAT_L4_3_UC, \</div>
<div class="ContentPasted0"> MTL_PPAT_L4_0_WB | MTL_2_COH_1W, \</div>
<div class="ContentPasted0"> MTL_PPAT_L4_0_WB | MTL_3_COH_2W, \</div>
<div class="ContentPasted0"> }</div>
<div><br class="ContentPasted0">
</div>
<div><br class="ContentPasted0">
</div>
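<div class="ContentPasted0">A minimal sketch of how a per-platform .pat table like the ones above could be consumed. The SKETCH_* encodings and the sketch_find_pat_index() helper are hypothetical, made up for illustration, and are not taken from the patch or the driver headers. The table holds raw PAT values by index, and a linear scan finds the index that carries a wanted setting:</div>
<pre class="ContentPasted0">
```c
/*
 * Hypothetical encodings, for illustration only. The real GEN8_PPAT_* and
 * GEN12_PPAT_CLOS() definitions live in the driver headers and may differ.
 */
#define SKETCH_PPAT_UC      0u
#define SKETCH_PPAT_WC      1u
#define SKETCH_PPAT_WT      2u
#define SKETCH_PPAT_WB      3u
#define SKETCH_PPAT_CLOS(x) ((unsigned int)(x) << 2)

#define SKETCH_ARRAY_SIZE(a) ((unsigned int)(sizeof(a) / sizeof((a)[0])))

/* Raw PAT values indexed by PAT index, mirroring the PVC layout above. */
static const unsigned int sketch_pvc_pat[] = {
	SKETCH_PPAT_UC,
	SKETCH_PPAT_WC,
	SKETCH_PPAT_WT,
	SKETCH_PPAT_WB,
	SKETCH_PPAT_CLOS(1) | SKETCH_PPAT_WT,
	SKETCH_PPAT_CLOS(1) | SKETCH_PPAT_WB,
	SKETCH_PPAT_CLOS(2) | SKETCH_PPAT_WT,
	SKETCH_PPAT_CLOS(2) | SKETCH_PPAT_WB,
};

/* Return the first PAT index whose value equals want, or -1 if none does. */
static int sketch_find_pat_index(const unsigned int *pat, unsigned int count,
				 unsigned int want)
{
	unsigned int i;

	for (i = 0; i < count; i++) {
		if (pat[i] == want)
			return (int)i;
	}

	return -1;
}
```
</pre>
<div class="ContentPasted0">With a lookup along these lines, a value such as the i915->pat_uc used in the quoted patch could presumably be resolved once at probe time rather than on every call.</div>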
<div class="ContentPasted0">> /* Keep in gen based order, and chronological order within a gen */</div>
<div class="ContentPasted0">> @@ -100,7 +107,7 @@ __diag_ignore_all("-Woverride-init", "Allow overriding inherited members");</div>
<div class="ContentPasted0">> .max_pat_index = 3, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #define I845_FEATURES \</div>
<div class="ContentPasted0">> GEN(2), \</div>
<div class="ContentPasted0">> @@ -115,7 +122,7 @@ __diag_ignore_all("-Woverride-init", "Allow overriding inherited members");</div>
<div class="ContentPasted0">> .max_pat_index = 3, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> static const struct intel_device_info i830_info = {</div>
<div class="ContentPasted0">> I830_FEATURES,</div>
<div class="ContentPasted0">> @@ -148,7 +155,7 @@ static const struct intel_device_info i865g_info = {</div>
<div class="ContentPasted0">> .max_pat_index = 3, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> static const struct intel_device_info i915g_info = {</div>
<div class="ContentPasted0">> GEN3_FEATURES,</div>
<div class="ContentPasted0">> @@ -211,7 +218,7 @@ static const struct intel_device_info pnv_m_info = {</div>
<div class="ContentPasted0">> .max_pat_index = 3, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> static const struct intel_device_info i965g_info = {</div>
<div class="ContentPasted0">> GEN4_FEATURES,</div>
<div class="ContentPasted0">> @@ -255,7 +262,7 @@ static const struct intel_device_info gm45_info = {</div>
<div class="ContentPasted0">> .max_pat_index = 3, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> static const struct intel_device_info ilk_d_info = {</div>
<div class="ContentPasted0">> GEN5_FEATURES,</div>
<div class="ContentPasted0">> @@ -285,7 +292,7 @@ static const struct intel_device_info ilk_m_info = {</div>
<div class="ContentPasted0">> .__runtime.ppgtt_size = 31, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #define SNB_D_PLATFORM \</div>
<div class="ContentPasted0">> GEN6_FEATURES, \</div>
<div class="ContentPasted0">> @@ -333,7 +340,7 @@ static const struct intel_device_info snb_m_gt2_info = {</div>
<div class="ContentPasted0">> .__runtime.ppgtt_size = 31, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #define IVB_D_PLATFORM \</div>
<div class="ContentPasted0">> GEN7_FEATURES, \</div>
<div class="ContentPasted0">> @@ -390,7 +397,7 @@ static const struct intel_device_info vlv_info = {</div>
<div class="ContentPasted0">> .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0),</div>
<div class="ContentPasted0">> GEN_DEFAULT_PAGE_SIZES,</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS,</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL,</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> };</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #define G75_FEATURES \</div>
<div class="ContentPasted0">> @@ -476,7 +483,7 @@ static const struct intel_device_info chv_info = {</div>
<div class="ContentPasted0">> .has_coherent_ggtt = false,</div>
<div class="ContentPasted0">> GEN_DEFAULT_PAGE_SIZES,</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS,</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL,</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> };</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #define GEN9_DEFAULT_PAGE_SIZES \</div>
<div class="ContentPasted0">> @@ -539,7 +546,7 @@ static const struct intel_device_info skl_gt4_info = {</div>
<div class="ContentPasted0">> .max_pat_index = 3, \</div>
<div class="ContentPasted0">> GEN9_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">> GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> - LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> + LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> static const struct intel_device_info bxt_info = {</div>
<div class="ContentPasted0">> GEN9_LP_FEATURES,</div>
<div class="ContentPasted0">> @@ -643,7 +650,7 @@ static const struct intel_device_info jsl_info = {</div>
<div class="ContentPasted0">> #define GEN12_FEATURES \</div>
<div class="ContentPasted0">> GEN11_FEATURES, \</div>
<div class="ContentPasted0">> GEN(12), \</div>
<div class="ContentPasted0">> - TGL_CACHELEVEL, \</div>
<div class="ContentPasted0">> + GEN12_CACHE_MODES, \</div>
<div class="ContentPasted0">> .has_global_mocs = 1, \</div>
<div class="ContentPasted0">> .has_pxp = 1, \</div>
<div class="ContentPasted0">> .max_pat_index = 3</div>
<div class="ContentPasted0">> @@ -711,7 +718,7 @@ static const struct intel_device_info adl_p_info = {</div>
<div class="ContentPasted0">> .__runtime.graphics.ip.ver = 12, \</div>
<div class="ContentPasted0">> .__runtime.graphics.ip.rel = 50, \</div>
<div class="ContentPasted0">> XE_HP_PAGE_SIZES, \</div>
<div class="ContentPasted0">> - TGL_CACHELEVEL, \</div>
<div class="ContentPasted0">> + GEN12_CACHE_MODES, \</div>
<div class="ContentPasted0">> .dma_mask_size = 46, \</div>
<div class="ContentPasted0">> .has_3d_pipeline = 1, \</div>
<div class="ContentPasted0">> .has_64bit_reloc = 1, \</div>
<div class="ContentPasted0">> @@ -806,7 +813,7 @@ static const struct intel_device_info pvc_info = {</div>
<div class="ContentPasted0">> BIT(VCS0) |</div>
<div class="ContentPasted0">> BIT(CCS0) | BIT(CCS1) | BIT(CCS2) | BIT(CCS3),</div>
<div class="ContentPasted0">> .require_force_probe = 1,</div>
<div class="ContentPasted0">> - PVC_CACHELEVEL,</div>
<div class="ContentPasted0">> + PVC_CACHE_MODES</div>
<div class="ContentPasted0">> };</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> static const struct intel_gt_definition xelpmp_extra_gt[] = {</div>
<div class="ContentPasted0">> @@ -841,7 +848,7 @@ static const struct intel_device_info mtl_info = {</div>
<div class="ContentPasted0">> .__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,</div>
<div class="ContentPasted0">> .__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),</div>
<div class="ContentPasted0">> .require_force_probe = 1,</div>
<div class="ContentPasted0">> - MTL_CACHELEVEL,</div>
<div class="ContentPasted0">> + MTL_CACHE_MODES</div>
<div class="ContentPasted0">> };</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #undef PLATFORM</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h</div>
<div class="ContentPasted0">> index 069291b3bd37..5cbae7c2ee30 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/intel_device_info.h</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/intel_device_info.h</div>
<div class="ContentPasted0">> @@ -27,6 +27,8 @@</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #include <uapi/drm/i915_drm.h></div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> #include "intel_step.h"</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> #include "display/intel_display_device.h"</div>
<div class="ContentPasted0">> @@ -248,8 +250,8 @@ struct intel_device_info {</div>
<div class="ContentPasted0">> */</div>
<div class="ContentPasted0">> const struct intel_runtime_info __runtime;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - u32 cachelevel_to_pat[I915_MAX_CACHE_LEVEL];</div>
<div class="ContentPasted0">> - u32 max_pat_index;</div>
<div class="ContentPasted0">> + i915_cache_t cache_modes[9];</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0"> u32 pat[16];</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">See https://gfxspecs.intel.com/Predator/Home/Index/63019: the index field is PAT[3..0], i.e. 4 bits, hence 16 entries.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">-Fei</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> + unsigned int max_pat_index;</div>
<div class="ContentPasted0">> };</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> struct intel_driver_caps {</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c</div>
<div class="ContentPasted0">> index 61da4ed9d521..e620f73793a5 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/i915_gem.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/i915_gem.c</div>
<div class="ContentPasted0">> @@ -57,10 +57,7 @@ static void trash_stolen(struct drm_i915_private *i915)</div>
<div class="ContentPasted0">> u32 __iomem *s;</div>
<div class="ContentPasted0">> int x;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> - ggtt->vm.insert_page(&ggtt->vm, dma, slot,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - 0);</div>
<div class="ContentPasted0">> + ggtt->vm.insert_page(&ggtt->vm, dma, slot, i915->pat_uc, 0);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> s = io_mapping_map_atomic_wc(&ggtt->iomap, slot);</div>
<div class="ContentPasted0">> for (x = 0; x < PAGE_SIZE / sizeof(u32); x++) {</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c</div>
<div class="ContentPasted0">> index f8fe3681c3dc..658a5b59545e 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c</div>
<div class="ContentPasted0">> @@ -246,7 +246,7 @@ static int igt_evict_for_cache_color(void *arg)</div>
<div class="ContentPasted0">> struct drm_mm_node target = {</div>
<div class="ContentPasted0">> .start = I915_GTT_PAGE_SIZE * 2,</div>
<div class="ContentPasted0">> .size = I915_GTT_PAGE_SIZE,</div>
<div class="ContentPasted0">> - .color = i915_gem_get_pat_index(gt->i915, I915_CACHE_LLC),</div>
<div class="ContentPasted0">> + .color = I915_CACHE(WB),</div>
<div class="ContentPasted0">> };</div>
<div class="ContentPasted0">> struct drm_i915_gem_object *obj;</div>
<div class="ContentPasted0">> struct i915_vma *vma;</div>
<div class="ContentPasted0">> @@ -309,7 +309,7 @@ static int igt_evict_for_cache_color(void *arg)</div>
<div class="ContentPasted0">> /* Attempt to remove the first *pinned* vma, by removing the (empty)</div>
<div class="ContentPasted0">> * neighbour -- this should fail.</div>
<div class="ContentPasted0">> */</div>
<div class="ContentPasted0">> - target.color = i915_gem_get_pat_index(gt->i915, I915_CACHE_L3_LLC);</div>
<div class="ContentPasted0">> + target.color = _I915_CACHE(WB, LLC);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> mutex_lock(&ggtt->vm.mutex);</div>
<div class="ContentPasted0">> err = i915_gem_evict_for_node(&ggtt->vm, NULL, &target, 0);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c</div>
<div class="ContentPasted0">> index 5c397a2df70e..a24585784f75 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c</div>
<div class="ContentPasted0">> @@ -135,7 +135,7 @@ fake_dma_object(struct drm_i915_private *i915, u64 size)</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> obj->write_domain = I915_GEM_DOMAIN_CPU;</div>
<div class="ContentPasted0">> obj->read_domains = I915_GEM_DOMAIN_CPU;</div>
<div class="ContentPasted0">> - obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);</div>
<div class="ContentPasted0">> + obj->pat_index = i915->pat_uc;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> /* Preallocate the "backing storage" */</div>
<div class="ContentPasted0">> if (i915_gem_object_pin_pages_unlocked(obj))</div>
<div class="ContentPasted0">> @@ -358,10 +358,8 @@ static int lowlevel_hole(struct i915_address_space *vm,</div>
<div class="ContentPasted0">> mock_vma_res->start = addr;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)</div>
<div class="ContentPasted0">> - vm->insert_entries(vm, mock_vma_res,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(vm->i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - 0);</div>
<div class="ContentPasted0">> + vm->insert_entries(vm, mock_vma_res,</div>
<div class="ContentPasted0">> + vm->i915->pat_uc, 0);</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> count = n;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> @@ -1379,10 +1377,7 @@ static int igt_ggtt_page(void *arg)</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> ggtt->vm.insert_page(&ggtt->vm,</div>
<div class="ContentPasted0">> i915_gem_object_get_dma_address(obj, 0),</div>
<div class="ContentPasted0">> - offset,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> - 0);</div>
<div class="ContentPasted0">> + offset, ggtt->vm.i915->pat_uc, 0);</div>
<div class="ContentPasted0">> }</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> order = i915_random_order(count, &prng);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c</div>
<div class="ContentPasted0">> index d985d9bae2e8..b82fe0ef8cd7 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c</div>
<div class="ContentPasted0">> @@ -1070,9 +1070,7 @@ static int igt_lmem_write_cpu(void *arg)</div>
<div class="ContentPasted0">> /* Put the pages into a known state -- from the gpu for added fun */</div>
<div class="ContentPasted0">> intel_engine_pm_get(engine);</div>
<div class="ContentPasted0">> err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,</div>
<div class="ContentPasted0">> - obj->mm.pages->sgl,</div>
<div class="ContentPasted0">> - i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> - I915_CACHE_NONE),</div>
<div class="ContentPasted0">> + obj->mm.pages->sgl, i915->pat_uc,</div>
<div class="ContentPasted0">> true, 0xdeadbeaf, &rq);</div>
<div class="ContentPasted0">> if (rq) {</div>
<div class="ContentPasted0">> dma_resv_add_fence(obj->base.resv, &rq->fence,</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c</div>
<div class="ContentPasted0">> index 09d4bbcdcdbf..ad778842cba2 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c</div>
<div class="ContentPasted0">> @@ -126,7 +126,12 @@ struct drm_i915_private *mock_gem_device(void)</div>
<div class="ContentPasted0">> struct drm_i915_private *i915;</div>
<div class="ContentPasted0">> struct intel_device_info *i915_info;</div>
<div class="ContentPasted0">> struct pci_dev *pdev;</div>
<div class="ContentPasted0">> - unsigned int i;</div>
<div class="ContentPasted0">> + static const i915_cache_t legacy_cache_modes[] = {</div>
<div class="ContentPasted0">> + [0] = I915_CACHE(UC),</div>
<div class="ContentPasted0">> + [1] = I915_CACHE(WB),</div>
<div class="ContentPasted0">> + [2] = _I915_CACHE(WB, L3),</div>
<div class="ContentPasted0">> + [3] = I915_CACHE(WT),</div>
<div class="ContentPasted0">> + };</div>
<div class="ContentPasted0">> int ret;</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> pdev = kzalloc(sizeof(*pdev), GFP_KERNEL);</div>
<div class="ContentPasted0">> @@ -187,8 +192,7 @@ struct drm_i915_private *mock_gem_device(void)</div>
<div class="ContentPasted0">> /* simply use legacy cache level for mock device */</div>
<div class="ContentPasted0">> i915_info = (struct intel_device_info *)INTEL_INFO(i915);</div>
<div class="ContentPasted0">> i915_info->max_pat_index = 3;</div>
<div class="ContentPasted0">> - for (i = 0; i < I915_MAX_CACHE_LEVEL; i++)</div>
<div class="ContentPasted0">> - i915_info->cachelevel_to_pat[i] = i;</div>
<div class="ContentPasted0">> + memcpy(i915_info->cache_modes, legacy_cache_modes, sizeof(legacy_cache_modes));</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> intel_memory_regions_hw_probe(i915);</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> -- </div>
<div class="ContentPasted0">> 2.39.2</div>
<br>
</div>
</body>
</html>