<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof ContentPasted0">
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> Informal commit message for now.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> I got a bit impatient and curious to see if the idea we discussed would</div>
<div class="ContentPasted0">> work so sketched something out. I think it is what I was describing back</div>
<div class="ContentPasted0">> then..</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Oops, you beat me to this, shame on me.</div>
<div class="ContentPasted0"> </div>
<div class="ContentPasted0">> So high level idea is to teach the driver what caching modes are hidden</div>
<div class="ContentPasted0">> behind PAT indices. Given you already had that in static tables, if we</div>
<div class="ContentPasted0">> just turn the tables a bit around and add a driver abstraction of caching</div>
<div class="ContentPasted0">> modes this is what happens:</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">>  * We can lose the ugly runtime i915_gem_get_pat_index.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">>  * We can have a smarter i915_gem_object_has_cache_level, which now can</div>
<div class="ContentPasted0">>    use the above mentioned table to understand the caching modes and so</div>
<div class="ContentPasted0">>    does not have to pessimistically return true for _any_ input when user</div>
<div class="ContentPasted0">>    has set the PAT index. This may improve things even for MTL.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">>  * We can simplify the debugfs printout to be platform agnostic.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">>  * We are perhaps opening the door to un-regress the dodgy addition</div>
<div class="ContentPasted0">>    made to i915_gem_object_can_bypass_llc? See QQQ/FIXME in the patch.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> I hope I did not forget anything, but anyway, please have a read and see</div>
<div class="ContentPasted0">> what you think. I think it has potential.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> Proper commit message can come later.</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com></div>
<div class="ContentPasted0">> Cc: Fei Yang <fei.yang@intel.com></div>
<div class="ContentPasted0">> ---</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/Makefile                 |   1 +</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gem/i915_gem_domain.c    |  34 ++---</div>
<div class="ContentPasted0">>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  13 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  10 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gem/i915_gem_object.c    |  78 ++++-------</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |  18 ++-</div>
<div class="ContentPasted0">>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  99 +-------------</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |   7 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  |  26 ++--</div>
<div class="ContentPasted0">>  .../gpu/drm/i915/gem/selftests/huge_pages.c   |   2 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |   4 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  13 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gt/intel_ggtt.c          |   9 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gt/intel_migrate.c       |  11 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gt/selftest_migrate.c    |   9 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gt/selftest_reset.c      |  14 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gt/selftest_timeline.c   |   5 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gt/selftest_tlb.c        |   5 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      |   8 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/i915_cache.c             |  59 ++++++++</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/i915_cache.h             | 129 ++++++++++++++++++</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/i915_debugfs.c           |  83 ++++++-----</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/i915_driver.c            |   3 +</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/i915_drv.h               |   3 +</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/i915_gem.c               |  21 +--</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/i915_gpu_error.c         |   7 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/i915_pci.c               |  83 +++++------</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/intel_device_info.h      |   6 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/selftests/i915_gem.c     |   5 +-</div>
<div class="ContentPasted0">>  .../gpu/drm/i915/selftests/i915_gem_evict.c   |   4 +-</div>
<div class="ContentPasted0">>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  13 +-</div>
<div class="ContentPasted0">>  .../drm/i915/selftests/intel_memory_region.c  |   4 +-</div>
<div class="ContentPasted0">>  .../gpu/drm/i915/selftests/mock_gem_device.c  |  10 +-</div>
<div class="ContentPasted0">>  33 files changed, 415 insertions(+), 381 deletions(-)</div>
<div class="ContentPasted0">>  create mode 100644 drivers/gpu/drm/i915/i915_cache.c</div>
<div class="ContentPasted0">>  create mode 100644 drivers/gpu/drm/i915/i915_cache.h</div>
<div class="ContentPasted0">> </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile</div>
<div class="ContentPasted0">> index 2be9dd960540..2c3da8f0c78e 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/Makefile</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/Makefile</div>
<div class="ContentPasted0">> @@ -30,6 +30,7 @@ subdir-ccflags-y += -I$(srctree)/$(src)</div>
<div class="ContentPasted0">>  # core driver code</div>
<div class="ContentPasted0">>  i915-y += i915_driver.o \</div>
<div class="ContentPasted0">>       i915_drm_client.o \</div>
<div class="ContentPasted0">> +     i915_cache.o \</div>
<div class="ContentPasted0">>       i915_config.o \</div>
<div class="ContentPasted0">>       i915_getparam.o \</div>
<div class="ContentPasted0">>       i915_ioctl.o \</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c</div>
<div class="ContentPasted0">> index dfaaa8b66ac3..49bfae45390f 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c</div>
<div class="ContentPasted0">> @@ -8,6 +8,7 @@</div>
<div class="ContentPasted0">>  #include "display/intel_frontbuffer.h"</div>
<div class="ContentPasted0">>  #include "gt/intel_gt.h"</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">>  #include "i915_drv.h"</div>
<div class="ContentPasted0">>  #include "i915_gem_clflush.h"</div>
<div class="ContentPasted0">>  #include "i915_gem_domain.h"</div>
<div class="ContentPasted0">> @@ -27,15 +28,8 @@ static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">>     if (IS_DGFX(i915))</div>
<div class="ContentPasted0">>           return false;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   /*</div>
<div class="ContentPasted0">> -    * For objects created by userspace through GEM_CREATE with pat_index</div>
<div class="ContentPasted0">> -    * set by set_pat extension, i915_gem_object_has_cache_level() will</div>
<div class="ContentPasted0">> -    * always return true, because the coherency of such object is managed</div>
<div class="ContentPasted0">> -    * by userspace. Othereise the call here would fall back to checking</div>
<div class="ContentPasted0">> -    * whether the object is un-cached or write-through.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   return !(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||</div>
<div class="ContentPasted0">> -          i915_gem_object_has_cache_level(obj, I915_CACHE_WT));</div>
<div class="ContentPasted0">> +   return i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC) != 1 &&</div>
<div class="ContentPasted0">> +          i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WT) != 1;</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Why is it necessary to define the I915_CACHE_MODE values when i915_cache_level already exists?</div>
<div class="ContentPasted0">I thought we wanted to get rid of such abstractions rather than add more.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">This patch also introduces INTEL_INFO(i915)->cache_modes. Why don't we add the</div>
<div class="ContentPasted0">platform-specific PAT there directly? For example, add the following for MTL:</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">INTEL_INFO(i915)->pat[] = {</div>
<div class="ContentPasted0">      [0] = MTL_PPAT_L4_0_WB,</div>
<div class="ContentPasted0">      [1] = MTL_PPAT_L4_1_WT,</div>
<div class="ContentPasted0">      [2] = MTL_PPAT_L4_3_UC,</div>
<div class="ContentPasted0">      [3] = MTL_PPAT_L4_0_WB | MTL_2_COH_1W,</div>
<div class="ContentPasted0">      [4] = MTL_PPAT_L4_0_WB | MTL_3_COH_2W,</div>
<div class="ContentPasted0">};</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Everything here is already defined, so there is no need to introduce new macros.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">The same table could also be used to initialize the PAT index registers, as in</div>
<div class="ContentPasted0">xelpg_setup_private_ppat() and xelpmp_setup_private_ppat().</div>
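<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">To make the idea concrete, here is a rough standalone sketch, not driver code. Every sketch_* / SKETCH_* name is hypothetical: the macros only stand in for the existing MTL_PPAT_* / MTL_*_COH_* definitions, the bit positions (L4 policy in bits [3:2], coherency mode in bits [1:0]) are my assumption about the layout, and the array stands in for the real register writes.</div>
<pre class="ContentPasted0">
```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the existing MTL PAT macros. The field positions are an
 * assumption: L4 cache policy in bits [3:2], coherency mode in bits [1:0]. */
#define SKETCH_L4_POLICY(x)  ((uint32_t)(x) << 2)
#define SKETCH_COH_MODE(x)   ((uint32_t)(x) << 0)
#define SKETCH_PPAT_L4_0_WB  SKETCH_L4_POLICY(0)
#define SKETCH_PPAT_L4_1_WT  SKETCH_L4_POLICY(1)
#define SKETCH_PPAT_L4_3_UC  SKETCH_L4_POLICY(3)
#define SKETCH_2_COH_1W      SKETCH_COH_MODE(2)
#define SKETCH_3_COH_2W      SKETCH_COH_MODE(3)

#define SKETCH_MAX_PAT 16

/* Hypothetical slice of the per-platform device info. */
struct sketch_device_info {
	uint32_t pat[SKETCH_MAX_PAT];
	unsigned int num_pat;
};

/* One static table per platform; this one mirrors the MTL example above. */
static const struct sketch_device_info sketch_mtl_info = {
	.pat = {
		[0] = SKETCH_PPAT_L4_0_WB,
		[1] = SKETCH_PPAT_L4_1_WT,
		[2] = SKETCH_PPAT_L4_3_UC,
		[3] = SKETCH_PPAT_L4_0_WB | SKETCH_2_COH_1W,
		[4] = SKETCH_PPAT_L4_0_WB | SKETCH_3_COH_2W,
	},
	.num_pat = 5,
};

/* Plain array standing in for the PAT index registers. */
static uint32_t sketch_pat_regs[SKETCH_MAX_PAT];

static void sketch_write_pat_reg(unsigned int index, uint32_t value)
{
	assert(index < SKETCH_MAX_PAT);
	sketch_pat_regs[index] = value;
}

/* Program every PAT register from the table instead of open-coding each write. */
static void sketch_setup_private_ppat(const struct sketch_device_info *info)
{
	unsigned int i;

	for (i = 0; i < info->num_pat; i++)
		sketch_write_pat_reg(i, info->pat[i]);
}
```
</pre>
<div class="ContentPasted0">With something like this, the table would be the single source of truth for both the register setup and any later lookup by obj->pat_index.</div>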
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> @@ -272,15 +266,18 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)</div>
<div class="ContentPasted0">>  int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">>                           enum i915_cache_level cache_level)</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">s/enum i915_cache_level cache_level/unsigned int pat_index/</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">This function is for KMD objects only, so I don't think we even need to keep</div>
<div class="ContentPasted0">i915_cache_level. Simply passing in INTEL_INFO(i915)->pat_uc/wb/wt is</div>
<div class="ContentPasted0">good enough.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">>  {</div>
<div class="ContentPasted0">> +   struct drm_i915_private *i915 = to_i915(obj->base.dev);</div>
<div class="ContentPasted0">> +   i915_cache_t mode;</div>
<div class="ContentPasted0">>     int ret;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   /*</div>
<div class="ContentPasted0">> -    * For objects created by userspace through GEM_CREATE with pat_index</div>
<div class="ContentPasted0">> -    * set by set_pat extension, simply return 0 here without touching</div>
<div class="ContentPasted0">> -    * the cache setting, because such objects should have an immutable</div>
<div class="ContentPasted0">> -    * cache setting by desgin and always managed by userspace.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   if (i915_gem_object_has_cache_level(obj, cache_level))</div>
<div class="ContentPasted0">> +   if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">> +         return -EINVAL;</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">I don't think this condition would ever be true, but it is okay to keep it.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +   ret = i915_cache_level_to_pat_and_mode(i915, cache_level, &mode);</div>
<div class="ContentPasted0">> +   if (ret < 0)</div>
<div class="ContentPasted0">> +         return -EINVAL;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   if (mode == obj->cache_mode)</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">The lines above can be reduced to a single check:</div>
<div class="ContentPasted0">      if (pat_index == obj->pat_index)</div>
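<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Roughly what I have in mind, as a standalone sketch rather than a patch. struct sketch_obj and sketch_set_pat_index() are hypothetical stand-ins for the GEM object and for i915_gem_object_set_cache_level() taking a pat_index; the rebinds counter only marks where the real wait/rebind path would run.</div>
<pre class="ContentPasted0">
```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the GEM object. */
struct sketch_obj {
	unsigned int pat_index;
	bool pat_set_by_user;
	unsigned int rebinds; /* counts how often the slow path ran */
};

/* Sketch of the setter taking a PAT index directly instead of a cache level. */
static int sketch_set_pat_index(struct sketch_obj *obj, unsigned int pat_index)
{
	/* Objects whose PAT was chosen by userspace keep it unchanged. */
	if (obj->pat_set_by_user)
		return -EINVAL;

	/* Nothing to do when the requested index is already in place. */
	if (pat_index == obj->pat_index)
		return 0;

	/* Placeholder for the real work (wait for idle, rebind, update coherency). */
	obj->pat_index = pat_index;
	obj->rebinds++;

	return 0;
}
```
</pre>
<div class="ContentPasted0">No translation from cache level to mode is needed in this form, because the caller already passes the platform's pat_uc/wb/wt value.</div>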
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">>           return 0;</div>
<div class="ContentPasted0">></div>
<div class="ContentPasted0">>     ret = i915_gem_object_wait(obj,</div>
<div class="ContentPasted0">> @@ -326,10 +323,9 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,</div>
<div class="ContentPasted0">>           goto out;</div>
<div class="ContentPasted0">>     }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   if (i915_gem_object_has_cache_level(obj, I915_CACHE_LLC) ||</div>
<div class="ContentPasted0">> -       i915_gem_object_has_cache_level(obj, I915_CACHE_L3_LLC))</div>
<div class="ContentPasted0">> +   if (i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WB))</div>
<div class="ContentPasted0">>           args->caching = I915_CACHING_CACHED;</div>
<div class="ContentPasted0">> -   else if (i915_gem_object_has_cache_level(obj, I915_CACHE_WT))</div>
<div class="ContentPasted0">> +   else if (i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WT))</div>
<div class="ContentPasted0">>           args->caching = I915_CACHING_DISPLAY;</div>
<div class="ContentPasted0">>     else</div>
<div class="ContentPasted0">>           args->caching = I915_CACHING_NONE;</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c</div>
<div class="ContentPasted0">> index d3208a325614..ee85221fa6eb 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c</div>
<div class="ContentPasted0">> @@ -640,15 +640,9 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,</div>
<div class="ContentPasted0">>     if (DBG_FORCE_RELOC == FORCE_GTT_RELOC)</div>
<div class="ContentPasted0">>           return false;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   /*</div>
<div class="ContentPasted0">> -    * For objects created by userspace through GEM_CREATE with pat_index</div>
<div class="ContentPasted0">> -    * set by set_pat extension, i915_gem_object_has_cache_level() always</div>
<div class="ContentPasted0">> -    * return true, otherwise the call would fall back to checking whether</div>
<div class="ContentPasted0">> -    * the object is un-cached.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">>     return (cache->has_llc ||</div>
<div class="ContentPasted0">>           obj->cache_dirty ||</div>
<div class="ContentPasted0">> -         !i915_gem_object_has_cache_level(obj, I915_CACHE_NONE));</div>
<div class="ContentPasted0">> +         i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC) != 1);</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  static int eb_reserve_vma(struct i915_execbuffer *eb,</div>
<div class="ContentPasted0">> @@ -1329,10 +1323,7 @@ static void *reloc_iomap(struct i915_vma *batch,</div>
<div class="ContentPasted0">>     if (drm_mm_node_allocated(&cache->node)) {</div>
<div class="ContentPasted0">>           ggtt->vm.insert_page(&ggtt->vm,</div>
<div class="ContentPasted0">>                            i915_gem_object_get_dma_address(obj, page),</div>
<div class="ContentPasted0">> -                          offset,</div>
<div class="ContentPasted0">> -                          i915_gem_get_pat_index(ggtt->vm.i915,</div>
<div class="ContentPasted0">> -                                           I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                          0);</div>
<div class="ContentPasted0">> +                          offset, eb->i915->pat_uc, 0);</div>
<div class="ContentPasted0">>     } else {</div>
<div class="ContentPasted0">>           offset += page << PAGE_SHIFT;</div>
<div class="ContentPasted0">>     }</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c</div>
<div class="ContentPasted0">> index aa4d842d4c5a..5e21aedb02d2 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c</div>
<div class="ContentPasted0">> @@ -386,13 +386,11 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)</div>
<div class="ContentPasted0">>     /*</div>
<div class="ContentPasted0">>      * For objects created by userspace through GEM_CREATE with pat_index</div>
<div class="ContentPasted0">>      * set by set_pat extension, coherency is managed by userspace, make</div>
<div class="ContentPasted0">> -    * sure we don't fail handling the vm fault by calling</div>
<div class="ContentPasted0">> -    * i915_gem_object_has_cache_level() which always return true for such</div>
<div class="ContentPasted0">> -    * objects. Otherwise this helper function would fall back to checking</div>
<div class="ContentPasted0">> -    * whether the object is un-cached.</div>
<div class="ContentPasted0">> +    * sure we don't fail handling the vm fault by making sure that we</div>
<div class="ContentPasted0">> +    * know the object is uncached or that we have LLC.</div>
<div class="ContentPasted0">>      */</div>
<div class="ContentPasted0">> -   if (!(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||</div>
<div class="ContentPasted0">> -         HAS_LLC(i915))) {</div>
<div class="ContentPasted0">> +   if (i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC) != 1 &&</div>
<div class="ContentPasted0">> +       !HAS_LLC(i915)) {</div>
<div class="ContentPasted0">>           ret = -EFAULT;</div>
<div class="ContentPasted0">>           goto err_unpin;</div>
<div class="ContentPasted0">>     }</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c</div>
<div class="ContentPasted0">> index 0004d5fa7cc2..52c6c5f09bdd 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c</div>
<div class="ContentPasted0">> @@ -45,33 +45,6 @@ static struct kmem_cache *slab_objects;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  static const struct drm_gem_object_funcs i915_gem_object_funcs;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,</div>
<div class="ContentPasted0">> -                         enum i915_cache_level level)</div>
<div class="ContentPasted0">> -{</div>
<div class="ContentPasted0">> -   if (drm_WARN_ON(&i915->drm, level >= I915_MAX_CACHE_LEVEL))</div>
<div class="ContentPasted0">> -         return 0;</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> -   return INTEL_INFO(i915)->cachelevel_to_pat[level];</div>
<div class="ContentPasted0">> -}</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Yes, this can be removed. INTEL_INFO(i915)->pat_uc/wb/wt should be sufficient.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> -bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> -                          enum i915_cache_level lvl)</div>
<div class="ContentPasted0">> -{</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">If we had INTEL_INFO(i915)->pat[] set up, it would be easier to just keep this</div>
<div class="ContentPasted0">function, because we could simply check the cache policy bit field in</div>
<div class="ContentPasted0">INTEL_INFO(i915)->pat[obj->pat_index] to see whether it is cached, uncached,</div>
<div class="ContentPasted0">or write-through.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">See Bspec: https://gfxspecs.intel.com/Predator/Home/Index/44235</div>
<div class="ContentPasted0">For MTL, check bits [3:2]; for other gen12 platforms, check bits [1:0].</div>
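<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">A standalone sketch of such a check, under stated assumptions. The names are hypothetical, and the encodings are my reading and should be verified against Bspec: for MTL, 0 = WB, 1 = WT, 3 = UC in bits [3:2]; for other gen12 platforms, 0 = UC, 2 = WT, 3 = WB in bits [1:0].</div>
<pre class="ContentPasted0">
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; UNKNOWN covers encodings not discussed here. */
enum sketch_policy {
	SKETCH_POLICY_WB,
	SKETCH_POLICY_WT,
	SKETCH_POLICY_UC,
	SKETCH_POLICY_UNKNOWN,
};

/* Decode the cache policy field of one PAT table entry. */
static enum sketch_policy sketch_pat_policy(uint32_t pat_entry, bool is_mtl)
{
	if (is_mtl) {
		/* Assumed MTL layout: L4 cache policy in bits [3:2]. */
		switch ((pat_entry >> 2) & 0x3) {
		case 0: return SKETCH_POLICY_WB;
		case 1: return SKETCH_POLICY_WT;
		case 3: return SKETCH_POLICY_UC;
		default: return SKETCH_POLICY_UNKNOWN;
		}
	}

	/* Assumed layout for other gen12 platforms: memory type in bits [1:0]. */
	switch (pat_entry & 0x3) {
	case 0: return SKETCH_POLICY_UC;
	case 2: return SKETCH_POLICY_WT;
	case 3: return SKETCH_POLICY_WB;
	default: return SKETCH_POLICY_UNKNOWN; /* e.g. write-combining */
	}
}
```
</pre>
<div class="ContentPasted0">The helper would then compare the decoded policy of INTEL_INFO(i915)->pat[obj->pat_index] against the requested one, which also works for a pat_index set by userspace.</div>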
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> -   /*</div>
<div class="ContentPasted0">> -    * In case the pat_index is set by user space, this kernel mode</div>
<div class="ContentPasted0">> -    * driver should leave the coherency to be managed by user space,</div>
<div class="ContentPasted0">> -    * simply return true here.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">> -         return true;</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> -   /*</div>
<div class="ContentPasted0">> -    * Otherwise the pat_index should have been converted from cache_level</div>
<div class="ContentPasted0">> -    * so that the following comparison is valid.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   return obj->pat_index == i915_gem_get_pat_index(obj_to_i915(obj), lvl);</div>
<div class="ContentPasted0">> -}</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">>  struct drm_i915_gem_object *i915_gem_object_alloc(void)</div>
<div class="ContentPasted0">>  {</div>
<div class="ContentPasted0">>     struct drm_i915_gem_object *obj;</div>
<div class="ContentPasted0">> @@ -144,6 +117,24 @@ void __i915_gem_object_fini(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">>     dma_resv_fini(&obj->base._resv);</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> +void __i915_gem_object_update_coherency(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> +   struct drm_i915_private *i915 = to_i915(obj->base.dev);</div>
<div class="ContentPasted0">> +   const unsigned int mode = I915_CACHE_MODE(obj->cache_mode);</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">obj->cache_mode seems to be redundant if we have INTEL_INFO(i915)->pat[] and</div>
<div class="ContentPasted0">obj->pat_index.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   if (!(mode == I915_CACHE_MODE_UNKNOWN || mode == I915_CACHE_MODE_UC))</div>
<div class="ContentPasted0">> +         obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |</div>
<div class="ContentPasted0">> +                            I915_BO_CACHE_COHERENT_FOR_WRITE);</div>
<div class="ContentPasted0">> +   else if (mode != I915_CACHE_MODE_UNKNOWN && HAS_LLC(i915))</div>
<div class="ContentPasted0">> +         obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;</div>
<div class="ContentPasted0">> +   else</div>
<div class="ContentPasted0">> +         obj->cache_coherent = 0;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   obj->cache_dirty =</div>
<div class="ContentPasted0">> +         !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&</div>
<div class="ContentPasted0">> +         !IS_DGFX(i915);</div>
<div class="ContentPasted0">> +}</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">>  /**</div>
<div class="ContentPasted0">>   * i915_gem_object_set_cache_coherency - Mark up the object's coherency levels</div>
<div class="ContentPasted0">>   * for a given cache_level</div>
<div class="ContentPasted0">> @@ -154,20 +145,15 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">>                              unsigned int cache_level)</div>
<div class="ContentPasted0">>  {</div>
<div class="ContentPasted0">>     struct drm_i915_private *i915 = to_i915(obj->base.dev);</div>
<div class="ContentPasted0">> +   i915_cache_t mode;</div>
<div class="ContentPasted0">> +   int found;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   obj->pat_index = i915_gem_get_pat_index(i915, cache_level);</div>
<div class="ContentPasted0">> +   found = i915_cache_level_to_pat_and_mode(i915, cache_level, &mode);</div>
<div class="ContentPasted0">> +   GEM_WARN_ON(found < 0);</div>
<div class="ContentPasted0">> +   obj->pat_index = found;</div>
<div class="ContentPasted0">> +   obj->cache_mode = mode;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   if (cache_level != I915_CACHE_NONE)</div>
<div class="ContentPasted0">> -         obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |</div>
<div class="ContentPasted0">> -                            I915_BO_CACHE_COHERENT_FOR_WRITE);</div>
<div class="ContentPasted0">> -   else if (HAS_LLC(i915))</div>
<div class="ContentPasted0">> -         obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;</div>
<div class="ContentPasted0">> -   else</div>
<div class="ContentPasted0">> -         obj->cache_coherent = 0;</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> -   obj->cache_dirty =</div>
<div class="ContentPasted0">> -         !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&</div>
<div class="ContentPasted0">> -         !IS_DGFX(i915);</div>
<div class="ContentPasted0">> +   __i915_gem_object_update_coherency(obj);</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  /**</div>
<div class="ContentPasted0">> @@ -187,18 +173,9 @@ void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">>           return;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     obj->pat_index = pat_index;</div>
<div class="ContentPasted0">> +   obj->cache_mode = INTEL_INFO(i915)->cache_modes[pat_index];</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   if (pat_index != i915_gem_get_pat_index(i915, I915_CACHE_NONE))</div>
<div class="ContentPasted0">> -         obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |</div>
<div class="ContentPasted0">> -                            I915_BO_CACHE_COHERENT_FOR_WRITE);</div>
<div class="ContentPasted0">> -   else if (HAS_LLC(i915))</div>
<div class="ContentPasted0">> -         obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;</div>
<div class="ContentPasted0">> -   else</div>
<div class="ContentPasted0">> -         obj->cache_coherent = 0;</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">> -   obj->cache_dirty =</div>
<div class="ContentPasted0">> -         !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&</div>
<div class="ContentPasted0">> -         !IS_DGFX(i915);</div>
<div class="ContentPasted0">> +   __i915_gem_object_update_coherency(obj);</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> @@ -215,6 +192,7 @@ bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">>     /*</div>
<div class="ContentPasted0">>      * Always flush cache for UMD objects at creation time.</div>
<div class="ContentPasted0">>      */</div>
<div class="ContentPasted0">> +   /* QQQ/FIXME why? avoidable performance penalty? */</div>
<div class="ContentPasted0">>     if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">>           return true;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h</div>
<div class="ContentPasted0">> index 884a17275b3a..f84f41e9f81f 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h</div>
<div class="ContentPasted0">> @@ -13,6 +13,7 @@</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #include "display/intel_frontbuffer.h"</div>
<div class="ContentPasted0">>  #include "intel_memory_region.h"</div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">>  #include "i915_gem_object_types.h"</div>
<div class="ContentPasted0">>  #include "i915_gem_gtt.h"</div>
<div class="ContentPasted0">>  #include "i915_gem_ww.h"</div>
<div class="ContentPasted0">> @@ -32,10 +33,18 @@ static inline bool i915_gem_object_size_2big(u64 size)</div>
<div class="ContentPasted0">>     return false;</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,</div>
<div class="ContentPasted0">> -                         enum i915_cache_level level);</div>
<div class="ContentPasted0">> -bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> -                          enum i915_cache_level lvl);</div>
<div class="ContentPasted0">> +static inline int</div>
<div class="ContentPasted0">> +i915_gem_object_has_cache_mode(const struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> +                      unsigned int mode)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> +   if (I915_CACHE_MODE(obj->cache_mode) == mode)</div>
<div class="ContentPasted0">> +         return 1;</div>
<div class="ContentPasted0">> +   else if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">> +         return -1; /* Unknown, callers should assume no. */</div>
<div class="ContentPasted0">> +   else</div>
<div class="ContentPasted0">> +         return 0;</div>
<div class="ContentPasted0">> +}</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">>  void i915_gem_init__objects(struct drm_i915_private *i915);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  void i915_objects_module_exit(void);</div>
<div class="ContentPasted0">> @@ -764,6 +773,7 @@ int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">>                             bool intr);</div>
<div class="ContentPasted0">>  bool i915_gem_object_has_unknown_state(struct drm_i915_gem_object *obj);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> +void __i915_gem_object_update_coherency(struct drm_i915_gem_object *obj);</div>
<div class="ContentPasted0">>  void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">>                              unsigned int cache_level);</div>
<div class="ContentPasted0">>  void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h</div>
<div class="ContentPasted0">> index 8de2b91b3edf..1f9fa28d07df 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h</div>
<div class="ContentPasted0">> @@ -14,6 +14,7 @@</div>
<div class="ContentPasted0">>  #include &lt;uapi/drm/i915_drm.h&gt;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #include "i915_active.h"</div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">>  #include "i915_selftest.h"</div>
<div class="ContentPasted0">>  #include "i915_vma_resource.h"</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> @@ -116,93 +117,6 @@ struct drm_i915_gem_object_ops {</div>
<div class="ContentPasted0">>     const char *name; /* friendly name for debug, e.g. lockdep classes */</div>
<div class="ContentPasted0">>  };</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -/**</div>
<div class="ContentPasted0">> - * enum i915_cache_level - The supported GTT caching values for system memory</div>
<div class="ContentPasted0">> - * pages.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * These translate to some special GTT PTE bits when binding pages into some</div>
<div class="ContentPasted0">> - * address space. It also determines whether an object, or rather its pages are</div>
<div class="ContentPasted0">> - * coherent with the GPU, when also reading or writing through the CPU cache</div>
<div class="ContentPasted0">> - * with those pages.</div>
<div class="ContentPasted0">> - *</div>
<div class="ContentPasted0">> - * Userspace can also control this through struct drm_i915_gem_caching.</div>
<div class="ContentPasted0">> - */</div>
<div class="ContentPasted0">> -enum i915_cache_level {</div>
<div class="ContentPasted0">> -   /**</div>
<div class="ContentPasted0">> -    * @I915_CACHE_NONE:</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * GPU access is not coherent with the CPU cache. If the cache is dirty</div>
<div class="ContentPasted0">> -    * and we need the underlying pages to be coherent with some later GPU</div>
<div class="ContentPasted0">> -    * access then we need to manually flush the pages.</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * On shared LLC platforms reads and writes through the CPU cache are</div>
<div class="ContentPasted0">> -    * still coherent even with this setting. See also</div>
<div class="ContentPasted0">> -    * &drm_i915_gem_object.cache_coherent for more details. Due to this we</div>
<div class="ContentPasted0">> -    * should only ever use uncached for scanout surfaces, otherwise we end</div>
<div class="ContentPasted0">> -    * up over-flushing in some places.</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * This is the default on non-LLC platforms.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   I915_CACHE_NONE = 0,</div>
<div class="ContentPasted0">> -   /**</div>
<div class="ContentPasted0">> -    * @I915_CACHE_LLC:</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * GPU access is coherent with the CPU cache. If the cache is dirty,</div>
<div class="ContentPasted0">> -    * then the GPU will ensure that access remains coherent, when both</div>
<div class="ContentPasted0">> -    * reading and writing through the CPU cache. GPU writes can dirty the</div>
<div class="ContentPasted0">> -    * CPU cache.</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * Not used for scanout surfaces.</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * Applies to both platforms with shared LLC(HAS_LLC), and snooping</div>
<div class="ContentPasted0">> -    * based platforms(HAS_SNOOP).</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * This is the default on shared LLC platforms.  The only exception is</div>
<div class="ContentPasted0">> -    * scanout objects, where the display engine is not coherent with the</div>
<div class="ContentPasted0">> -    * CPU cache. For such objects I915_CACHE_NONE or I915_CACHE_WT is</div>
<div class="ContentPasted0">> -    * automatically applied by the kernel in pin_for_display, if userspace</div>
<div class="ContentPasted0">> -    * has not done so already.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   I915_CACHE_LLC,</div>
<div class="ContentPasted0">> -   /**</div>
<div class="ContentPasted0">> -    * @I915_CACHE_L3_LLC:</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * Explicitly enable the Gfx L3 cache, with coherent LLC.</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * The Gfx L3 sits between the domain specific caches, e.g</div>
<div class="ContentPasted0">> -    * sampler/render caches, and the larger LLC. LLC is coherent with the</div>
<div class="ContentPasted0">> -    * GPU, but L3 is only visible to the GPU, so likely needs to be flushed</div>
<div class="ContentPasted0">> -    * when the workload completes.</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * Not used for scanout surfaces.</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * Only exposed on some gen7 + GGTT. More recent hardware has dropped</div>
<div class="ContentPasted0">> -    * this explicit setting, where it should now be enabled by default.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   I915_CACHE_L3_LLC,</div>
<div class="ContentPasted0">> -   /**</div>
<div class="ContentPasted0">> -    * @I915_CACHE_WT:</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * Write-through. Used for scanout surfaces.</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * The GPU can utilise the caches, while still having the display engine</div>
<div class="ContentPasted0">> -    * be coherent with GPU writes, as a result we don't need to flush the</div>
<div class="ContentPasted0">> -    * CPU caches when moving out of the render domain. This is the default</div>
<div class="ContentPasted0">> -    * setting chosen by the kernel, if supported by the HW, otherwise we</div>
<div class="ContentPasted0">> -    * fallback to I915_CACHE_NONE. On the CPU side writes through the CPU</div>
<div class="ContentPasted0">> -    * cache still need to be flushed, to remain coherent with the display</div>
<div class="ContentPasted0">> -    * engine.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   I915_CACHE_WT,</div>
<div class="ContentPasted0">> -   /**</div>
<div class="ContentPasted0">> -    * @I915_MAX_CACHE_LEVEL:</div>
<div class="ContentPasted0">> -    *</div>
<div class="ContentPasted0">> -    * Mark the last entry in the enum. Used for defining cachelevel_to_pat</div>
<div class="ContentPasted0">> -    * array for cache_level to pat translation table.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   I915_MAX_CACHE_LEVEL,</div>
<div class="ContentPasted0">> -};</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">>  enum i915_map_type {</div>
<div class="ContentPasted0">>     I915_MAP_WB = 0,</div>
<div class="ContentPasted0">>     I915_MAP_WC,</div>
<div class="ContentPasted0">> @@ -375,6 +289,9 @@ struct drm_i915_gem_object {</div>
<div class="ContentPasted0">>     unsigned int mem_flags;</div>
<div class="ContentPasted0">>  #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */</div>
<div class="ContentPasted0">>  #define I915_BO_FLAG_IOMEM       BIT(1) /* Object backed by IO memory */</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   i915_cache_t cache_mode;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">>     /**</div>
<div class="ContentPasted0">>      * @pat_index: The desired PAT index.</div>
<div class="ContentPasted0">>      *</div>
<div class="ContentPasted0">> @@ -409,9 +326,7 @@ struct drm_i915_gem_object {</div>
<div class="ContentPasted0">>      * Check for @pat_set_by_user to find out if an object has pat index set</div>
<div class="ContentPasted0">>      * by userspace. The ioctl's to change cache settings have also been</div>
<div class="ContentPasted0">>      * disabled for the objects with pat index set by userspace. Please don't</div>
<div class="ContentPasted0">> -    * assume @cache_coherent having the flags set as describe here. A helper</div>
<div class="ContentPasted0">> -    * function i915_gem_object_has_cache_level() provides one way to bypass</div>
<div class="ContentPasted0">> -    * the use of this field.</div>
<div class="ContentPasted0">> +    * assume @cache_coherent having the flags set as described here.</div>
<div class="ContentPasted0">>      *</div>
<div class="ContentPasted0">>      * Track whether the pages are coherent with the GPU if reading or</div>
<div class="ContentPasted0">>      * writing through the CPU caches. The largely depends on the</div>
<div class="ContentPasted0">> @@ -492,9 +407,7 @@ struct drm_i915_gem_object {</div>
<div class="ContentPasted0">>      * Check for @pat_set_by_user to find out if an object has pat index set</div>
<div class="ContentPasted0">>      * by userspace. The ioctl's to change cache settings have also been</div>
<div class="ContentPasted0">>      * disabled for the objects with pat_index set by userspace. Please don't</div>
<div class="ContentPasted0">> -    * assume @cache_dirty is set as describe here. Also see helper function</div>
<div class="ContentPasted0">> -    * i915_gem_object_has_cache_level() for possible ways to bypass the use</div>
<div class="ContentPasted0">> -    * of this field.</div>
<div class="ContentPasted0">> +    * assume @cache_dirty is set as described here.</div>
<div class="ContentPasted0">>      *</div>
<div class="ContentPasted0">>      * Track if we are we dirty with writes through the CPU cache for this</div>
<div class="ContentPasted0">>      * object. As a result reading directly from main memory might yield</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c</div>
<div class="ContentPasted0">> index 3b094d36a0b0..a7012f1a9c70 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c</div>
<div class="ContentPasted0">> @@ -563,11 +563,8 @@ static void dbg_poison(struct i915_ggtt *ggtt,</div>
<div class="ContentPasted0">>     while (size) {</div>
<div class="ContentPasted0">>           void __iomem *s;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -         ggtt->vm.insert_page(&ggtt->vm, addr,</div>
<div class="ContentPasted0">> -                          ggtt->error_capture.start,</div>
<div class="ContentPasted0">> -                          i915_gem_get_pat_index(ggtt->vm.i915,</div>
<div class="ContentPasted0">> -                                           I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                          0);</div>
<div class="ContentPasted0">> +         ggtt->vm.insert_page(&ggtt->vm, addr, ggtt->error_capture.start,</div>
<div class="ContentPasted0">> +                          ggtt->vm.i915->pat_uc, 0);</div>
<div class="ContentPasted0">>           mb();</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           s = io_mapping_map_wc(&ggtt->iomap,</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c</div>
<div class="ContentPasted0">> index 7078af2f8f79..e794bd2a7ccb 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c</div>
<div class="ContentPasted0">> @@ -58,6 +58,16 @@ i915_ttm_cache_level(struct drm_i915_private *i915, struct ttm_resource *res,</div>
<div class="ContentPasted0">>           I915_CACHE_NONE;</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> +static unsigned int</div>
<div class="ContentPasted0">> +i915_ttm_cache_pat(struct drm_i915_private *i915, struct ttm_resource *res,</div>
<div class="ContentPasted0">> +              struct ttm_tt *ttm)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> +   return ((HAS_LLC(i915) || HAS_SNOOP(i915)) &&</div>
<div class="ContentPasted0">> +         !i915_ttm_gtt_binds_lmem(res) &&</div>
<div class="ContentPasted0">> +         ttm->caching == ttm_cached) ? i915->pat_wb :</div>
<div class="ContentPasted0">> +         i915->pat_uc;</div>
<div class="ContentPasted0">> +}</div>
<div class="ContentPasted0">> +</div>
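<div class="ContentPasted0">The selection rule in i915_ttm_cache_pat() can be read as a small decision function. Below is a standalone sketch with the inputs flattened into a hypothetical struct; fake_move_ctx, its fields and the PAT values used with it are illustrative assumptions, not driver definitions.</div>
<pre class="ContentPasted0">
```c
/* Hypothetical flattened inputs; the real helper reads these from i915, res and ttm. */
struct fake_move_ctx {
	int has_llc;          /* stands in for HAS_LLC(i915) */
	int has_snoop;        /* stands in for HAS_SNOOP(i915) */
	int gtt_binds_lmem;   /* stands in for i915_ttm_gtt_binds_lmem(res) */
	int ttm_cached;       /* stands in for ttm->caching == ttm_cached */
	unsigned int pat_wb;  /* platform PAT index for write-back */
	unsigned int pat_uc;  /* platform PAT index for uncached */
};

/*
 * Write-back is only picked when the platform is coherent (LLC or snoop),
 * the resource is not bound as local memory and the TTM pages are cached.
 * Every other combination falls back to uncached.
 */
static unsigned int fake_ttm_cache_pat(const struct fake_move_ctx *c)
{
	if ((c->has_llc || c->has_snoop) && !c->gtt_binds_lmem && c->ttm_cached)
		return c->pat_wb;

	return c->pat_uc;
}
```
</pre>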
<div class="ContentPasted0">>  static struct intel_memory_region *</div>
<div class="ContentPasted0">>  i915_ttm_region(struct ttm_device *bdev, int ttm_mem_type)</div>
<div class="ContentPasted0">>  {</div>
<div class="ContentPasted0">> @@ -196,7 +206,7 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,</div>
<div class="ContentPasted0">>     struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);</div>
<div class="ContentPasted0">>     struct i915_request *rq;</div>
<div class="ContentPasted0">>     struct ttm_tt *src_ttm = bo->ttm;</div>
<div class="ContentPasted0">> -   enum i915_cache_level src_level, dst_level;</div>
<div class="ContentPasted0">> +   unsigned int src_pat, dst_pat;</div>
<div class="ContentPasted0">>     int ret;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     if (!to_gt(i915)->migrate.context || intel_gt_is_wedged(to_gt(i915)))</div>
<div class="ContentPasted0">> @@ -206,16 +216,15 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,</div>
<div class="ContentPasted0">>     if (I915_SELFTEST_ONLY(fail_gpu_migration))</div>
<div class="ContentPasted0">>           clear = true;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   dst_level = i915_ttm_cache_level(i915, dst_mem, dst_ttm);</div>
<div class="ContentPasted0">> +   dst_pat = i915_ttm_cache_pat(i915, dst_mem, dst_ttm);</div>
<div class="ContentPasted0">>     if (clear) {</div>
<div class="ContentPasted0">>           if (bo->type == ttm_bo_type_kernel &&</div>
<div class="ContentPasted0">>               !I915_SELFTEST_ONLY(fail_gpu_migration))</div>
<div class="ContentPasted0">>                 return ERR_PTR(-EINVAL);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           intel_engine_pm_get(to_gt(i915)->migrate.context->engine);</div>
<div class="ContentPasted0">> -         ret = intel_context_migrate_clear(to_gt(i915)->migrate.context, deps,</div>
<div class="ContentPasted0">> -                                   dst_st->sgl,</div>
<div class="ContentPasted0">> -                                   i915_gem_get_pat_index(i915, dst_level),</div>
<div class="ContentPasted0">> +         ret = intel_context_migrate_clear(to_gt(i915)->migrate.context,</div>
<div class="ContentPasted0">> +                                   deps, dst_st->sgl, dst_pat,</div>
<div class="ContentPasted0">>                                     i915_ttm_gtt_binds_lmem(dst_mem),</div>
<div class="ContentPasted0">>                                     0, &rq);</div>
<div class="ContentPasted0">>     } else {</div>
<div class="ContentPasted0">> @@ -225,14 +234,13 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,</div>
<div class="ContentPasted0">>           if (IS_ERR(src_rsgt))</div>
<div class="ContentPasted0">>                 return ERR_CAST(src_rsgt);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -         src_level = i915_ttm_cache_level(i915, bo->resource, src_ttm);</div>
<div class="ContentPasted0">> +         src_pat = i915_ttm_cache_pat(i915, bo->resource, src_ttm);</div>
<div class="ContentPasted0">>           intel_engine_pm_get(to_gt(i915)->migrate.context->engine);</div>
<div class="ContentPasted0">>           ret = intel_context_migrate_copy(to_gt(i915)->migrate.context,</div>
<div class="ContentPasted0">>                                    deps, src_rsgt->table.sgl,</div>
<div class="ContentPasted0">> -                                  i915_gem_get_pat_index(i915, src_level),</div>
<div class="ContentPasted0">> +                                  src_pat,</div>
<div class="ContentPasted0">>                                    i915_ttm_gtt_binds_lmem(bo->resource),</div>
<div class="ContentPasted0">> -                                  dst_st->sgl,</div>
<div class="ContentPasted0">> -                                  i915_gem_get_pat_index(i915, dst_level),</div>
<div class="ContentPasted0">> +                                  dst_st->sgl, dst_pat,</div>
<div class="ContentPasted0">>                                    i915_ttm_gtt_binds_lmem(dst_mem),</div>
<div class="ContentPasted0">>                                    &rq);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c</div>
<div class="ContentPasted0">> index df6c9a84252c..c8925918784e 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c</div>
<div class="ContentPasted0">> @@ -354,7 +354,7 @@ fake_huge_pages_object(struct drm_i915_private *i915, u64 size, bool single)</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     obj->write_domain = I915_GEM_DOMAIN_CPU;</div>
<div class="ContentPasted0">>     obj->read_domains = I915_GEM_DOMAIN_CPU;</div>
<div class="ContentPasted0">> -   obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);</div>
<div class="ContentPasted0">> +   i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     return obj;</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c</div>
<div class="ContentPasted0">> index c2bdc133c89a..fb69f667652a 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c</div>
<div class="ContentPasted0">> @@ -226,9 +226,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)</div>
<div class="ContentPasted0">>           return ret;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     vm->scratch[0]->encode =</div>
<div class="ContentPasted0">> -         vm->pte_encode(px_dma(vm->scratch[0]),</div>
<div class="ContentPasted0">> -                      i915_gem_get_pat_index(vm->i915,</div>
<div class="ContentPasted0">> -                                       I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +         vm->pte_encode(px_dma(vm->scratch[0]), vm->i915->pat_uc,</div>
<div class="ContentPasted0">>                        PTE_READ_ONLY);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     vm->scratch[1] = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c</div>
<div class="ContentPasted0">> index f948d33e5ec5..a6692ea1a91e 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c</div>
<div class="ContentPasted0">> @@ -40,16 +40,11 @@ static u64 gen8_pte_encode(dma_addr_t addr,</div>
<div class="ContentPasted0">>     if (flags & PTE_LM)</div>
<div class="ContentPasted0">>           pte |= GEN12_PPGTT_PTE_LM;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   /*</div>
<div class="ContentPasted0">> -    * For pre-gen12 platforms pat_index is the same as enum</div>
<div class="ContentPasted0">> -    * i915_cache_level, so the switch-case here is still valid.</div>
<div class="ContentPasted0">> -    * See translation table defined by LEGACY_CACHELEVEL.</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">>     switch (pat_index) {</div>
<div class="ContentPasted0">> -   case I915_CACHE_NONE:</div>
<div class="ContentPasted0">> +   case 0:</div>
<div class="ContentPasted0">>           pte |= PPAT_UNCACHED;</div>
<div class="ContentPasted0">>           break;</div>
<div class="ContentPasted0">> -   case I915_CACHE_WT:</div>
<div class="ContentPasted0">> +   case 3:</div>
<div class="ContentPasted0">>           pte |= PPAT_DISPLAY_ELLC;</div>
<div class="ContentPasted0">>           break;</div>
<div class="ContentPasted0">>     default:</div>
<div class="ContentPasted0">> @@ -853,9 +848,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)</div>
<div class="ContentPasted0">>           pte_flags |= PTE_LM;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     vm->scratch[0]->encode =</div>
<div class="ContentPasted0">> -         vm->pte_encode(px_dma(vm->scratch[0]),</div>
<div class="ContentPasted0">> -                      i915_gem_get_pat_index(vm->i915,</div>
<div class="ContentPasted0">> -                                       I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +         vm->pte_encode(px_dma(vm->scratch[0]), vm->i915->pat_uc,</div>
<div class="ContentPasted0">>                        pte_flags);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     for (i = 1; i <= vm->top; i++) {</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c</div>
<div class="ContentPasted0">> index dd0ed941441a..c97379cf8241 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c</div>
<div class="ContentPasted0">> @@ -921,9 +921,7 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)</div>
<div class="ContentPasted0">>           pte_flags |= PTE_LM;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     ggtt->vm.scratch[0]->encode =</div>
<div class="ContentPasted0">> -         ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),</div>
<div class="ContentPasted0">> -                         i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> -                                          I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +         ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]), i915->pat_uc,</div>
<div class="ContentPasted0">>                           pte_flags);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     return 0;</div>
<div class="ContentPasted0">> @@ -1297,10 +1295,8 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)</div>
<div class="ContentPasted0">>            * ptes to be repopulated.</div>
<div class="ContentPasted0">>            */</div>
<div class="ContentPasted0">>           vma->resource->bound_flags = 0;</div>
<div class="ContentPasted0">> -         vma->ops->bind_vma(vm, NULL, vma->resource,</div>
<div class="ContentPasted0">> -                        obj ? obj->pat_index :</div>
<div class="ContentPasted0">> -                            i915_gem_get_pat_index(vm->i915,</div>
<div class="ContentPasted0">> -                                             I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +         vma->ops->bind_vma(vm, NULL, vma->resource,</div>
<div class="ContentPasted0">> +                        obj ? obj->pat_index : vm->i915->pat_uc,</div>
<div class="ContentPasted0">>                          was_bound);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           if (obj) { /* only used during resume => exclusive access */</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c</div>
<div class="ContentPasted0">> index 6023288b0e2d..81f7834cc2db 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c</div>
<div class="ContentPasted0">> @@ -45,9 +45,7 @@ static void xehpsdv_toggle_pdes(struct i915_address_space *vm,</div>
<div class="ContentPasted0">>      * Insert a dummy PTE into every PT that will map to LMEM to ensure</div>
<div class="ContentPasted0">>      * we have a correctly setup PDE structure for later use.</div>
<div class="ContentPasted0">>      */</div>
<div class="ContentPasted0">> -   vm->insert_page(vm, 0, d->offset,</div>
<div class="ContentPasted0">> -               i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -               PTE_LM);</div>
<div class="ContentPasted0">> +   vm->insert_page(vm, 0, d->offset, vm->i915->pat_uc, PTE_LM);</div>
<div class="ContentPasted0">>     GEM_BUG_ON(!pt->is_compact);</div>
<div class="ContentPasted0">>     d->offset += SZ_2M;</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">> @@ -65,9 +63,7 @@ static void xehpsdv_insert_pte(struct i915_address_space *vm,</div>
<div class="ContentPasted0">>      * alignment is 64K underneath for the pt, and we are careful</div>
<div class="ContentPasted0">>      * not to access the space in the void.</div>
<div class="ContentPasted0">>      */</div>
<div class="ContentPasted0">> -   vm->insert_page(vm, px_dma(pt), d->offset,</div>
<div class="ContentPasted0">> -               i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -               PTE_LM);</div>
<div class="ContentPasted0">> +   vm->insert_page(vm, px_dma(pt), d->offset, vm->i915->pat_uc, PTE_LM);</div>
<div class="ContentPasted0">>     d->offset += SZ_64K;</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> @@ -77,8 +73,7 @@ static void insert_pte(struct i915_address_space *vm,</div>
<div class="ContentPasted0">>  {</div>
<div class="ContentPasted0">>     struct insert_pte_data *d = data;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   vm->insert_page(vm, px_dma(pt), d->offset,</div>
<div class="ContentPasted0">> -               i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +   vm->insert_page(vm, px_dma(pt), d->offset, vm->i915->pat_uc,</div>
<div class="ContentPasted0">>                 i915_gem_object_is_lmem(pt->base) ? PTE_LM : 0);</div>
<div class="ContentPasted0">>     d->offset += PAGE_SIZE;</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c</div>
<div class="ContentPasted0">> index 3def5ca72dec..a67ede65d816 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/selftest_migrate.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c</div>
<div class="ContentPasted0">> @@ -904,8 +904,7 @@ static int perf_clear_blt(void *arg)</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           err = __perf_clear_blt(gt->migrate.context,</div>
<div class="ContentPasted0">>                              dst->mm.pages->sgl,</div>
<div class="ContentPasted0">> -                            i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> -                                             I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +                            gt->i915->pat_uc,</div>
<div class="ContentPasted0">>                              i915_gem_object_is_lmem(dst),</div>
<div class="ContentPasted0">>                              sizes[i]);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> @@ -995,12 +994,10 @@ static int perf_copy_blt(void *arg)</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           err = __perf_copy_blt(gt->migrate.context,</div>
<div class="ContentPasted0">>                             src->mm.pages->sgl,</div>
<div class="ContentPasted0">> -                           i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> -                                            I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +                           gt->i915->pat_uc,</div>
<div class="ContentPasted0">>                             i915_gem_object_is_lmem(src),</div>
<div class="ContentPasted0">>                             dst->mm.pages->sgl,</div>
<div class="ContentPasted0">> -                           i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> -                                            I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +                           gt->i915->pat_uc,</div>
<div class="ContentPasted0">>                             i915_gem_object_is_lmem(dst),</div>
<div class="ContentPasted0">>                             sz);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c</div>
<div class="ContentPasted0">> index 79aa6ac66ad2..327dc9294e0f 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/selftest_reset.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c</div>
<div class="ContentPasted0">> @@ -84,11 +84,8 @@ __igt_reset_stolen(struct intel_gt *gt,</div>
<div class="ContentPasted0">>           void __iomem *s;</div>
<div class="ContentPasted0">>           void *in;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -         ggtt->vm.insert_page(&ggtt->vm, dma,</div>
<div class="ContentPasted0">> -                          ggtt->error_capture.start,</div>
<div class="ContentPasted0">> -                          i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> -                                           I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                          0);</div>
<div class="ContentPasted0">> +         ggtt->vm.insert_page(&ggtt->vm, dma, ggtt->error_capture.start,</div>
<div class="ContentPasted0">> +                          gt->i915->pat_uc, 0);</div>
<div class="ContentPasted0">>           mb();</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           s = io_mapping_map_wc(&ggtt->iomap,</div>
<div class="ContentPasted0">> @@ -127,11 +124,8 @@ __igt_reset_stolen(struct intel_gt *gt,</div>
<div class="ContentPasted0">>           void *in;</div>
<div class="ContentPasted0">>           u32 x;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -         ggtt->vm.insert_page(&ggtt->vm, dma,</div>
<div class="ContentPasted0">> -                          ggtt->error_capture.start,</div>
<div class="ContentPasted0">> -                          i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> -                                           I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                          0);</div>
<div class="ContentPasted0">> +         ggtt->vm.insert_page(&ggtt->vm, dma, ggtt->error_capture.start,</div>
<div class="ContentPasted0">> +                          gt->i915->pat_uc, 0);</div>
<div class="ContentPasted0">>           mb();</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           s = io_mapping_map_wc(&ggtt->iomap,</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c</div>
<div class="ContentPasted0">> index 39c3ec12df1a..db64dc7d3fce 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/selftest_timeline.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c</div>
<div class="ContentPasted0">> @@ -836,7 +836,10 @@ static int setup_watcher(struct hwsp_watcher *w, struct intel_gt *gt,</div>
<div class="ContentPasted0">>           return PTR_ERR(obj);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     /* keep the same cache settings as timeline */</div>
<div class="ContentPasted0">> -   i915_gem_object_set_pat_index(obj, tl->hwsp_ggtt->obj->pat_index);</div>
<div class="ContentPasted0">> +   obj->pat_index = tl->hwsp_ggtt->obj->pat_index;</div>
<div class="ContentPasted0">> +   obj->cache_mode = tl->hwsp_ggtt->obj->cache_mode;</div>
<div class="ContentPasted0">> +   __i915_gem_object_update_coherency(obj);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">>     w->map = i915_gem_object_pin_map_unlocked(obj,</div>
<div class="ContentPasted0">>                                     page_unmask_bits(tl->hwsp_ggtt->obj->mm.mapping));</div>
<div class="ContentPasted0">>     if (IS_ERR(w->map)) {</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/selftest_tlb.c b/drivers/gpu/drm/i915/gt/selftest_tlb.c</div>
<div class="ContentPasted0">> index 3bd6b540257b..6049f01be219 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/selftest_tlb.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/selftest_tlb.c</div>
<div class="ContentPasted0">> @@ -36,8 +36,6 @@ pte_tlbinv(struct intel_context *ce,</div>
<div class="ContentPasted0">>        u64 length,</div>
<div class="ContentPasted0">>        struct rnd_state *prng)</div>
<div class="ContentPasted0">>  {</div>
<div class="ContentPasted0">> -   const unsigned int pat_index =</div>
<div class="ContentPasted0">> -         i915_gem_get_pat_index(ce->vm->i915, I915_CACHE_NONE);</div>
<div class="ContentPasted0">>     struct drm_i915_gem_object *batch;</div>
<div class="ContentPasted0">>     struct drm_mm_node vb_node;</div>
<div class="ContentPasted0">>     struct i915_request *rq;</div>
<div class="ContentPasted0">> @@ -157,7 +155,8 @@ pte_tlbinv(struct intel_context *ce,</div>
<div class="ContentPasted0">>           /* Flip the PTE between A and B */</div>
<div class="ContentPasted0">>           if (i915_gem_object_is_lmem(vb->obj))</div>
<div class="ContentPasted0">>                 pte_flags |= PTE_LM;</div>
<div class="ContentPasted0">> -         ce->vm->insert_entries(ce->vm, &vb_res, pat_index, pte_flags);</div>
<div class="ContentPasted0">> +         ce->vm->insert_entries(ce->vm, &vb_res, ce->vm->i915->pat_uc,</div>
<div class="ContentPasted0">> +                            pte_flags);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           /* Flush the PTE update to concurrent HW */</div>
<div class="ContentPasted0">>           tlbinv(ce->vm, addr & -length, length);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c</div>
<div class="ContentPasted0">> index d408856ae4c0..e099414d624d 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c</div>
<div class="ContentPasted0">> @@ -991,14 +991,10 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     if (ggtt->vm.raw_insert_entries)</div>
<div class="ContentPasted0">>           ggtt->vm.raw_insert_entries(&ggtt->vm, vma_res,</div>
<div class="ContentPasted0">> -                               i915_gem_get_pat_index(ggtt->vm.i915,</div>
<div class="ContentPasted0">> -                                                I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                               pte_flags);</div>
<div class="ContentPasted0">> +                               ggtt->vm.i915->pat_uc, pte_flags);</div>
<div class="ContentPasted0">>     else</div>
<div class="ContentPasted0">>           ggtt->vm.insert_entries(&ggtt->vm, vma_res,</div>
<div class="ContentPasted0">> -                           i915_gem_get_pat_index(ggtt->vm.i915,</div>
<div class="ContentPasted0">> -                                              I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                           pte_flags);</div>
<div class="ContentPasted0">> +                           ggtt->vm.i915->pat_uc, pte_flags);</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  static void uc_fw_unbind_ggtt(struct intel_uc_fw *uc_fw)</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_cache.c b/drivers/gpu/drm/i915/i915_cache.c</div>
<div class="ContentPasted0">> new file mode 100644</div>
<div class="ContentPasted0">> index 000000000000..7a8002ebd2ec</div>
<div class="ContentPasted0">> --- /dev/null</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_cache.c</div>
<div class="ContentPasted0">> @@ -0,0 +1,59 @@</div>
<div class="ContentPasted0">> +/*</div>
<div class="ContentPasted0">> + * SPDX-License-Identifier: MIT</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Copyright © 2023 Intel Corporation</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">> +#include "i915_drv.h"</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +static int find_pat(const struct intel_device_info *info, i915_cache_t mode)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> +   int i;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   for (i = 0; i < ARRAY_SIZE(info->cache_modes); i++) {</div>
<div class="ContentPasted0">> +         if (info->cache_modes[i] == mode)</div>
<div class="ContentPasted0">> +               return i;</div>
<div class="ContentPasted0">> +   }</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   return -1;</div>
<div class="ContentPasted0">> +}</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +void i915_cache_init(struct drm_i915_private *i915)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> +   int ret;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   ret = find_pat(INTEL_INFO(i915), I915_CACHE(UC));</div>
<div class="ContentPasted0">> +   WARN_ON(ret < 0);</div>
<div class="ContentPasted0">> +   drm_info(&i915->drm, "Using PAT index %u for uncached access\n", ret);</div>
<div class="ContentPasted0">> +   i915->pat_uc = ret;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   ret = find_pat(INTEL_INFO(i915), I915_CACHE(WB));</div>
<div class="ContentPasted0">> +   WARN_ON(ret < 0);</div>
<div class="ContentPasted0">> +   drm_info(&i915->drm, "Using PAT index %u for write-back access\n", ret);</div>
<div class="ContentPasted0">> +   i915->pat_wb = ret;</div>
<div class="ContentPasted0">> +}</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">I don't think we need the two functions above. Why not hard-code pat_uc and</div>
<div class="ContentPasted0">pat_wb (and pat_wt too) in device_info? These values are predetermined per</div>
<div class="ContentPasted0">platform and used only by the KMD.</div>
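<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">A rough sketch of the idea, written as standalone C so it compiles outside the kernel. The names (sketch_device_info, sketch_gen12_info, sketch_mtl_info, sketch_uncached_pat) are made up for illustration and are not existing i915 symbols; the index values are the ones the per-platform debugfs tables removed by this patch list for Gen12 and MTL.</div>
<pre class="ContentPasted0">
```c
#include <assert.h>

/* Hypothetical stand-in for struct intel_device_info; illustration only. */
struct sketch_device_info {
	const char *name;
	unsigned int pat_uc; /* PAT index for uncached */
	unsigned int pat_wb; /* PAT index for write-back */
	unsigned int pat_wt; /* PAT index for write-through */
};

/* Index values as listed in the debugfs tables quoted in this patch. */
static const struct sketch_device_info sketch_gen12_info = {
	.name = "gen12",
	.pat_uc = 3,
	.pat_wb = 0,
	.pat_wt = 2,
};

static const struct sketch_device_info sketch_mtl_info = {
	.name = "mtl",
	.pat_uc = 2,
	.pat_wb = 0,
	.pat_wt = 1,
};

/* Sanity check for a table entry: the three modes need distinct indices. */
static void sketch_check_info(const struct sketch_device_info *info)
{
	assert(info->pat_uc != info->pat_wb);
	assert(info->pat_uc != info->pat_wt);
	assert(info->pat_wb != info->pat_wt);
}

/* A caller that needs an uncached mapping just reads the constant. */
static unsigned int sketch_uncached_pat(const struct sketch_device_info *info)
{
	return info->pat_uc;
}
```
</pre>
<div class="ContentPasted0">With constants like these there would be nothing to search for at probe time, so find_pat() and i915_cache_init() could go away.</div>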
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +int i915_cache_level_to_pat_and_mode(struct drm_i915_private *i915,</div>
<div class="ContentPasted0">> +                          unsigned int cache_level,</div>
<div class="ContentPasted0">> +                          i915_cache_t *mode)</div>
<div class="ContentPasted0">> +{</div>
<div class="ContentPasted0">> +   const struct intel_device_info *info = INTEL_INFO(i915);</div>
<div class="ContentPasted0">> +   i915_cache_t level_to_mode[] = {</div>
<div class="ContentPasted0">> +         [I915_CACHE_NONE] = I915_CACHE(UC),</div>
<div class="ContentPasted0">> +         [I915_CACHE_WT]         = I915_CACHE(WT),</div>
<div class="ContentPasted0">> +         [I915_CACHE_LLC]  = I915_CACHE(WB),</div>
<div class="ContentPasted0">> +   };</div>
<div class="ContentPasted0">> +   int ret;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   if (GRAPHICS_VER(i915) >= 12)</div>
<div class="ContentPasted0">> +         level_to_mode[I915_CACHE_L3_LLC] = I915_CACHE(WB);</div>
<div class="ContentPasted0">> +   else</div>
<div class="ContentPasted0">> +         level_to_mode[I915_CACHE_L3_LLC] = _I915_CACHE(WB, LLC);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   ret = find_pat(info, level_to_mode[cache_level]);</div>
<div class="ContentPasted0">> +   if (ret >= 0 && mode)</div>
<div class="ContentPasted0">> +         *mode = info->cache_modes[ret];</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   return ret;</div>
<div class="ContentPasted0">> +}</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_cache.h b/drivers/gpu/drm/i915/i915_cache.h</div>
<div class="ContentPasted0">> new file mode 100644</div>
<div class="ContentPasted0">> index 000000000000..0df03f1f01ef</div>
<div class="ContentPasted0">> --- /dev/null</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_cache.h</div>
<div class="ContentPasted0">> @@ -0,0 +1,129 @@</div>
<div class="ContentPasted0">> +/* SPDX-License-Identifier: MIT */</div>
<div class="ContentPasted0">> +/*</div>
<div class="ContentPasted0">> + * Copyright © 2023 Intel Corporation</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#ifndef __I915_CACHE_H__</div>
<div class="ContentPasted0">> +#define __I915_CACHE_H__</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#include <linux/types.h></div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +struct drm_i915_private;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +typedef u16 i915_cache_t;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#define I915_CACHE(mode) \</div>
<div class="ContentPasted0">> +   (i915_cache_t)(I915_CACHE_MODE_##mode)</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#define _I915_CACHE(mode, flag) \</div>
<div class="ContentPasted0">> +   (i915_cache_t)((I915_CACHE_MODE_##mode) | ( BIT(8 + I915_CACHE_##flag)))</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE(cache) \</div>
<div class="ContentPasted0">> +   (unsigned int)(((i915_cache_t)(cache)) & 0xff)</div>
<div class="ContentPasted0">> +#define I915_CACHE_FLAGS(cache) \</div>
<div class="ContentPasted0">> +   (unsigned int)((((i915_cache_t)(cache) & 0xff00)) >> 16)</div>
<div class="ContentPasted0">> +</div>
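<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Unless this is being misread, I915_CACHE_FLAGS() masks with 0xff00 and then shifts right by 16, which always yields 0 for a 16-bit value, while _I915_CACHE() places the flag bits starting at bit 8. A shift of 8 looks like what was intended. The standalone sketch below uses made-up SKETCH_ names (not i915 symbols) that mirror the quoted macros and show both shifts side by side.</div>
<pre class="ContentPasted0">
```c
#include <assert.h>

typedef unsigned short sketch_cache_t; /* stands in for the u16 i915_cache_t */

#define SKETCH_MODE_WB    2u /* same value as I915_CACHE_MODE_WB in the patch */
#define SKETCH_FLAG_L3    0u /* same bit number as I915_CACHE_L3 */
#define SKETCH_FLAG_COH1W 1u /* same bit number as I915_CACHE_COH1W */

/* Pack: mode in bits 0-7, flag bit numbers offset by 8 (as _I915_CACHE does). */
#define SKETCH_CACHE(mode, flag) \
	((sketch_cache_t)((mode) | (1u << (8 + (flag)))))

#define SKETCH_CACHE_MODE(cache) ((unsigned int)((cache) & 0xffu))

/* Shift as quoted in the patch: always 0 for a 16-bit value. */
#define SKETCH_CACHE_FLAGS_SHIFT16(cache) \
	((unsigned int)(((cache) & 0xff00u) >> 16))

/* Shift that matches the packing offset of 8. */
#define SKETCH_CACHE_FLAGS(cache) \
	((unsigned int)(((cache) & 0xff00u) >> 8))

/* Checked wrapper: mode and flag number must fit their 8-bit fields. */
static sketch_cache_t sketch_pack(unsigned int mode, unsigned int flag)
{
	assert(mode <= 0xffu && flag < 8u);
	return SKETCH_CACHE(mode, flag);
}
```
</pre>
<div class="ContentPasted0">For example, packing WB with the COH1W flag gives 0x0202; the shift-by-8 variant returns 0x2 (bit 1 set) for the flags, whereas the shift-by-16 variant returns 0.</div>
<div><br class="ContentPasted0">
</div>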
<div class="ContentPasted0">> +/* Cache mode values */</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_UNKNOWN (0)</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_UC (1)</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_WB (2)</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_WT (3)</div>
<div class="ContentPasted0">> +#define I915_CACHE_MODE_WC (4)</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Why do we need these CACHE_MODE values? Aren't they essentially the same as</div>
<div class="ContentPasted0">i915_cache_level, which also needs some sort of translation?</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +/* Mode flag bits */</div>
<div class="ContentPasted0">> +#define I915_CACHE_L3            (0)</div>
<div class="ContentPasted0">> +#define I915_CACHE_COH1W   (1)</div>
<div class="ContentPasted0">> +#define I915_CACHE_COH2W   (2)</div>
<div class="ContentPasted0">> +#define I915_CACHE_CLOS1   (3)</div>
<div class="ContentPasted0">> +#define I915_CACHE_CLOS2   (4)</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">These are already defined in drivers/gpu/drm/i915/gt/intel_gtt.h, so why add new ones?</div>
<div class="ContentPasted0">The CLOS ones are not needed upstream unless we want to support PVC here.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +void i915_cache_init(struct drm_i915_private *i915);</div>
<div class="ContentPasted0">> +int i915_cache_level_to_pat_and_mode(struct drm_i915_private *i915,</div>
<div class="ContentPasted0">> +                          unsigned int cache_level,</div>
<div class="ContentPasted0">> +                          i915_cache_t *mode);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +/*</div>
<div class="ContentPasted0">> + * Legacy/kernel internal interface below:</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +/**</div>
<div class="ContentPasted0">> + * enum i915_cache_level - The supported GTT caching values for system memory</div>
<div class="ContentPasted0">> + * pages.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * These translate to some special GTT PTE bits when binding pages into some</div>
<div class="ContentPasted0">> + * address space. It also determines whether an object, or rather its pages are</div>
<div class="ContentPasted0">> + * coherent with the GPU, when also reading or writing through the CPU cache</div>
<div class="ContentPasted0">> + * with those pages.</div>
<div class="ContentPasted0">> + *</div>
<div class="ContentPasted0">> + * Userspace can also control this through struct drm_i915_gem_caching.</div>
<div class="ContentPasted0">> + */</div>
<div class="ContentPasted0">> +enum i915_cache_level {</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Shouldn't we get rid of this enum completely now? It should be replaced by</div>
<div class="ContentPasted0">INTEL_INFO(i915)->pat_uc, ->pat_wb and ->pat_wt.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +   /**</div>
<div class="ContentPasted0">> +    * @I915_CACHE_NONE:</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * GPU access is not coherent with the CPU cache. If the cache is dirty</div>
<div class="ContentPasted0">> +    * and we need the underlying pages to be coherent with some later GPU</div>
<div class="ContentPasted0">> +    * access then we need to manually flush the pages.</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * On shared LLC platforms reads and writes through the CPU cache are</div>
<div class="ContentPasted0">> +    * still coherent even with this setting. See also</div>
<div class="ContentPasted0">> +    * &drm_i915_gem_object.cache_coherent for more details. Due to this we</div>
<div class="ContentPasted0">> +    * should only ever use uncached for scanout surfaces, otherwise we end</div>
<div class="ContentPasted0">> +    * up over-flushing in some places.</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * This is the default on non-LLC platforms.</div>
<div class="ContentPasted0">> +    */</div>
<div class="ContentPasted0">> +   I915_CACHE_NONE = 0,</div>
<div class="ContentPasted0">> +   /**</div>
<div class="ContentPasted0">> +    * @I915_CACHE_LLC:</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * GPU access is coherent with the CPU cache. If the cache is dirty,</div>
<div class="ContentPasted0">> +    * then the GPU will ensure that access remains coherent, when both</div>
<div class="ContentPasted0">> +    * reading and writing through the CPU cache. GPU writes can dirty the</div>
<div class="ContentPasted0">> +    * CPU cache.</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * Not used for scanout surfaces.</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * Applies to both platforms with shared LLC(HAS_LLC), and snooping</div>
<div class="ContentPasted0">> +    * based platforms(HAS_SNOOP).</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * This is the default on shared LLC platforms.  The only exception is</div>
<div class="ContentPasted0">> +    * scanout objects, where the display engine is not coherent with the</div>
<div class="ContentPasted0">> +    * CPU cache. For such objects I915_CACHE_NONE or I915_CACHE_WT is</div>
<div class="ContentPasted0">> +    * automatically applied by the kernel in pin_for_display, if userspace</div>
<div class="ContentPasted0">> +    * has not done so already.</div>
<div class="ContentPasted0">> +    */</div>
<div class="ContentPasted0">> +   I915_CACHE_LLC,</div>
<div class="ContentPasted0">> +   /**</div>
<div class="ContentPasted0">> +    * @I915_CACHE_L3_LLC:</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * Explicitly enable the Gfx L3 cache, with coherent LLC.</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * The Gfx L3 sits between the domain specific caches, e.g</div>
<div class="ContentPasted0">> +    * sampler/render caches, and the larger LLC. LLC is coherent with the</div>
<div class="ContentPasted0">> +    * GPU, but L3 is only visible to the GPU, so likely needs to be flushed</div>
<div class="ContentPasted0">> +    * when the workload completes.</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * Not used for scanout surfaces.</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * Only exposed on some gen7 + GGTT. More recent hardware has dropped</div>
<div class="ContentPasted0">> +    * this explicit setting, where it should now be enabled by default.</div>
<div class="ContentPasted0">> +    */</div>
<div class="ContentPasted0">> +   I915_CACHE_L3_LLC,</div>
<div class="ContentPasted0">> +   /**</div>
<div class="ContentPasted0">> +    * @I915_CACHE_WT:</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * Write-through. Used for scanout surfaces.</div>
<div class="ContentPasted0">> +    *</div>
<div class="ContentPasted0">> +    * The GPU can utilise the caches, while still having the display engine</div>
<div class="ContentPasted0">> +    * be coherent with GPU writes, as a result we don't need to flush the</div>
<div class="ContentPasted0">> +    * CPU caches when moving out of the render domain. This is the default</div>
<div class="ContentPasted0">> +    * setting chosen by the kernel, if supported by the HW, otherwise we</div>
<div class="ContentPasted0">> +    * fallback to I915_CACHE_NONE. On the CPU side writes through the CPU</div>
<div class="ContentPasted0">> +    * cache still need to be flushed, to remain coherent with the display</div>
<div class="ContentPasted0">> +    * engine.</div>
<div class="ContentPasted0">> +    */</div>
<div class="ContentPasted0">> +   I915_CACHE_WT,</div>
<div class="ContentPasted0">> +};</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +#endif /* __I915_CACHE_H__ */</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c</div>
<div class="ContentPasted0">> index 76ccd4e03e31..e2da57397770 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_debugfs.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_debugfs.c</div>
<div class="ContentPasted0">> @@ -139,48 +139,37 @@ static const char *stringify_vma_type(const struct i915_vma *vma)</div>
<div class="ContentPasted0">>     return "ppgtt";</div>
<div class="ContentPasted0">>  }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -static const char *i915_cache_level_str(struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">> +static void obj_cache_str(struct seq_file *m, struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">>  {</div>
<div class="ContentPasted0">> -   struct drm_i915_private *i915 = obj_to_i915(obj);</div>
<div class="ContentPasted0">> +   const i915_cache_t cache = obj->cache_mode;</div>
<div class="ContentPasted0">> +   const unsigned int mode = I915_CACHE_MODE(cache);</div>
<div class="ContentPasted0">> +   const unsigned long flags = I915_CACHE_FLAGS(cache);</div>
<div class="ContentPasted0">> +   static const char *mode_str[] = {</div>
<div class="ContentPasted0">> +         [I915_CACHE_MODE_UC] = "UC",</div>
<div class="ContentPasted0">> +         [I915_CACHE_MODE_WB] = "WB",</div>
<div class="ContentPasted0">> +         [I915_CACHE_MODE_WT] = "WT",</div>
<div class="ContentPasted0">> +         [I915_CACHE_MODE_WC] = "WC",</div>
<div class="ContentPasted0">> +   };</div>
<div class="ContentPasted0">> +   static const char *flag_str[] = {</div>
<div class="ContentPasted0">> +         [I915_CACHE_L3] = "L3",</div>
<div class="ContentPasted0">> +         [I915_CACHE_COH1W] = "1-Way-Coherent",</div>
<div class="ContentPasted0">> +         [I915_CACHE_COH2W] = "2-Way-Coherent",</div>
<div class="ContentPasted0">> +         [I915_CACHE_CLOS1] = "CLOS1",</div>
<div class="ContentPasted0">> +         [I915_CACHE_CLOS2] = "CLOS2",</div>
<div class="ContentPasted0">> +   };</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   if (IS_METEORLAKE(i915)) {</div>
<div class="ContentPasted0">> -         switch (obj->pat_index) {</div>
<div class="ContentPasted0">> -         case 0: return " WB";</div>
<div class="ContentPasted0">> -         case 1: return " WT";</div>
<div class="ContentPasted0">> -         case 2: return " UC";</div>
<div class="ContentPasted0">> -         case 3: return " WB (1-Way Coh)";</div>
<div class="ContentPasted0">> -         case 4: return " WB (2-Way Coh)";</div>
<div class="ContentPasted0">> -         default: return " not defined";</div>
<div class="ContentPasted0">> -         }</div>
<div class="ContentPasted0">> -   } else if (IS_PONTEVECCHIO(i915)) {</div>
<div class="ContentPasted0">> -         switch (obj->pat_index) {</div>
<div class="ContentPasted0">> -         case 0: return " UC";</div>
<div class="ContentPasted0">> -         case 1: return " WC";</div>
<div class="ContentPasted0">> -         case 2: return " WT";</div>
<div class="ContentPasted0">> -         case 3: return " WB";</div>
<div class="ContentPasted0">> -         case 4: return " WT (CLOS1)";</div>
<div class="ContentPasted0">> -         case 5: return " WB (CLOS1)";</div>
<div class="ContentPasted0">> -         case 6: return " WT (CLOS2)";</div>
<div class="ContentPasted0">> -         case 7: return " WT (CLOS2)";</div>
<div class="ContentPasted0">> -         default: return " not defined";</div>
<div class="ContentPasted0">> -         }</div>
<div class="ContentPasted0">> -   } else if (GRAPHICS_VER(i915) >= 12) {</div>
<div class="ContentPasted0">> -         switch (obj->pat_index) {</div>
<div class="ContentPasted0">> -         case 0: return " WB";</div>
<div class="ContentPasted0">> -         case 1: return " WC";</div>
<div class="ContentPasted0">> -         case 2: return " WT";</div>
<div class="ContentPasted0">> -         case 3: return " UC";</div>
<div class="ContentPasted0">> -         default: return " not defined";</div>
<div class="ContentPasted0">> -         }</div>
<div class="ContentPasted0">> +   if (mode == I915_CACHE_MODE_UNKNOWN || mode > ARRAY_SIZE(mode_str)) {</div>
<div class="ContentPasted0">> +         if (obj->pat_set_by_user)</div>
<div class="ContentPasted0">> +               seq_printf(m, " PAT-%u", obj->pat_index);</div>
<div class="ContentPasted0">> +         else</div>
<div class="ContentPasted0">> +               seq_printf(m, " PAT-%u-???", obj->pat_index);</div>
<div class="ContentPasted0">>     } else {</div>
<div class="ContentPasted0">> -         switch (obj->pat_index) {</div>
<div class="ContentPasted0">> -         case 0: return " UC";</div>
<div class="ContentPasted0">> -         case 1: return HAS_LLC(i915) ?</div>
<div class="ContentPasted0">> -                      " LLC" : " snooped";</div>
<div class="ContentPasted0">> -         case 2: return " L3+LLC";</div>
<div class="ContentPasted0">> -         case 3: return " WT";</div>
<div class="ContentPasted0">> -         default: return " not defined";</div>
<div class="ContentPasted0">> -         }</div>
<div class="ContentPasted0">> +         unsigned long bit;</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +         seq_printf(m, " %s", mode_str[mode]);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +         for_each_set_bit(bit, &flags, sizeof(i915_cache_t))</div>
<div class="ContentPasted0">> +               seq_printf(m, "-%s", flag_str[bit]);</div>
<div class="ContentPasted0">>     }</div>
<div class="ContentPasted0">>  }</div>
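<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">Two details in obj_cache_str() look off on a first read. The bound check uses mode > ARRAY_SIZE(mode_str), so a mode equal to ARRAY_SIZE would still index past the end of the table; >= seems intended. Also, for_each_set_bit() takes a size in bits, while sizeof(i915_cache_t) is 2 (bytes), so at most the first two flag bits would be considered; ARRAY_SIZE(flag_str) or BITS_PER_TYPE() may be closer to the intent. Below is a userspace sketch of the same printing logic with both points adjusted. All sketch_ names are hypothetical, and a plain loop stands in for for_each_set_bit().</div>
<pre class="ContentPasted0">
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Same index-to-name mapping as mode_str/flag_str in the quoted patch. */
static const char *const sketch_mode_str[] = {
	[1] = "UC", [2] = "WB", [3] = "WT", [4] = "WC",
};
static const char *const sketch_flag_str[] = {
	[0] = "L3", [1] = "1-Way-Coherent", [2] = "2-Way-Coherent",
	[3] = "CLOS1", [4] = "CLOS2",
};

/* Format "MODE-FLAG-FLAG..." into buf and return buf. */
static const char *sketch_cache_str(char *buf, size_t len,
				    unsigned int mode, unsigned int flags)
{
	size_t pos = 0;
	unsigned int bit;

	assert(len > 0);
	buf[0] = '\0';

	/* >= rather than >: index ARRAY_SIZE itself is already out of bounds. */
	if (mode == 0 || mode >= SKETCH_ARRAY_SIZE(sketch_mode_str)) {
		snprintf(buf, len, "???");
		return buf;
	}

	pos += snprintf(buf + pos, len - pos, "%s", sketch_mode_str[mode]);

	/* Walk every defined flag bit rather than sizeof(type) bits. */
	for (bit = 0; bit < SKETCH_ARRAY_SIZE(sketch_flag_str) && pos < len; bit++) {
		if (flags & (1u << bit))
			pos += snprintf(buf + pos, len - pos, "-%s",
					sketch_flag_str[bit]);
	}

	return buf;
}
```
</pre>
<div class="ContentPasted0">With this, mode 2 with flag bits 0 and 4 set formats as "WB-L3-CLOS2", and an out-of-range mode such as 5 falls into the "???" branch instead of reading past the table.</div>
<div><br class="ContentPasted0">
</div>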
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> @@ -190,17 +179,23 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)</div>
<div class="ContentPasted0">>     struct i915_vma *vma;</div>
<div class="ContentPasted0">>     int pin_count = 0;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   seq_printf(m, "%pK: %c%c%c %8zdKiB %02x %02x %s%s%s",</div>
<div class="ContentPasted0">> +   seq_printf(m, "%pK: %c%c%c %8zdKiB %02x %02x",</div>
<div class="ContentPasted0">>              &obj->base,</div>
<div class="ContentPasted0">>              get_tiling_flag(obj),</div>
<div class="ContentPasted0">>              get_global_flag(obj),</div>
<div class="ContentPasted0">>              get_pin_mapped_flag(obj),</div>
<div class="ContentPasted0">>              obj->base.size / 1024,</div>
<div class="ContentPasted0">>              obj->read_domains,</div>
<div class="ContentPasted0">> -            obj->write_domain,</div>
<div class="ContentPasted0">> -            i915_cache_level_str(obj),</div>
<div class="ContentPasted0">> -            obj->mm.dirty ? " dirty" : "",</div>
<div class="ContentPasted0">> -            obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : "");</div>
<div class="ContentPasted0">> +            obj->write_domain);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   obj_cache_str(m, obj);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   if (obj->mm.dirty)</div>
<div class="ContentPasted0">> +         seq_puts(m, " dirty");</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">> +   if (obj->mm.madv == I915_MADV_DONTNEED)</div>
<div class="ContentPasted0">> +         seq_puts(m, " purgeable");</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">>     if (obj->base.name)</div>
<div class="ContentPasted0">>           seq_printf(m, " (name: %d)", obj->base.name);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c</div>
<div class="ContentPasted0">> index 222d0a1f3b55..deab26752ba4 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_driver.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_driver.c</div>
<div class="ContentPasted0">> @@ -80,6 +80,7 @@</div>
<div class="ContentPasted0">>  #include "soc/intel_dram.h"</div>
<div class="ContentPasted0">>  #include "soc/intel_gmch.h"</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">>  #include "i915_debugfs.h"</div>
<div class="ContentPasted0">>  #include "i915_driver.h"</div>
<div class="ContentPasted0">>  #include "i915_drm_client.h"</div>
<div class="ContentPasted0">> @@ -267,6 +268,8 @@ static int i915_driver_early_probe(struct drm_i915_private *dev_priv)</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     intel_detect_preproduction_hw(dev_priv);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> +   i915_cache_init(dev_priv);</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">>     return 0;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  err_rootgt:</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h</div>
<div class="ContentPasted0">> index b4cf6f0f636d..cb1c0c9d98ef 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_drv.h</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_drv.h</div>
<div class="ContentPasted0">> @@ -251,6 +251,9 @@ struct drm_i915_private {</div>
<div class="ContentPasted0">>     unsigned int hpll_freq;</div>
<div class="ContentPasted0">>     unsigned int czclk_freq;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> +   unsigned int pat_uc;</div>
<div class="ContentPasted0">> +   unsigned int pat_wb;</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">How about making these part of INTEL_INFO(i915)? They are predetermined, so there is no</div>
<div class="ContentPasted0">need to compute them at runtime.</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">>     /**</div>
<div class="ContentPasted0">>      * wq - Driver workqueue for GEM.</div>
<div class="ContentPasted0">>      *</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c</div>
<div class="ContentPasted0">> index 7ae42f746cc2..9aae75862e6f 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_gem.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_gem.c</div>
<div class="ContentPasted0">> @@ -422,9 +422,7 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">>                 ggtt->vm.insert_page(&ggtt->vm,</div>
<div class="ContentPasted0">>                                  i915_gem_object_get_dma_address(obj,</div>
<div class="ContentPasted0">>                                                          offset >> PAGE_SHIFT),</div>
<div class="ContentPasted0">> -                                node.start,</div>
<div class="ContentPasted0">> -                                i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> -                                                 I915_CACHE_NONE), 0);</div>
<div class="ContentPasted0">> +                                node.start, i915->pat_uc, 0);</div>
<div class="ContentPasted0">>           } else {</div>
<div class="ContentPasted0">>                 page_base += offset & PAGE_MASK;</div>
<div class="ContentPasted0">>           }</div>
<div class="ContentPasted0">> @@ -603,9 +601,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,</div>
<div class="ContentPasted0">>                 ggtt->vm.insert_page(&ggtt->vm,</div>
<div class="ContentPasted0">>                                  i915_gem_object_get_dma_address(obj,</div>
<div class="ContentPasted0">>                                                          offset >> PAGE_SHIFT),</div>
<div class="ContentPasted0">> -                                node.start,</div>
<div class="ContentPasted0">> -                                i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> -                                                 I915_CACHE_NONE), 0);</div>
<div class="ContentPasted0">> +                                node.start, i915->pat_uc, 0);</div>
<div class="ContentPasted0">>                 wmb(); /* flush modifications to the GGTT (insert_page) */</div>
<div class="ContentPasted0">>           } else {</div>
<div class="ContentPasted0">>                 page_base += offset & PAGE_MASK;</div>
<div class="ContentPasted0">> @@ -1148,19 +1144,6 @@ int i915_gem_init(struct drm_i915_private *dev_priv)</div>
<div class="ContentPasted0">>     unsigned int i;</div>
<div class="ContentPasted0">>     int ret;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   /*</div>
<div class="ContentPasted0">> -    * In the proccess of replacing cache_level with pat_index a tricky</div>
<div class="ContentPasted0">> -    * dependency is created on the definition of the enum i915_cache_level.</div>
<div class="ContentPasted0">> -    * in case this enum is changed, PTE encode would be broken.</div>
<div class="ContentPasted0">> -    * Add a WARNING here. And remove when we completely quit using this</div>
<div class="ContentPasted0">> -    * enum</div>
<div class="ContentPasted0">> -    */</div>
<div class="ContentPasted0">> -   BUILD_BUG_ON(I915_CACHE_NONE != 0 ||</div>
<div class="ContentPasted0">> -              I915_CACHE_LLC != 1 ||</div>
<div class="ContentPasted0">> -              I915_CACHE_L3_LLC != 2 ||</div>
<div class="ContentPasted0">> -              I915_CACHE_WT != 3 ||</div>
<div class="ContentPasted0">> -              I915_MAX_CACHE_LEVEL != 4);</div>
<div class="ContentPasted0">> -</div>
<div class="ContentPasted0">>     /* We need to fallback to 4K pages if host doesn't support huge gtt. */</div>
<div class="ContentPasted0">>     if (intel_vgpu_active(dev_priv) && !intel_vgpu_has_huge_gtt(dev_priv))</div>
<div class="ContentPasted0">>           RUNTIME_INFO(dev_priv)->page_sizes = I915_GTT_PAGE_SIZE_4K;</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c</div>
<div class="ContentPasted0">> index 4749f99e6320..fad336a45699 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_gpu_error.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c</div>
<div class="ContentPasted0">> @@ -1122,14 +1122,11 @@ i915_vma_coredump_create(const struct intel_gt *gt,</div>
<div class="ContentPasted0">>                 mutex_lock(&ggtt->error_mutex);</div>
<div class="ContentPasted0">>                 if (ggtt->vm.raw_insert_page)</div>
<div class="ContentPasted0">>                       ggtt->vm.raw_insert_page(&ggtt->vm, dma, slot,</div>
<div class="ContentPasted0">> -                                        i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> -                                                         I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +                                        ggtt->vm.i915->pat_uc,</div>
<div class="ContentPasted0">>                                          0);</div>
<div class="ContentPasted0">>                 else</div>
<div class="ContentPasted0">>                       ggtt->vm.insert_page(&ggtt->vm, dma, slot,</div>
<div class="ContentPasted0">> -                                      i915_gem_get_pat_index(gt->i915,</div>
<div class="ContentPasted0">> -                                                       I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                                      0);</div>
<div class="ContentPasted0">> +                                      ggtt->vm.i915->pat_uc, 0);</div>
<div class="ContentPasted0">>                 mb();</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>                 s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c</div>
<div class="ContentPasted0">> index 3d7a5db9833b..fbdce31afeb1 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/i915_pci.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/i915_pci.c</div>
<div class="ContentPasted0">> @@ -32,6 +32,7 @@</div>
<div class="ContentPasted0">>  #include "gt/intel_sa_media.h"</div>
<div class="ContentPasted0">>  #include "gem/i915_gem_object_types.h"</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">>  #include "i915_driver.h"</div>
<div class="ContentPasted0">>  #include "i915_drv.h"</div>
<div class="ContentPasted0">>  #include "i915_pci.h"</div>
<div class="ContentPasted0">> @@ -46,36 +47,42 @@ __diag_ignore_all("-Woverride-init", "Allow overriding inherited members");</div>
<div class="ContentPasted0">>     .__runtime.graphics.ip.ver = (x), \</div>
<div class="ContentPasted0">>     .__runtime.media.ip.ver = (x)</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -#define LEGACY_CACHELEVEL \</div>
<div class="ContentPasted0">> -   .cachelevel_to_pat = { \</div>
<div class="ContentPasted0">> -         [I915_CACHE_NONE]   = 0, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_LLC]    = 1, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_L3_LLC] = 2, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_WT]     = 3, \</div>
<div class="ContentPasted0">> +/* TODO/QQQ index 1 & 2 */</div>
<div class="ContentPasted0">> +#define LEGACY_CACHE_MODES \</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">I was thinking of just putting the PAT settings here instead of cache_modes, simply:</div>
<div class="ContentPasted0">      .pat = {\</div>
<div class="ContentPasted0">            GEN8_PPAT_WB, \</div>
<div class="ContentPasted0">            GEN8_PPAT_WC, \</div>
<div class="ContentPasted0">            GEN8_PPAT_WT, \</div>
<div class="ContentPasted0">            GEN8_PPAT_UC, \</div>
<div class="ContentPasted0">      }</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +   .cache_modes = { \</div>
<div class="ContentPasted0">> +         [0] = I915_CACHE(UC), \</div>
<div class="ContentPasted0">> +         [1] = I915_CACHE(WB), \</div>
<div class="ContentPasted0">> +         [2] = _I915_CACHE(WB, L3), \</div>
<div class="ContentPasted0">> +         [3] = I915_CACHE(WT), \</div>
<div class="ContentPasted0">>     }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -#define TGL_CACHELEVEL \</div>
<div class="ContentPasted0">> -   .cachelevel_to_pat = { \</div>
<div class="ContentPasted0">> -         [I915_CACHE_NONE]   = 3, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_LLC]    = 0, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_L3_LLC] = 0, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_WT]     = 2, \</div>
<div class="ContentPasted0">> +#define GEN12_CACHE_MODES \</div>
<div class="ContentPasted0">> +   .cache_modes = { \</div>
<div class="ContentPasted0">> +         [0] = I915_CACHE(WB), \</div>
<div class="ContentPasted0">> +         [1] = I915_CACHE(WC), \</div>
<div class="ContentPasted0">> +         [2] = I915_CACHE(WT), \</div>
<div class="ContentPasted0">> +         [3] = I915_CACHE(UC), \</div>
<div class="ContentPasted0">>     }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -#define PVC_CACHELEVEL \</div>
<div class="ContentPasted0">> -   .cachelevel_to_pat = { \</div>
<div class="ContentPasted0">> -         [I915_CACHE_NONE]   = 0, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_LLC]    = 3, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_L3_LLC] = 3, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_WT]     = 2, \</div>
<div class="ContentPasted0">> +#define PVC_CACHE_MODES \</div>
<div class="ContentPasted0">> +   .cache_modes = { \</div>
<div class="ContentPasted0">> +         [0] = I915_CACHE(UC), \</div>
<div class="ContentPasted0">> +         [1] = I915_CACHE(WC), \</div>
<div class="ContentPasted0">> +         [2] = I915_CACHE(WT), \</div>
<div class="ContentPasted0">> +         [3] = I915_CACHE(WB), \</div>
<div class="ContentPasted0">> +         [4] = _I915_CACHE(WT, CLOS1), \</div>
<div class="ContentPasted0">> +         [5] = _I915_CACHE(WB, CLOS1), \</div>
<div class="ContentPasted0">> +         [6] = _I915_CACHE(WT, CLOS2), \</div>
<div class="ContentPasted0">> +         [7] = _I915_CACHE(WB, CLOS2), \</div>
<div class="ContentPasted0">>     }</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">      .pat = {\</div>
<div class="ContentPasted0">            GEN8_PPAT_UC, \</div>
<div class="ContentPasted0">            GEN8_PPAT_WC, \</div>
<div class="ContentPasted0">            GEN8_PPAT_WT, \</div>
<div class="ContentPasted0">            GEN8_PPAT_WB, \</div>
<div class="ContentPasted0">            GEN12_PPAT_CLOS(1) | GEN8_PPAT_WT, \</div>
<div class="ContentPasted0">            GEN12_PPAT_CLOS(1) | GEN8_PPAT_WB, \</div>
<div class="ContentPasted0">            GEN12_PPAT_CLOS(2) | GEN8_PPAT_WT, \</div>
<div class="ContentPasted0">            GEN12_PPAT_CLOS(2) | GEN8_PPAT_WB, \</div>
<div class="ContentPasted0">      }</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -#define MTL_CACHELEVEL \</div>
<div class="ContentPasted0">> -   .cachelevel_to_pat = { \</div>
<div class="ContentPasted0">> -         [I915_CACHE_NONE]   = 2, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_LLC]    = 3, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_L3_LLC] = 3, \</div>
<div class="ContentPasted0">> -         [I915_CACHE_WT]     = 1, \</div>
<div class="ContentPasted0">> +#define MTL_CACHE_MODES \</div>
<div class="ContentPasted0">> +   .cache_modes = { \</div>
<div class="ContentPasted0">> +         [0] = I915_CACHE(WB), \</div>
<div class="ContentPasted0">> +         [1] = I915_CACHE(WT), \</div>
<div class="ContentPasted0">> +         [2] = I915_CACHE(UC), \</div>
<div class="ContentPasted0">> +         [3] = _I915_CACHE(WB, COH1W), \</div>
<div class="ContentPasted0">> +         [4] = _I915_CACHE(WB, COH2W), \</div>
<div class="ContentPasted0">>     }</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">      .pat = {\</div>
<div class="ContentPasted0">            MTL_PPAT_L4_0_WB, \</div>
<div class="ContentPasted0">            MTL_PPAT_L4_1_WT, \</div>
<div class="ContentPasted0">            MTL_PPAT_L4_3_UC, \</div>
<div class="ContentPasted0">            MTL_PPAT_L4_0_WB | MTL_2_COH_1W, \</div>
<div class="ContentPasted0">            MTL_PPAT_L4_0_WB | MTL_3_COH_2W, \</div>
<div class="ContentPasted0">      }</div>
<div><br class="ContentPasted0">
</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">>  /* Keep in gen based order, and chronological order within a gen */</div>
<div class="ContentPasted0">> @@ -100,7 +107,7 @@ __diag_ignore_all("-Woverride-init", "Allow overriding inherited members");</div>
<div class="ContentPasted0">>     .max_pat_index = 3, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #define I845_FEATURES \</div>
<div class="ContentPasted0">>     GEN(2), \</div>
<div class="ContentPasted0">> @@ -115,7 +122,7 @@ __diag_ignore_all("-Woverride-init", "Allow overriding inherited members");</div>
<div class="ContentPasted0">>     .max_pat_index = 3, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  static const struct intel_device_info i830_info = {</div>
<div class="ContentPasted0">>     I830_FEATURES,</div>
<div class="ContentPasted0">> @@ -148,7 +155,7 @@ static const struct intel_device_info i865g_info = {</div>
<div class="ContentPasted0">>     .max_pat_index = 3, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  static const struct intel_device_info i915g_info = {</div>
<div class="ContentPasted0">>     GEN3_FEATURES,</div>
<div class="ContentPasted0">> @@ -211,7 +218,7 @@ static const struct intel_device_info pnv_m_info = {</div>
<div class="ContentPasted0">>     .max_pat_index = 3, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  static const struct intel_device_info i965g_info = {</div>
<div class="ContentPasted0">>     GEN4_FEATURES,</div>
<div class="ContentPasted0">> @@ -255,7 +262,7 @@ static const struct intel_device_info gm45_info = {</div>
<div class="ContentPasted0">>     .max_pat_index = 3, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  static const struct intel_device_info ilk_d_info = {</div>
<div class="ContentPasted0">>     GEN5_FEATURES,</div>
<div class="ContentPasted0">> @@ -285,7 +292,7 @@ static const struct intel_device_info ilk_m_info = {</div>
<div class="ContentPasted0">>     .__runtime.ppgtt_size = 31, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #define SNB_D_PLATFORM \</div>
<div class="ContentPasted0">>     GEN6_FEATURES, \</div>
<div class="ContentPasted0">> @@ -333,7 +340,7 @@ static const struct intel_device_info snb_m_gt2_info = {</div>
<div class="ContentPasted0">>     .__runtime.ppgtt_size = 31, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #define IVB_D_PLATFORM \</div>
<div class="ContentPasted0">>     GEN7_FEATURES, \</div>
<div class="ContentPasted0">> @@ -390,7 +397,7 @@ static const struct intel_device_info vlv_info = {</div>
<div class="ContentPasted0">>     .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0),</div>
<div class="ContentPasted0">>     GEN_DEFAULT_PAGE_SIZES,</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS,</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL,</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  };</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #define G75_FEATURES  \</div>
<div class="ContentPasted0">> @@ -476,7 +483,7 @@ static const struct intel_device_info chv_info = {</div>
<div class="ContentPasted0">>     .has_coherent_ggtt = false,</div>
<div class="ContentPasted0">>     GEN_DEFAULT_PAGE_SIZES,</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS,</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL,</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  };</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #define GEN9_DEFAULT_PAGE_SIZES \</div>
<div class="ContentPasted0">> @@ -539,7 +546,7 @@ static const struct intel_device_info skl_gt4_info = {</div>
<div class="ContentPasted0">>     .max_pat_index = 3, \</div>
<div class="ContentPasted0">>     GEN9_DEFAULT_PAGE_SIZES, \</div>
<div class="ContentPasted0">>     GEN_DEFAULT_REGIONS, \</div>
<div class="ContentPasted0">> -   LEGACY_CACHELEVEL</div>
<div class="ContentPasted0">> +   LEGACY_CACHE_MODES</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  static const struct intel_device_info bxt_info = {</div>
<div class="ContentPasted0">>     GEN9_LP_FEATURES,</div>
<div class="ContentPasted0">> @@ -643,7 +650,7 @@ static const struct intel_device_info jsl_info = {</div>
<div class="ContentPasted0">>  #define GEN12_FEATURES \</div>
<div class="ContentPasted0">>     GEN11_FEATURES, \</div>
<div class="ContentPasted0">>     GEN(12), \</div>
<div class="ContentPasted0">> -   TGL_CACHELEVEL, \</div>
<div class="ContentPasted0">> +   GEN12_CACHE_MODES, \</div>
<div class="ContentPasted0">>     .has_global_mocs = 1, \</div>
<div class="ContentPasted0">>     .has_pxp = 1, \</div>
<div class="ContentPasted0">>     .max_pat_index = 3</div>
<div class="ContentPasted0">> @@ -711,7 +718,7 @@ static const struct intel_device_info adl_p_info = {</div>
<div class="ContentPasted0">>     .__runtime.graphics.ip.ver = 12, \</div>
<div class="ContentPasted0">>     .__runtime.graphics.ip.rel = 50, \</div>
<div class="ContentPasted0">>     XE_HP_PAGE_SIZES, \</div>
<div class="ContentPasted0">> -   TGL_CACHELEVEL, \</div>
<div class="ContentPasted0">> +   GEN12_CACHE_MODES, \</div>
<div class="ContentPasted0">>     .dma_mask_size = 46, \</div>
<div class="ContentPasted0">>     .has_3d_pipeline = 1, \</div>
<div class="ContentPasted0">>     .has_64bit_reloc = 1, \</div>
<div class="ContentPasted0">> @@ -806,7 +813,7 @@ static const struct intel_device_info pvc_info = {</div>
<div class="ContentPasted0">>           BIT(VCS0) |</div>
<div class="ContentPasted0">>           BIT(CCS0) | BIT(CCS1) | BIT(CCS2) | BIT(CCS3),</div>
<div class="ContentPasted0">>     .require_force_probe = 1,</div>
<div class="ContentPasted0">> -   PVC_CACHELEVEL,</div>
<div class="ContentPasted0">> +   PVC_CACHE_MODES</div>
<div class="ContentPasted0">>  };</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  static const struct intel_gt_definition xelpmp_extra_gt[] = {</div>
<div class="ContentPasted0">> @@ -841,7 +848,7 @@ static const struct intel_device_info mtl_info = {</div>
<div class="ContentPasted0">>     .__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,</div>
<div class="ContentPasted0">>     .__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),</div>
<div class="ContentPasted0">>     .require_force_probe = 1,</div>
<div class="ContentPasted0">> -   MTL_CACHELEVEL,</div>
<div class="ContentPasted0">> +   MTL_CACHE_MODES</div>
<div class="ContentPasted0">>  };</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #undef PLATFORM</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h</div>
<div class="ContentPasted0">> index 069291b3bd37..5cbae7c2ee30 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/intel_device_info.h</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/intel_device_info.h</div>
<div class="ContentPasted0">> @@ -27,6 +27,8 @@</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #include <uapi/drm/i915_drm.h></div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> +#include "i915_cache.h"</div>
<div class="ContentPasted0">> +</div>
<div class="ContentPasted0">>  #include "intel_step.h"</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  #include "display/intel_display_device.h"</div>
<div class="ContentPasted0">> @@ -248,8 +250,8 @@ struct intel_device_info {</div>
<div class="ContentPasted0">>      */</div>
<div class="ContentPasted0">>     const struct intel_runtime_info __runtime;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -   u32 cachelevel_to_pat[I915_MAX_CACHE_LEVEL];</div>
<div class="ContentPasted0">> -   u32 max_pat_index;</div>
<div class="ContentPasted0">> +   i915_cache_t cache_modes[9];</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">      u32 pat[16];</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">See https://gfxspecs.intel.com/Predator/Home/Index/63019: the PAT index is a 4-bit field, PAT[3..0], so the table can have up to 16 entries.</div>
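<div class="ContentPasted0">A standalone sketch of how such a table could be used, assuming PAT[3..0] means a 4-bit index and therefore 16 slots. Everything named sketch_* or SKETCH_* is hypothetical, the enum values are placeholders rather than the real GEN8_PPAT_* encodings, and the lookup helper is my own illustration, not something in the patch. The table contents mirror the first .pat example above.</div>
<pre class="ContentPasted0">
```c
/* Placeholder encodings; the real GEN8_PPAT_* values live in the driver. */
enum sketch_ppat {
	SKETCH_PPAT_UC,
	SKETCH_PPAT_WC,
	SKETCH_PPAT_WT,
	SKETCH_PPAT_WB,
};

#define SKETCH_NUM_PAT 16	/* assumed: 4-bit PAT index */

/* Hypothetical stand-in for the relevant part of intel_device_info. */
struct sketch_info {
	unsigned int pat[SKETCH_NUM_PAT];
	unsigned int max_pat_index;
};

static const struct sketch_info sketch_gen12 = {
	.pat = {
		SKETCH_PPAT_WB,
		SKETCH_PPAT_WC,
		SKETCH_PPAT_WT,
		SKETCH_PPAT_UC,
	},
	.max_pat_index = 3,
};

/*
 * Return the first PAT index whose setting equals @want, or -1 if none.
 * Only indices up to max_pat_index are scanned, so the zero-filled unused
 * slots are never mistaken for valid entries.
 */
static int sketch_find_pat(const struct sketch_info *info, unsigned int want)
{
	unsigned int i;

	for (i = 0; i <= info->max_pat_index; i++)
		if (info->pat[i] == want)
			return (int)i;

	return -1;
}
```
</pre>
<div class="ContentPasted0">For this table, looking up SKETCH_PPAT_UC lands on index 3 and SKETCH_PPAT_WB on index 0, matching the GEN12 entries in the patch.</div>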
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">-Fei</div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0">> +   unsigned int max_pat_index;</div>
<div class="ContentPasted0">>  };</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>  struct intel_driver_caps {</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c</div>
<div class="ContentPasted0">> index 61da4ed9d521..e620f73793a5 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/i915_gem.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/i915_gem.c</div>
<div class="ContentPasted0">> @@ -57,10 +57,7 @@ static void trash_stolen(struct drm_i915_private *i915)</div>
<div class="ContentPasted0">>           u32 __iomem *s;</div>
<div class="ContentPasted0">>           int x;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -         ggtt->vm.insert_page(&ggtt->vm, dma, slot,</div>
<div class="ContentPasted0">> -                          i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> -                                           I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                          0);</div>
<div class="ContentPasted0">> +         ggtt->vm.insert_page(&ggtt->vm, dma, slot, i915->pat_uc, 0);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           s = io_mapping_map_atomic_wc(&ggtt->iomap, slot);</div>
<div class="ContentPasted0">>           for (x = 0; x < PAGE_SIZE / sizeof(u32); x++) {</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c</div>
<div class="ContentPasted0">> index f8fe3681c3dc..658a5b59545e 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c</div>
<div class="ContentPasted0">> @@ -246,7 +246,7 @@ static int igt_evict_for_cache_color(void *arg)</div>
<div class="ContentPasted0">>     struct drm_mm_node target = {</div>
<div class="ContentPasted0">>           .start = I915_GTT_PAGE_SIZE * 2,</div>
<div class="ContentPasted0">>           .size = I915_GTT_PAGE_SIZE,</div>
<div class="ContentPasted0">> -         .color = i915_gem_get_pat_index(gt->i915, I915_CACHE_LLC),</div>
<div class="ContentPasted0">> +         .color = I915_CACHE(WB),</div>
<div class="ContentPasted0">>     };</div>
<div class="ContentPasted0">>     struct drm_i915_gem_object *obj;</div>
<div class="ContentPasted0">>     struct i915_vma *vma;</div>
<div class="ContentPasted0">> @@ -309,7 +309,7 @@ static int igt_evict_for_cache_color(void *arg)</div>
<div class="ContentPasted0">>     /* Attempt to remove the first *pinned* vma, by removing the (empty)</div>
<div class="ContentPasted0">>      * neighbour -- this should fail.</div>
<div class="ContentPasted0">>      */</div>
<div class="ContentPasted0">> -   target.color = i915_gem_get_pat_index(gt->i915, I915_CACHE_L3_LLC);</div>
<div class="ContentPasted0">> +   target.color = _I915_CACHE(WB, LLC);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     mutex_lock(&ggtt->vm.mutex);</div>
<div class="ContentPasted0">>     err = i915_gem_evict_for_node(&ggtt->vm, NULL, &target, 0);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c</div>
<div class="ContentPasted0">> index 5c397a2df70e..a24585784f75 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c</div>
<div class="ContentPasted0">> @@ -135,7 +135,7 @@ fake_dma_object(struct drm_i915_private *i915, u64 size)</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     obj->write_domain = I915_GEM_DOMAIN_CPU;</div>
<div class="ContentPasted0">>     obj->read_domains = I915_GEM_DOMAIN_CPU;</div>
<div class="ContentPasted0">> -   obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);</div>
<div class="ContentPasted0">> +   obj->pat_index = i915->pat_uc;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     /* Preallocate the "backing storage" */</div>
<div class="ContentPasted0">>     if (i915_gem_object_pin_pages_unlocked(obj))</div>
<div class="ContentPasted0">> @@ -358,10 +358,8 @@ static int lowlevel_hole(struct i915_address_space *vm,</div>
<div class="ContentPasted0">>                 mock_vma_res->start = addr;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>                 with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)</div>
<div class="ContentPasted0">> -                 vm->insert_entries(vm, mock_vma_res,</div>
<div class="ContentPasted0">> -                                i915_gem_get_pat_index(vm->i915,</div>
<div class="ContentPasted0">> -                                                 I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                                0);</div>
<div class="ContentPasted0">> +                     vm->insert_entries(vm, mock_vma_res,</div>
<div class="ContentPasted0">> +                                    vm->i915->pat_uc, 0);</div>
<div class="ContentPasted0">>           }</div>
<div class="ContentPasted0">>           count = n;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> @@ -1379,10 +1377,7 @@ static int igt_ggtt_page(void *arg)</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>           ggtt->vm.insert_page(&ggtt->vm,</div>
<div class="ContentPasted0">>                            i915_gem_object_get_dma_address(obj, 0),</div>
<div class="ContentPasted0">> -                          offset,</div>
<div class="ContentPasted0">> -                          i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> -                                           I915_CACHE_NONE),</div>
<div class="ContentPasted0">> -                          0);</div>
<div class="ContentPasted0">> +                          offset, ggtt->vm.i915->pat_uc, 0);</div>
<div class="ContentPasted0">>     }</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     order = i915_random_order(count, &prng);</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c</div>
<div class="ContentPasted0">> index d985d9bae2e8..b82fe0ef8cd7 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c</div>
<div class="ContentPasted0">> @@ -1070,9 +1070,7 @@ static int igt_lmem_write_cpu(void *arg)</div>
<div class="ContentPasted0">>     /* Put the pages into a known state -- from the gpu for added fun */</div>
<div class="ContentPasted0">>     intel_engine_pm_get(engine);</div>
<div class="ContentPasted0">>     err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,</div>
<div class="ContentPasted0">> -                             obj->mm.pages->sgl,</div>
<div class="ContentPasted0">> -                             i915_gem_get_pat_index(i915,</div>
<div class="ContentPasted0">> -                                              I915_CACHE_NONE),</div>
<div class="ContentPasted0">> +                             obj->mm.pages->sgl, i915->pat_uc,</div>
<div class="ContentPasted0">>                               true, 0xdeadbeaf, &rq);</div>
<div class="ContentPasted0">>     if (rq) {</div>
<div class="ContentPasted0">>           dma_resv_add_fence(obj->base.resv, &rq->fence,</div>
<div class="ContentPasted0">> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c</div>
<div class="ContentPasted0">> index 09d4bbcdcdbf..ad778842cba2 100644</div>
<div class="ContentPasted0">> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c</div>
<div class="ContentPasted0">> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c</div>
<div class="ContentPasted0">> @@ -126,7 +126,12 @@ struct drm_i915_private *mock_gem_device(void)</div>
<div class="ContentPasted0">>     struct drm_i915_private *i915;</div>
<div class="ContentPasted0">>     struct intel_device_info *i915_info;</div>
<div class="ContentPasted0">>     struct pci_dev *pdev;</div>
<div class="ContentPasted0">> -   unsigned int i;</div>
<div class="ContentPasted0">> +   static const i915_cache_t legacy_cache_modes[] = {</div>
<div class="ContentPasted0">> +         [0] = I915_CACHE(UC),</div>
<div class="ContentPasted0">> +         [1] = I915_CACHE(WB),</div>
<div class="ContentPasted0">> +         [2] = _I915_CACHE(WB, L3),</div>
<div class="ContentPasted0">> +         [3] = I915_CACHE(WT),</div>
<div class="ContentPasted0">> +   };</div>
<div class="ContentPasted0">>     int ret;</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     pdev = kzalloc(sizeof(*pdev), GFP_KERNEL);</div>
<div class="ContentPasted0">> @@ -187,8 +192,7 @@ struct drm_i915_private *mock_gem_device(void)</div>
<div class="ContentPasted0">>     /* simply use legacy cache level for mock device */</div>
<div class="ContentPasted0">>     i915_info = (struct intel_device_info *)INTEL_INFO(i915);</div>
<div class="ContentPasted0">>     i915_info->max_pat_index = 3;</div>
<div class="ContentPasted0">> -   for (i = 0; i < I915_MAX_CACHE_LEVEL; i++)</div>
<div class="ContentPasted0">> -         i915_info->cachelevel_to_pat[i] = i;</div>
<div class="ContentPasted0">> +   memcpy(i915_info->cache_modes, legacy_cache_modes, sizeof(legacy_cache_modes));</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">>     intel_memory_regions_hw_probe(i915);</div>
<div class="ContentPasted0">>  </div>
<div class="ContentPasted0">> -- </div>
<div class="ContentPasted0">> 2.39.2</div>
<br>
</div>
</body>
</html>