[Intel-gfx] [RFC 7/8] drm/i915: Lift the user PAT restriction from use_cpu_reloc

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Jul 28 12:45:03 UTC 2023


On 28/07/2023 01:09, Matt Roper wrote:
> On Thu, Jul 27, 2023 at 03:55:03PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> Now that i915 understands the caching modes behind PAT indices, we can
>> refine the check in use_cpu_reloc() to not reject the uncached PAT if it
>> was set by userspace.
>>
>> Instead it can decide based on the presence of full coherency which
>> should be functionally equivalent on legacy platforms. We can ignore WT
>> since it is only used by the display, and we can ignore Meteorlake since
>> it will fail on the existing "has_llc" condition before the object cache
>> mode check.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Cc: Fei Yang <fei.yang at intel.com>
>> Cc: Matt Roper <matthew.d.roper at intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 9 +--------
>>   1 file changed, 1 insertion(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> index 9d6e49c8a4c6..f74b33670bad 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> @@ -640,16 +640,9 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,
>>   	if (DBG_FORCE_RELOC == FORCE_GTT_RELOC)
>>   		return false;
>>   
>> -	/*
>> -	 * For objects created by userspace through GEM_CREATE with pat_index
>> -	 * set by set_pat extension, i915_gem_object_has_cache_level() always
>> -	 * return true, otherwise the call would fall back to checking whether
>> -	 * the object is un-cached.
>> -	 */
>>   	return (cache->has_llc ||
>>   		obj->cache_dirty ||
>> -		!(obj->pat_set_by_user ||
>> -		  i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC)));
>> +		i915_gem_object_has_cache_flag(obj, I915_CACHE_FLAG_COH2W));
> 
> My understanding of relocations is minimal, but does 2W actually matter
> here (CPU snooping GPU caches)?  I would have expected only 1W coherency
> to be necessary (GPU snooping CPU caches)?

I struggled with this one. Original code was:

         return (cache->has_llc ||
                 obj->cache_dirty ||
                 obj->cache_level != I915_CACHE_NONE);

And I struggled to figure out the intent. It is not "don't do CPU 
relocations for uncached" because it will do them when LLC or dirty 
regardless.

You could be right.. can we interpret it as any mode apart from uncached 
was viewed as coherent for CPU writes being seen by the GPU?

In which case should/could it be based on I915_BO_CACHE_COHERENT_FOR_WRITE?

Regards,

Tvrtko

> 
> 
> Matt
> 
>>   }
>>   
>>   static int eb_reserve_vma(struct i915_execbuffer *eb,
>> -- 
>> 2.39.2
>>
> 


More information about the Intel-gfx mailing list