[Intel-gfx] [PATCH] drm/i915/mtl: workaround coherency issue for Media

Fri Apr 21 07:57:26 UTC 2023

On 4/20/2023 11:30 PM, Matt Roper wrote:
> On Thu, Apr 20, 2023 at 01:38:59PM +0200, Nirmoy Das wrote:
>> From: Fei Yang <fei.yang at intel.com>
>>
>> This patch implements Wa_22016122933.
>>
>> In MTL, memory writes initiated by Media tile update the whole
>> cache line even for partial writes. This creates a coherency
>> problem for cacheable memory if both CPU and GPU are writing data
>> to different locations within a single cache line. CTB communication
>> is impacted by this issue because the head and tail pointers are
>> adjacent words within a cache line (see struct guc_ct_buffer_desc),
>> where one is written by GuC and the other by the host.
>> This patch circumvents the issue by making CPU/GPU shared memory
>> uncacheable (WC on CPU side, and PAT index 2 for GPU). Also for
>> CTB which is being updated by both CPU and GuC, mfence instruction
>> is added to make sure the CPU writes are visible to GPU right away
>> (flush the write combining buffer).
> I posted a note about the commit message here on the original series
> about an hour ago:
>
> https://lore.kernel.org/intel-gfx/20230420205238.GA4085390@mdroper-desk1.amr.corp.intel.com/
>
> Patch itself looks fine, I just think the last sentence above should be
> simplified to avoid inaccuracy.

Thanks for your review, Matt. I will resend with that fixed.

Nirmoy

>
> Matt
>
>> While fixing the CTB issue, we noticed some random GSC firmware
>> loading failure because the share buffers are cacheable (WB) on CPU
>> side but uncached on GPU side. To fix these issues we need to map
>> such shared buffers as WC on CPU side. Since such allocations are
>> not all done through GuC allocator, to avoid too many code changes,
>> the i915_coherent_map_type() is now hard coded to return WC for MTL.
>>
>> BSpec: 45101
>>
>> Signed-off-by: Fei Yang <fei.yang at intel.com>
>> Reviewed-by: Andi Shyti <andi.shyti at linux.intel.com>
>> Acked-by: Nirmoy Das <nirmoy.das at intel.com>
>> Reviewed-by: Andrzej Hajda <andrzej.hajda at intel.com>
>> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_pages.c |  5 ++++-
>>   drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +++++++++++++
>>   drivers/gpu/drm/i915/gt/uc/intel_guc.c    |  7 +++++++
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  6 ++++++
>>   4 files changed, 30 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
>> index ecd86130b74f..89fc8ea6bcfc 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
>> @@ -469,7 +469,10 @@ enum i915_map_type i915_coherent_map_type(struct drm_i915_private *i915,
>>   					  struct drm_i915_gem_object *obj,
>>   					  bool always_coherent)
>>   {
>> -	if (i915_gem_object_is_lmem(obj))
>> +	/*
>> +	 * Wa_22016122933: always return I915_MAP_WC for MTL
>> +	 */
>> +	if (i915_gem_object_is_lmem(obj) || IS_METEORLAKE(i915))
>>   		return I915_MAP_WC;
>>   	if (HAS_LLC(i915) || always_coherent)
>>   		return I915_MAP_WB;
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
>> index 1d9fdfb11268..236673c02f9a 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
>> @@ -110,6 +110,13 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
>>   	if (obj->base.size < gsc->fw.size)
>>   		return -ENOSPC;
>>   
>> +	/*
>> +	 * Wa_22016122933: For MTL the shared memory needs to be mapped
>> +	 * as WC on CPU side and UC (PAT index 2) on GPU side
>> +	 */
>> +	if (IS_METEORLAKE(i915))
>> +		i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
>> +
>>   	dst = i915_gem_object_pin_map_unlocked(obj,
>>   					       i915_coherent_map_type(i915, obj, true));
>>   	if (IS_ERR(dst))
>> @@ -125,6 +132,12 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
>>   	memset(dst, 0, obj->base.size);
>>   	memcpy(dst, src, gsc->fw.size);
>>   
>> +	/*
>> +	 * Wa_22016122933: Making sure the data in dst is
>> +	 * visible to GSC right away
>> +	 */
>> +	intel_guc_write_barrier(&gt->uc.guc);
>> +
>>   	i915_gem_object_unpin_map(gsc->fw.obj);
>>   	i915_gem_object_unpin_map(obj);
>>   
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>> index e89f16ecf1ae..c9f20385f6a0 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>> @@ -744,6 +744,13 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc *guc, u32 size)
>>   	if (IS_ERR(obj))
>>   		return ERR_CAST(obj);
>>   
>> +	/*
>> +	 * Wa_22016122933: For MTL the shared memory needs to be mapped
>> +	 * as WC on CPU side and UC (PAT index 2) on GPU side
>> +	 */
>> +	if (IS_METEORLAKE(gt->i915))
>> +		i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
>> +
>>   	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
>>   	if (IS_ERR(vma))
>>   		goto err;
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> index 1803a633ed64..99a0a89091e7 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> @@ -902,6 +902,12 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>>   	/* now update descriptor */
>>   	WRITE_ONCE(desc->head, head);
>>   
>> +	/*
>> +	 * Wa_22016122933: Making sure the head update is
>> +	 * visible to GuC right away
>> +	 */
>> +	intel_guc_write_barrier(ct_to_guc(ct));
>> +
>>   	return available - len;
>>   
>>   corrupted:
>> -- 
>> 2.39.0
>>