[PATCH] guc-wb-copy

Wed Jan 24 10:26:25 UTC 2018

Hi All,

There is potential issue with using cached mapping for GuC log buffer as 
explained below by Akash some time back.
Hence I was suggesting that we stay with uncached but fallback to 
memcpy. Looking at the problem description below,
I have another idea, Can we keep log buffer state page as uncached and 
others as cached.

Akash's analysis below: Protocol is GuC updates write pointer 
continuously and Host updates read pointer

/***************************************************************************************************/
Coming to the problem (very remote possibility of occurring), which can 
come if from the Host side Cached mapping is used to read the GuC log 
buffer.
If cached Mapping is used then KMD will have to appropriately do 
invalidation & flush operations on accessing the log buffer.

struct guc_log_buffer_state {
        u32 marker[2];
        u32 read_ptr;
        u32 write_ptr;
        u32 size;
        u32 sampled_write_ptr;
        union {
              struct {
                     u32 flush_to_file:1;
                     u32 buffer_full_cnt:4;
                     u32 reserved:27;
              };
              u32 flags;
        };
        u32 version;
} __packed;

KMD is supposed to update the read_ptr value with sampled_write_ptr value.
Actually there could be a problem with manual flushing on KMD side 
since, as per the above structure, the write_ptr & read_ptr could be on 
the samecacheline.
There is a potential race condition, when write_ptr (suppose of ISR) 
could get updated by the GuC firmware around the same time as KMD is 
reading the sampled_write_ptr value & updating the read_ptr with that.
In that case, the flush done by KMD, post update of read_ptr, could 
overwrite the update made by GuC firmware, since write_ptr & read_ptr 
would be on the samecacheline.

For example, suppose the DPC buffer got half full and GuC sent a flush 
interrupt to Host. Now KMD will update the read_ptr for all 3 types.
Since ISR buffer was not half full, GuC firmware will continue to update 
the corresponding write pointer, whilst KMD is sampling the Log buffer.

Using Uncached mapping will help avoid this scenario.

/*************************************************************************************************************/

On 1/23/2018 2:34 PM, Chris Wilson wrote:
> Cc: Changbin Du <changbin.du at intel.com>
> Cc: Sagar Arun Kamble <sagar.a.kamble at intel.com>
> ---
> Is what you were meant to write.
> -Chris
> ---
>   drivers/gpu/drm/i915/intel_guc_log.c | 33 +++++++++++++++++++--------------
>   1 file changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c
> index 2ffc966aa196..da57045e6cbd 100644
> --- a/drivers/gpu/drm/i915/intel_guc_log.c
> +++ b/drivers/gpu/drm/i915/intel_guc_log.c
> @@ -250,6 +250,14 @@ static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
>   	return 0;
>   }
>   
> +static void guc_copy_log_buffer(void *dst, void *src, unsigned long len)
> +{
> +	if (!i915_memcpy_from_wc(dst, src, len)) {
> +		drm_clflush_virt_range(src, len);
> +		memcpy(dst, src, len);
> +	}
> +}
> +
>   static void guc_read_update_log_buffer(struct intel_guc *guc)
>   {
>   	unsigned int buffer_size, read_offset, write_offset, bytes_to_copy, full_cnt;
> @@ -323,12 +331,12 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
>   
>   		/* Just copy the newly written data */
>   		if (read_offset > write_offset) {
> -			i915_memcpy_from_wc(dst_data, src_data, write_offset);
> +			guc_copy_log_buffer(dst_data, src_data, write_offset);
>   			bytes_to_copy = buffer_size - read_offset;
>   		} else {
>   			bytes_to_copy = write_offset - read_offset;
>   		}
> -		i915_memcpy_from_wc(dst_data + read_offset,
> +		guc_copy_log_buffer(dst_data + read_offset,
>   				    src_data + read_offset, bytes_to_copy);
>   
>   		src_data += buffer_size;
> @@ -365,6 +373,7 @@ static int guc_log_runtime_create(struct intel_guc *guc)
>   	void *vaddr;
>   	struct rchan *guc_log_relay_chan;
>   	size_t n_subbufs, subbuf_size;
> +	enum i915_map_type type;
>   	int ret;
>   
>   	lockdep_assert_held(&dev_priv->drm.struct_mutex);
> @@ -375,11 +384,16 @@ static int guc_log_runtime_create(struct intel_guc *guc)
>   	if (ret)
>   		return ret;
>   
> -	/* Create a WC (Uncached for read) vmalloc mapping of log
> +	/*
> +	 * Try create a WC (Uncached for read) vmalloc mapping of log
>   	 * buffer pages, so that we can directly get the data
> -	 * (up-to-date) from memory.
> +	 * (up-to-date) from memory. Or else we will fallback to using
> +	 * clflush and a WB mapping.
>   	 */
> -	vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WC);
> +	type = I915_MAP_WC;
> +	if (!i915_has_memcpy_from_wc())
> +		type = I915_MAP_WB;
> +	vaddr = i915_gem_object_pin_map(guc->log.vma->obj, type);
>   	if (IS_ERR(vaddr)) {
>   		DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
>   		return PTR_ERR(vaddr);
> @@ -520,15 +534,6 @@ int intel_guc_log_create(struct intel_guc *guc)
>   		GUC_LOG_ISR_PAGES + 1 +
>   		GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
>   
> -	/* We require SSE 4.1 for fast reads from the GuC log buffer and
> -	 * it should be present on the chipsets supporting GuC based
> -	 * submisssions.
> -	 */
> -	if (WARN_ON(!i915_has_memcpy_from_wc())) {
> -		ret = -EINVAL;
> -		goto err;
> -	}
> -
>   	vma = intel_guc_allocate_vma(guc, size);
>   	if (IS_ERR(vma)) {
>   		ret = PTR_ERR(vma);

-- 
Thanks,
Sagar

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx-trybot/attachments/20180124/572ddb31/attachment-0001.html>