[Intel-gfx] [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU

Thu Dec 10 05:29:09 PST 2015

On Wed, Dec 09, 2015 at 03:52:51PM +0000, Dave Gordon wrote:
> In various places, a single page of a (regular) GEM object is mapped into
> CPU address space and updated. In each such case, either the page or the
> object should be marked dirty, to ensure that the modifications are
> not discarded if the object is evicted under memory pressure.
> 
> The typical sequence is:
> 	va = kmap_atomic(i915_gem_object_get_page(obj, pageno));
> 	*(va+offset) = ...
> 	kunmap_atomic(va);
> 
> Here we introduce i915_gem_object_get_dirty_page(), which performs the
> same operation as i915_gem_object_get_page() but with the side-effect
> of marking the returned page dirty in the pagecache.  This will ensure
> that if the object is subsequently evicted (due to memory pressure),
> the changes are written to backing store rather than discarded.
> 
> Note that it works only for regular (shmfs-backed) GEM objects, but (at
> least for now) those are the only ones that are updated in this way --
> the objects in question are contexts and batchbuffers, which are always
> shmfs-backed.
> 
> A separate patch deals with the case where whole objects are (or may
> be) dirtied.
> 
> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>

I like this. There are places where we do both obj->dirty and
set_page_dirty(), but this shows much more clearly what is going on.
All of these locations should be infrequent (or at least have patches to
make them so), so moving the call out-of-line will also be a benefit.
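
To illustrate why the dirty marking matters, here is a minimal userspace
sketch. It is not the kernel code: mock_page, mock_object,
mock_get_page(), mock_get_dirty_page() and mock_evict() are hypothetical
stand-ins invented for this example, and eviction is simplified to
"clean pages lose their contents, dirty pages keep them".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096
#define MOCK_NPAGES 4

/* Hypothetical stand-ins; NOT the kernel's struct page or GEM object. */
struct mock_page {
	bool dirty;
	unsigned char data[MOCK_PAGE_SIZE];
};

struct mock_object {
	struct mock_page pages[MOCK_NPAGES];
};

/* Plain lookup: returns page n without touching its dirty flag. */
static struct mock_page *mock_get_page(struct mock_object *obj, size_t n)
{
	assert(n < MOCK_NPAGES);
	return &obj->pages[n];
}

/* Same lookup, with the side effect of marking the page dirty. */
static struct mock_page *mock_get_dirty_page(struct mock_object *obj, size_t n)
{
	struct mock_page *page = mock_get_page(obj, n);

	page->dirty = true;
	return page;
}

/*
 * Simplified eviction: a clean page is assumed to match backing store,
 * so its contents are dropped (zeroed here); a dirty page is treated as
 * written back and keeps its contents. Returns the number of pages
 * whose contents were discarded.
 */
static size_t mock_evict(struct mock_object *obj)
{
	size_t discarded = 0;
	size_t i;

	for (i = 0; i < MOCK_NPAGES; i++) {
		if (!obj->pages[i].dirty) {
			memset(obj->pages[i].data, 0, MOCK_PAGE_SIZE);
			discarded++;
		}
	}
	return discarded;
}
```

In this model, a CPU write made through mock_get_page() is lost by
mock_evict(), while the same write made through mock_get_dirty_page()
survives, which is the failure the patch is guarding against.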

>  /* Allocate a new GEM object and fill it with the supplied data */
>  struct drm_i915_gem_object *
>  i915_gem_object_create_from_data(struct drm_device *dev,
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index a4c243c..81796cc 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -264,7 +264,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj,
>  	if (ret)
>  		return ret;
>  
> -	vaddr = kmap_atomic(i915_gem_object_get_page(obj,
> +	vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj,
>  				reloc->offset >> PAGE_SHIFT));
>  	*(uint32_t *)(vaddr + page_offset) = lower_32_bits(delta);
>  
> @@ -355,7 +355,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj,
>  	if (ret)
>  		return ret;
>  
> -	vaddr = kmap_atomic(i915_gem_object_get_page(obj,
> +	vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj,
>  				reloc->offset >> PAGE_SHIFT));
>  	clflush_write32(vaddr + page_offset, lower_32_bits(delta));
>  

The relocation functions may dirty pairs of pages. Other than that, I
think you have the right mix of callsites.
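
A rough sketch of the page-pair case, assuming a relocation value wider
than 32 bits is written at an arbitrary byte offset and 4KiB pages. The
helper names (pages_touched(), write_straddles_pages()) and the
SKETCH_PAGE_SHIFT constant are made up for illustration and are not
driver functions.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4KiB pages */

/* First and last page index covered by a write. */
struct page_span {
	uint64_t first;
	uint64_t last;
};

/* Compute which pages a write of len bytes at byte offset touches. */
static struct page_span pages_touched(uint64_t offset, uint32_t len)
{
	struct page_span span;

	assert(len > 0);
	span.first = offset >> SKETCH_PAGE_SHIFT;
	span.last = (offset + len - 1) >> SKETCH_PAGE_SHIFT;
	return span;
}

/* Nonzero if the write crosses a page boundary. */
static int write_straddles_pages(uint64_t offset, uint32_t len)
{
	struct page_span span = pages_touched(offset, len);

	return span.first != span.last;
}
```

For example, an 8-byte write at offset 4092 covers bytes 4092..4099 and
so touches pages 0 and 1; in that case both pages would need to be
marked dirty, not just the one containing reloc->offset.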
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre