[Intel-gfx] [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU

Thu Dec 10 09:24:42 PST 2015

On 10/12/15 13:29, Chris Wilson wrote:
> On Wed, Dec 09, 2015 at 03:52:51PM +0000, Dave Gordon wrote:
>> In various places, a single page of a (regular) GEM object is mapped into
>> CPU address space and updated. In each such case, either the page or the
>> the object should be marked dirty, to ensure that the modifications are
>> not discarded if the object is evicted under memory pressure.
>>
>> The typical sequence is:
>> 	va = kmap_atomic(i915_gem_object_get_page(obj, pageno));
>> 	*(va+offset) = ...
>> 	kunmap_atomic(va);
>>
>> Here we introduce i915_gem_object_get_dirty_page(), which performs the
>> same operation as i915_gem_object_get_page() but with the side-effect
>> of marking the returned page dirty in the pagecache.  This will ensure
>> that if the object is subsequently evicted (due to memory pressure),
>> the changes are written to backing store rather than discarded.
>>
>> Note that it works only for regular (shmfs-backed) GEM objects, but (at
>> least for now) those are the only ones that are updated in this way --
>> the objects in question are contexts and batchbuffers, which are always
>> shmfs-backed.
>>
>> A separate patch deals with the case where whole objects are (or may
>> be) dirtied.
>>
>> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>
> I like this. There are places were we do both obj->dirty and
> set_page_dirty(), but this so much more clearly shows what is going on.
> All of these locations should be infrequent (or at least have patches to
> make them so), so moving the call out-of-line will also be a benefit.

I think there was only one place that both called set_page_dirty() AND 
set obj->dirty, which was in populate_lr_context(). You'll see that I've 
eliminated both in favour of a call to get_dirty_page() :)

>>   /* Allocate a new GEM object and fill it with the supplied data */
>>   struct drm_i915_gem_object *
>>   i915_gem_object_create_from_data(struct drm_device *dev,
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index a4c243c..81796cc 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -264,7 +264,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj,
>>   	if (ret)
>>   		return ret;
>>
>> -	vaddr = kmap_atomic(i915_gem_object_get_page(obj,
>> +	vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj,
>>   				reloc->offset >> PAGE_SHIFT));
>>   	*(uint32_t *)(vaddr + page_offset) = lower_32_bits(delta);
>>
>> @@ -355,7 +355,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj,
>>   	if (ret)
>>   		return ret;
>>
>> -	vaddr = kmap_atomic(i915_gem_object_get_page(obj,
>> +	vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj,
>>   				reloc->offset >> PAGE_SHIFT));
>>   	clflush_write32(vaddr + page_offset, lower_32_bits(delta));
>>
>
> The relocation functions may dirty pairs of pages. Other than that, I
> think you have the right mix of callsites.
> -Chris

Thanks, I've added the other two to the next (v3) version :)

.Dave.