[Intel-gfx] [PATCH 01/11] drm/i915: Add support for mapping an object page by page

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Jan 15 01:53:46 PST 2016

On 14/01/16 13:34, Chris Wilson wrote:
> On Thu, Jan 14, 2016 at 10:32:11AM +0000, Tvrtko Ursulin wrote:
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>>> index b448ad8..5f86596 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>>> @@ -317,6 +317,11 @@ struct i915_address_space {
>>>   			    uint64_t start,
>>>   			    uint64_t length,
>>>   			    bool use_scratch);
>>> +	void (*insert_page)(struct i915_address_space *vm,
>>> +			    dma_addr_t addr,
>>> +			    uint64_t offset,
>>> +			    enum i915_cache_level cache_level,
>>> +			    u32 flags);
>>>   	void (*insert_entries)(struct i915_address_space *vm,
>>>   			       struct sg_table *st,
>>>   			       uint64_t start,
>> Why not extend the current API to support start page offset and
>> number of pages? Could default to full object like today if zero.
>> Eg:
>>   void (*insert_entries)(struct i915_address_space *vm,
>> 			struct sg_table *st,
>> +			unsigned page_offset,
>> +			unsigned num_pages,
> Ouch. That would be quite slow for the insert_page() use case of
> page-by-page looping.

It could use the same last page caching trick so why would be be so slow?

>> 			uint64_t start,
>> 			enum i915_cache_level cache_level,
>> 			u32 flags);
>> That way we would not have two functions for effectively the same
>> thing operating on different type of input parameters.
>> If extending insert_entries is not preferable, then still we could
>> add another compatible one, like insert_entries_range or something,
>> and then both could share the same underlying implementation for
>> less code.
>> Like this, this patch already does not match current codebase - see
>> assert_rpm_atomic_begin in insert_entries.
>> Also if API between the two was compatible there would be no need
>> for i915_gem_object_get_dma_address() and i915_gem_object_get_page()
>> could be used instead.
> The point was to write a lowlevel analog to provide a complementary API
> to insert_entries that could be used for everywhere the we wanted to peek
> through the GTT without even touching an object, i.e. for cases where we
> might allocate a scratch page and temporarily put it into the GTT.

Well maybe it is not that bad on the API front, but code duplication 
would still not be my first choice, as demonstrated by the divergence 
which already happened. Need to think about it some more.



More information about the Intel-gfx mailing list