[Intel-gfx] [PATCH 01/15] drm/i915: Add i915_gem_object_write() to i915_gem.c

Thu Jun 18 11:28:26 PDT 2015

On 18/06/15 15:31, Daniel Vetter wrote:
> On Thu, Jun 18, 2015 at 12:49:55PM +0100, Dave Gordon wrote:
>> On 17/06/15 13:02, Daniel Vetter wrote:
>>> On Wed, Jun 17, 2015 at 08:23:40AM +0100, Dave Gordon wrote:
>>>> On 15/06/15 21:09, Chris Wilson wrote:
>>>>> On Mon, Jun 15, 2015 at 07:36:19PM +0100, Dave Gordon wrote:
>>>>>> From: Alex Dai <yu.dai at intel.com>
>>>>>>
>>>>>> i915_gem_object_write() is a generic function to copy data from a plain
>>>>>> linear buffer to a paged gem object.
>>>>>>
>>>>>> We will need this for the microcontroller firmware loading support code.
>>>>>>
>>>>>> Issue: VIZ-4884
>>>>>> Signed-off-by: Alex Dai <yu.dai at intel.com>
>>>>>> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
>>>>>> ---
>>>>>>  drivers/gpu/drm/i915/i915_drv.h |    2 ++
>>>>>>  drivers/gpu/drm/i915/i915_gem.c |   28 ++++++++++++++++++++++++++++
>>>>>>  2 files changed, 30 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> index 611fbd8..9094c06 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> @@ -2713,6 +2713,8 @@ void *i915_gem_object_alloc(struct drm_device *dev);
>>>>>>  void i915_gem_object_free(struct drm_i915_gem_object *obj);
>>>>>>  void i915_gem_object_init(struct drm_i915_gem_object *obj,
>>>>>>  			 const struct drm_i915_gem_object_ops *ops);
>>>>>> +int i915_gem_object_write(struct drm_i915_gem_object *obj,
>>>>>> +			  const void *data, size_t size);
>>>>>>  struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
>>>>>>  						  size_t size);
>>>>>>  void i915_init_vm(struct drm_i915_private *dev_priv,
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>>>> index be35f04..75d63c2 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>>>> @@ -5392,3 +5392,31 @@ bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj)
>>>>>>  	return false;
>>>>>>  }
>>>>>>  
>>>>>> +/* Fill the @obj with the @size amount of @data */
>>>>>> +int i915_gem_object_write(struct drm_i915_gem_object *obj,
>>>>>> +			const void *data, size_t size)
>>>>>> +{
>>>>>> +	struct sg_table *sg;
>>>>>> +	size_t bytes;
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	ret = i915_gem_object_get_pages(obj);
>>>>>> +	if (ret)
>>>>>> +		return ret;
>>>>>> +
>>>>>> +	i915_gem_object_pin_pages(obj);
>>>>>
>>>>> You don't set the object into the CPU domain, or instead manually handle
>>>>> the domain flushing. You don't handle objects that cannot be written
>>>>> directly by the CPU, nor do you handle objects whose representation in
>>>>> memory is not linear.
>>>>> -Chris
>>>>
>>>> No we don't handle just any random gem object, but we do return an error
>>>> code for any types not supported. However, as we don't really need the
>>>> full generality of writing into a gem object of any type, I will replace
>>>> this function with one that combines the allocation of a new object
>>>> (which will therefore definitely be of the correct type, in the correct
>>>> domain, etc) and filling it with the data to be preserved.
>>
>> The usage pattern for the particular case is going to be:
>> 	Once-only:
>> 		Allocate
>> 		Fill
>> 	Then each time GuC is (re-)initialised:
>> 		Map to GTT
>> 		DMA-read from buffer into GuC private memory
>> 		Unmap
>> 	Only on unload:
>> 		Dispose
>>
>> So our object is write-once by the CPU (and that's always the first
>> operation), thereafter read-occasionally by the GuC's DMA engine.
> 
> Yup. The problem is more that on atom platforms the objects aren't
> coherent by default and generally you need to do something. Hence we
> either have
> - an explicit set_caching call to document that this is a gpu object which
>   is always coherent (so also on chv/bxt), even when that's a no-op on big
>   core
> - or wrap everything in set_domain calls, even when those are no-ops too.
> 
> If either of those lack, reviews tend to freak out preemptively and the
> reptil brain takes over ;-)
> 
> Cheers, Daniel

We don't need "coherency" as such. The buffer is filled (once only) by
the CPU (so I should put a set-to-cpu-domain between the allocate and
fill stages?) Once it's filled, the CPU need not read or write it ever
again.

Then before the DMA engine accesses it, we call i915_gem_obj_ggtt_pin,
which I'm assuming will take care of any coherency issues (making sure
the data written by the CPU is now visible to the DMA engine) when it
puts the buffer into the GTT-readable domain. Is that not sufficient?

.Dave.