[Intel-gfx] [PATCH 01/15] drm/i915: Add i915_gem_object_write() to i915_gem.c
Dave Gordon
david.s.gordon at intel.com
Thu Jun 25 05:28:38 PDT 2015
On 24/06/15 10:32, Daniel Vetter wrote:
> On Thu, Jun 18, 2015 at 07:28:26PM +0100, Dave Gordon wrote:
>> On 18/06/15 15:31, Daniel Vetter wrote:
>>> On Thu, Jun 18, 2015 at 12:49:55PM +0100, Dave Gordon wrote:
>>>> On 17/06/15 13:02, Daniel Vetter wrote:
>>>>> On Wed, Jun 17, 2015 at 08:23:40AM +0100, Dave Gordon wrote:
>>>>>> On 15/06/15 21:09, Chris Wilson wrote:
>>>>>>> On Mon, Jun 15, 2015 at 07:36:19PM +0100, Dave Gordon wrote:
>>>>>>>> From: Alex Dai <yu.dai at intel.com>
>>>>>>>>
>>>>>>>> i915_gem_object_write() is a generic function to copy data from a plain
>>>>>>>> linear buffer to a paged gem object.
>>>>>>>>
>>>>>>>> We will need this for the microcontroller firmware loading support code.
>>>>>>>>
>>>>>>>> Issue: VIZ-4884
>>>>>>>> Signed-off-by: Alex Dai <yu.dai at intel.com>
>>>>>>>> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
>>>>>>>> ---
>>>>>>>> drivers/gpu/drm/i915/i915_drv.h | 2 ++
>>>>>>>> drivers/gpu/drm/i915/i915_gem.c | 28 ++++++++++++++++++++++++++++
>>>>>>>> 2 files changed, 30 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> index 611fbd8..9094c06 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> @@ -2713,6 +2713,8 @@ void *i915_gem_object_alloc(struct drm_device *dev);
>>>>>>>> void i915_gem_object_free(struct drm_i915_gem_object *obj);
>>>>>>>> void i915_gem_object_init(struct drm_i915_gem_object *obj,
>>>>>>>> const struct drm_i915_gem_object_ops *ops);
>>>>>>>> +int i915_gem_object_write(struct drm_i915_gem_object *obj,
>>>>>>>> + const void *data, size_t size);
>>>>>>>> struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
>>>>>>>> size_t size);
>>>>>>>> void i915_init_vm(struct drm_i915_private *dev_priv,
>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>>>>>> index be35f04..75d63c2 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>>>>>> @@ -5392,3 +5392,31 @@ bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj)
>>>>>>>> return false;
>>>>>>>> }
>>>>>>>>
>>>>>>>> +/* Fill @obj with @size bytes of @data */
>>>>>>>> +int i915_gem_object_write(struct drm_i915_gem_object *obj,
>>>>>>>> + const void *data, size_t size)
>>>>>>>> +{
>>>>>>>> + struct sg_table *sg;
>>>>>>>> + size_t bytes;
>>>>>>>> + int ret;
>>>>>>>> +
>>>>>>>> + ret = i915_gem_object_get_pages(obj);
>>>>>>>> + if (ret)
>>>>>>>> + return ret;
>>>>>>>> +
>>>>>>>> + i915_gem_object_pin_pages(obj);
>>>>>>>
>>>>>>> You don't set the object into the CPU domain, or instead manually handle
>>>>>>> the domain flushing. You don't handle objects that cannot be written
>>>>>>> directly by the CPU, nor do you handle objects whose representation in
>>>>>>> memory is not linear.
>>>>>>> -Chris
>>>>>>
>>>>>> No, we don't handle just any random gem object, but we do return an error
>>>>>> code for any types not supported. However, as we don't really need the
>>>>>> full generality of writing into a gem object of any type, I will replace
>>>>>> this function with one that combines the allocation of a new object
>>>>>> (which will therefore definitely be of the correct type, in the correct
>>>>>> domain, etc) and filling it with the data to be preserved.
>>>>
>>>> The usage pattern for the particular case is going to be:
>>>>
>>>> Once-only:
>>>>     Allocate
>>>>     Fill
>>>> Then each time GuC is (re-)initialised:
>>>>     Map to GTT
>>>>     DMA-read from buffer into GuC private memory
>>>>     Unmap
>>>> Only on unload:
>>>>     Dispose
>>>>
>>>> So our object is write-once by the CPU (and that's always the first
>>>> operation), thereafter read-occasionally by the GuC's DMA engine.
>>>
>>> Yup. The problem is more that on atom platforms the objects aren't
>>> coherent by default and generally you need to do something. Hence we
>>> either have
>>> - an explicit set_caching call to document that this is a gpu object which
>>> is always coherent (so also on chv/bxt), even when that's a no-op on big
>>> core
>>> - or wrap everything in set_domain calls, even when those are no-ops too.
>>>
>>> If either of those is lacking, reviewers tend to freak out preemptively
>>> and the reptile brain takes over ;-)
>>>
>>> Cheers, Daniel
>>
>> We don't need "coherency" as such. The buffer is filled (once only) by
>> the CPU (so I should put a set-to-cpu-domain between the allocate and
>> fill stages?) Once it's filled, the CPU need not read or write it ever
>> again.
>>
>> Then before the DMA engine accesses it, we call i915_gem_obj_ggtt_pin,
>> which I'm assuming will take care of any coherency issues (making sure
>> the data written by the CPU is now visible to the DMA engine) when it
>> puts the buffer into the GTT-readable domain. Is that not sufficient?
>
> Pinning is orthogonal to coherency, it'll just make sure the backing
> storage is there. set_domain(CPU) before writing and set_domain(GTT)
> before each upload to the guc using the hw copy thing would be prudent.
> The coherency tracking should no-op out any calls which aren't needed for
> you.
> -Daniel
OK, done; I had already added set_domain(CPU) in the allocate-and-fill
code, now I've just added i915_gem_object_set_to_gtt_domain(obj, false)
in the dma-to-h/w code.
.Dave.