[Intel-gfx] [PATCH 01/15] drm/i915: Add i915_gem_object_write() to i915_gem.c

Wed Jun 24 02:32:16 PDT 2015

On Thu, Jun 18, 2015 at 07:28:26PM +0100, Dave Gordon wrote:
> On 18/06/15 15:31, Daniel Vetter wrote:
> > On Thu, Jun 18, 2015 at 12:49:55PM +0100, Dave Gordon wrote:
> >> On 17/06/15 13:02, Daniel Vetter wrote:
> >>> On Wed, Jun 17, 2015 at 08:23:40AM +0100, Dave Gordon wrote:
> >>>> On 15/06/15 21:09, Chris Wilson wrote:
> >>>>> On Mon, Jun 15, 2015 at 07:36:19PM +0100, Dave Gordon wrote:
> >>>>>> From: Alex Dai <yu.dai at intel.com>
> >>>>>>
> >>>>>> i915_gem_object_write() is a generic function to copy data from a plain
> >>>>>> linear buffer to a paged gem object.
> >>>>>>
> >>>>>> We will need this for the microcontroller firmware loading support code.
> >>>>>>
> >>>>>> Issue: VIZ-4884
> >>>>>> Signed-off-by: Alex Dai <yu.dai at intel.com>
> >>>>>> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
> >>>>>> ---
> >>>>>>  drivers/gpu/drm/i915/i915_drv.h |    2 ++
> >>>>>>  drivers/gpu/drm/i915/i915_gem.c |   28 ++++++++++++++++++++++++++++
> >>>>>>  2 files changed, 30 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>>>>> index 611fbd8..9094c06 100644
> >>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>>>>> @@ -2713,6 +2713,8 @@ void *i915_gem_object_alloc(struct drm_device *dev);
> >>>>>>  void i915_gem_object_free(struct drm_i915_gem_object *obj);
> >>>>>>  void i915_gem_object_init(struct drm_i915_gem_object *obj,
> >>>>>>  			 const struct drm_i915_gem_object_ops *ops);
> >>>>>> +int i915_gem_object_write(struct drm_i915_gem_object *obj,
> >>>>>> +			  const void *data, size_t size);
> >>>>>>  struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
> >>>>>>  						  size_t size);
> >>>>>>  void i915_init_vm(struct drm_i915_private *dev_priv,
> >>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>>>>> index be35f04..75d63c2 100644
> >>>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
> >>>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >>>>>> @@ -5392,3 +5392,31 @@ bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj)
> >>>>>>  	return false;
> >>>>>>  }
> >>>>>>  
> >>>>>> +/* Fill the @obj with the @size amount of @data */
> >>>>>> +int i915_gem_object_write(struct drm_i915_gem_object *obj,
> >>>>>> +			const void *data, size_t size)
> >>>>>> +{
> >>>>>> +	struct sg_table *sg;
> >>>>>> +	size_t bytes;
> >>>>>> +	int ret;
> >>>>>> +
> >>>>>> +	ret = i915_gem_object_get_pages(obj);
> >>>>>> +	if (ret)
> >>>>>> +		return ret;
> >>>>>> +
> >>>>>> +	i915_gem_object_pin_pages(obj);
> >>>>>
> >>>>> You don't set the object into the CPU domain, or instead manually handle
> >>>>> the domain flushing. You don't handle objects that cannot be written
> >>>>> directly by the CPU, nor do you handle objects whose representation in
> >>>>> memory is not linear.
> >>>>> -Chris
> >>>>
> >>>> No we don't handle just any random gem object, but we do return an error
> >>>> code for any types not supported. However, as we don't really need the
> >>>> full generality of writing into a gem object of any type, I will replace
> >>>> this function with one that combines the allocation of a new object
> >>>> (which will therefore definitely be of the correct type, in the correct
> >>>> domain, etc) and filling it with the data to be preserved.
> >>
> >> The usage pattern for the particular case is going to be:
> >> 	Once-only:
> >> 		Allocate
> >> 		Fill
> >> 	Then each time GuC is (re-)initialised:
> >> 		Map to GTT
> >> 		DMA-read from buffer into GuC private memory
> >> 		Unmap
> >> 	Only on unload:
> >> 		Dispose
> >>
> >> So our object is write-once by the CPU (and that's always the first
> >> operation), thereafter read-occasionally by the GuC's DMA engine.
> > 
> > Yup. The problem is more that on atom platforms the objects aren't
> > coherent by default and generally you need to do something. Hence we
> > either have
> > - an explicit set_caching call to document that this is a gpu object which
> >   is always coherent (so also on chv/bxt), even when that's a no-op on big
> >   core
> > - or wrap everything in set_domain calls, even when those are no-ops too.
> > 
> > If either of those lack, reviews tend to freak out preemptively and the
> > reptil brain takes over ;-)
> > 
> > Cheers, Daniel
> 
> We don't need "coherency" as such. The buffer is filled (once only) by
> the CPU (so I should put a set-to-cpu-domain between the allocate and
> fill stages?) Once it's filled, the CPU need not read or write it ever
> again.
> 
> Then before the DMA engine accesses it, we call i915_gem_obj_ggtt_pin,
> which I'm assuming will take care of any coherency issues (making sure
> the data written by the CPU is now visible to the DMA engine) when it
> puts the buffer into the GTT-readable domain. Is that not sufficient?

Pinning is orthogonal to coherency, it'll just make sure the backing
storage is there. set_domain(CPU) before writing and set_domain(GTT)
before each upload to the guc using the hw copy thing would be prudent.
The coherency tracking should no-op out any calls which aren't needed for
you.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch