[Intel-gfx] [PATCH 3/3] drm/i915: Use insert_page for pwrite_fast

Chris Wilson chris at chris-wilson.co.uk
Tue Nov 10 00:44:59 PST 2015


On Tue, Nov 10, 2015 at 09:55:18AM +0200, Mika Kuoppala wrote:
> ankitprasad.r.sharma at intel.com writes:
> 
> > From: Ankitprasad Sharma <ankitprasad.r.sharma at intel.com>
> >
> > In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First,
> > we try a nonblocking pin for the whole object (since that is fastest if
> > reused), then failing that we try to grab one page in the mappable
> > aperture. It also allows us to handle objects larger than the mappable
> > aperture (e.g. if we need to pwrite with vGPU restricting the aperture
> > to a measly 8MiB or something like that).
> >
> > v2: Pin pages before starting pwrite, Combined duplicate loops (Chris)
> >
> > v3: Combined loops based on local patch by Chris (Chris)
> >
> > Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma at intel.com>
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 75 +++++++++++++++++++++++++++++------------
> >  1 file changed, 53 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index f1e3fde..9d2e6e3 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -760,20 +760,33 @@ fast_user_write(struct io_mapping *mapping,
> >   * user into the GTT, uncached.
> >   */
> >  static int
> > -i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> > +i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
> >  			 struct drm_i915_gem_object *obj,
> >  			 struct drm_i915_gem_pwrite *args,
> >  			 struct drm_file *file)
> >  {
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	ssize_t remain;
> > -	loff_t offset, page_base;
> > +	struct drm_mm_node node;
> > +	uint64_t remain, offset;
> >  	char __user *user_data;
> > -	int page_offset, page_length, ret;
> > +	int ret;
> >  
> >  	ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE | PIN_NONBLOCK);
> > -	if (ret)
> > -		goto out;
> > +	if (ret) {
> > +		memset(&node, 0, sizeof(node));
> > +		ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm,
> > +							  &node, 4096, 0,
> > +							  I915_CACHE_NONE, 0,
> > +							  i915->gtt.mappable_end,
> > +							  DRM_MM_SEARCH_DEFAULT,
> > +							  DRM_MM_CREATE_DEFAULT);
> > +		if (ret)
> > +			goto out;
> > +
> > +		i915_gem_object_pin_pages(obj);
> > +	} else {
> > +		node.start = i915_gem_obj_ggtt_offset(obj);
> > +		node.allocated = false;
> > +	}
> >  
> >  	ret = i915_gem_object_set_to_gtt_domain(obj, true);
> >  	if (ret)
> > @@ -783,31 +796,39 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> >  	if (ret)
> >  		goto out_unpin;
> >  
> > -	user_data = to_user_ptr(args->data_ptr);
> > -	remain = args->size;
> > -
> > -	offset = i915_gem_obj_ggtt_offset(obj) + args->offset;
> > -
> >  	intel_fb_obj_invalidate(obj, ORIGIN_GTT);
> > +	obj->dirty = true;
> >  
> > -	while (remain > 0) {
> > +	user_data = to_user_ptr(args->data_ptr);
> > +	offset = args->offset;
> > +	remain = args->size;
> > +	while (remain) {
> >  		/* Operation in this page
> >  		 *
> >  		 * page_base = page offset within aperture
> >  		 * page_offset = offset within page
> >  		 * page_length = bytes to copy for this page
> >  		 */
> > -		page_base = offset & PAGE_MASK;
> > -		page_offset = offset_in_page(offset);
> > -		page_length = remain;
> > -		if ((page_offset + remain) > PAGE_SIZE)
> > -			page_length = PAGE_SIZE - page_offset;
> > -
> > +		u32 page_base = node.start;
> 
> You truncate here as node.start is 64bit offset into the vm area.

It's a bit of a cheat, since node.start can't be 64bit here, but the code
is equally inconsistent. The io-mapping can only handle unsigned long.
-Chris
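
To make the truncation point concrete, here is a minimal userspace sketch,
not code from the patch. It uses the stdint types in place of the kernel's
u32/u64, and the helper names fits_in_u32() and narrow_offset() are
hypothetical. It shows that narrowing is lossless only while the 64-bit
offset stays below 4GiB, which is the property the exchange above relies on
for offsets inside the mappable aperture.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if a 64-bit offset survives
 * narrowing to 32 bits unchanged, 0 if the upper bits would be lost.
 */
static int fits_in_u32(uint64_t start)
{
	return (uint64_t)(uint32_t)start == start;
}

/*
 * Hypothetical helper: narrow a 64-bit offset to 32 bits, asserting
 * that nothing is dropped instead of truncating silently.
 */
static uint32_t narrow_offset(uint64_t start)
{
	assert(fits_in_u32(start));
	return (uint32_t)start;
}
```

An offset such as 0x0ffff000 passes through unchanged, while (1ull << 32) +
4096 would silently become 4096 under a plain cast. A check of this kind is
one way to document the assumption in code.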

-- 
Chris Wilson, Intel Open Source Technology Centre

