[Intel-gfx] [PATCH v4 2/7] drm/i915: Use batch pools with the command parser

Chris Wilson chris at chris-wilson.co.uk
Wed Nov 12 10:49:19 CET 2014


On Fri, Nov 07, 2014 at 02:22:02PM -0800, bradley.d.volkin at intel.com wrote:
> From: Brad Volkin <bradley.d.volkin at intel.com>
> 
> This patch sets up all of the tracking and copying necessary to
> use batch pools with the command parser and dispatches the copied
> (shadow) batch to the hardware.
> 
> After this patch, the parser is in 'enabling' mode.
> 
> Note that performance takes a hit from the copy in some cases
> and will likely need some work. At a rough pass, the memcpy
> appears to be the bottleneck. Without having done a deeper
> analysis, two ideas that come to mind are:
> 1) Copy sections of the batch at a time, as they are reached
>    by parsing. Might improve cache locality.
> 2) Copy only up to the userspace-supplied batch length and
>    memset the rest of the buffer. Reduces the number of reads.
> 
> v2:
> - Remove setting the capacity of the pool
> - One global pool instead of per-ring pools
> - Replace batch_obj with shadow_batch_obj and hook into eb->vmas
> - Memset any space in the shadow batch beyond what gets copied
> - Rebased on execlist prep refactoring
> 
> v3:
> - Rebase on chained batch handling
> - Squash in setting the secure dispatch flag
> - Add a note about the interaction w/secure dispatch pinning
> - Check for request->batch_obj == NULL in i915_gem_free_request
> 
> v4:
> - Fix read domains for shadow_batch_obj
> - Remove the set_to_gtt_domain call from i915_parse_cmds
> - ggtt_pin/unpin in the parser block to simplify error handling
> - Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag
> - Remove i915_gem_batch_pool_put calls
> 
> Signed-off-by: Brad Volkin <bradley.d.volkin at intel.com>
> ---
>  drivers/gpu/drm/i915/i915_cmd_parser.c     | 79 +++++++++++++++++++++++-------
>  drivers/gpu/drm/i915/i915_dma.c            |  1 +
>  drivers/gpu/drm/i915/i915_drv.h            |  8 +++
>  drivers/gpu/drm/i915/i915_gem.c            |  2 +
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 49 ++++++++++++++++--
>  5 files changed, 117 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 809bb95..5a3f4e4 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -838,6 +838,56 @@ finish:
>  	return (u32*)addr;
>  }
>  
> +/* Returns a vmap'd pointer to dest_obj, which the caller must unmap */
> +static u32 *copy_batch(struct drm_i915_gem_object *dest_obj,
> +		       struct drm_i915_gem_object *src_obj)
> +{
> +	int ret = 0;
> +	int needs_clflush = 0;
> +	u32 *src_addr, *dest_addr = NULL;
> +
> +	ret = i915_gem_obj_prepare_shmem_read(src_obj, &needs_clflush);
> +	if (ret) {
> +		DRM_DEBUG_DRIVER("CMD: failed to prep read\n");
> +		return ERR_PTR(ret);
> +	}
> +
> +	src_addr = vmap_batch(src_obj);
> +	if (!src_addr) {
> +		DRM_DEBUG_DRIVER("CMD: Failed to vmap batch\n");
> +		ret = -ENOMEM;
> +		goto unpin_src;
> +	}
> +
> +	if (needs_clflush)
> +		drm_clflush_virt_range((char *)src_addr, src_obj->base.size);
> +
> +	ret = i915_gem_object_set_to_cpu_domain(dest_obj, true);
> +	if (ret) {
> +		DRM_DEBUG_DRIVER("CMD: Failed to set batch CPU domain\n");
> +		goto unmap_src;
> +	}
> +
> +	dest_addr = vmap_batch(dest_obj);

Considering that this does a straightforward copy (if we ignore the
highly dubious memset), any chance of a kmap_atomic(); copy_page();
loop? (Modulo first/last pages which may not be aligned.)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre



More information about the Intel-gfx mailing list