[Intel-gfx] [PATCH v2 10/37] drm/i915/blt: support copying objects
Chris Wilson
chris at chris-wilson.co.uk
Thu Jun 27 23:35:27 UTC 2019
Quoting Matthew Auld (2019-06-27 21:56:06)
> We can already clear an object with the blt, so try to do the same to
> support copying from one object backing store to another. Really this is
> just object -> object, which is not that useful yet, what we really want
> is two backing stores, but that will require some vma rework first,
> otherwise we are stuck with "tmp" objects.
>
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue at linux.intel.com>
> ---
> .../gpu/drm/i915/gem/i915_gem_object_blt.c | 135 ++++++++++++++++++
> .../gpu/drm/i915/gem/i915_gem_object_blt.h | 8 ++
> .../i915/gem/selftests/i915_gem_object_blt.c | 105 ++++++++++++++
> drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 3 +-
> 4 files changed, 250 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> index cb42e3a312e2..c2b28e06c379 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> @@ -102,6 +102,141 @@ int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
> return err;
> }
>
> +int intel_emit_vma_copy_blt(struct i915_request *rq,
> + struct i915_vma *src,
> + struct i915_vma *dst)
> +{
> + const int gen = INTEL_GEN(rq->i915);
> + u32 *cs;
> +
> + GEM_BUG_ON(src->size != dst->size);
For a low-level interface, I would suggest a little over-engineering:
take src_offset, dst_offset and length. For bonus points, make it 2D,
though I accept that may be too much over-engineering without a user.
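
A rough sketch of what such a range-based emitter might look like, written
as plain userspace C rather than kernel code. Everything here is an
assumption for illustration: emit_copy_range() and its parameters are
hypothetical names, the two command macros are stand-in values (the real
definitions live in intel_gpu_commands.h), and the 16-bit row limit is
inferred from the 32k-pages remark later in this mail. The dword layout
simply mirrors the gen9 branch of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Stand-in values for illustration only; treat them as assumptions. */
#define GEN9_XY_FAST_COPY_BLT_CMD (2u << 29 | 0x42u << 22)
#define BLT_DEPTH_32              (3u << 24)

/*
 * Hypothetical helper: write the ten gen9 copy dwords for a sub-range.
 * src_base/dst_base stand in for src->node.start/dst->node.start.
 * Returns the advanced command pointer.
 */
static uint32_t *emit_copy_range(uint32_t *cs,
				 uint64_t src_base, uint64_t src_offset,
				 uint64_t dst_base, uint64_t dst_offset,
				 uint64_t length)
{
	uint64_t src = src_base + src_offset;
	uint64_t dst = dst_base + dst_offset;
	uint32_t rows = (uint32_t)(length >> PAGE_SHIFT);

	/* Same page-granular layout as the patch: one page per row. */
	assert(length && !(length & (PAGE_SIZE - 1)));
	assert(!(src & (PAGE_SIZE - 1)) && !(dst & (PAGE_SIZE - 1)));
	/* Assumed limit: the row count is packed into a 16-bit field. */
	assert(rows <= INT16_MAX);

	*cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
	*cs++ = BLT_DEPTH_32 | PAGE_SIZE;
	*cs++ = 0;
	*cs++ = rows << 16 | PAGE_SIZE / 4;
	*cs++ = (uint32_t)dst;		/* lower 32 bits of destination */
	*cs++ = (uint32_t)(dst >> 32);	/* upper 32 bits of destination */
	*cs++ = 0;
	*cs++ = PAGE_SIZE;
	*cs++ = (uint32_t)src;		/* lower 32 bits of source */
	*cs++ = (uint32_t)(src >> 32);	/* upper 32 bits of source */

	return cs;
}
```

With this shape the whole-object copy becomes the special case of
offsets 0 and length == size, so the size-equality check can move up to
the caller.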
> + cs = intel_ring_begin(rq, 10);
> + if (IS_ERR(cs))
> + return PTR_ERR(cs);
> +
> + if (gen >= 9) {
> + *cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10-2);
> + *cs++ = BLT_DEPTH_32 | PAGE_SIZE;
> + *cs++ = 0;
> + *cs++ = src->size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> + *cs++ = lower_32_bits(dst->node.start);
> + *cs++ = upper_32_bits(dst->node.start);
> + *cs++ = 0;
> + *cs++ = PAGE_SIZE;
> + *cs++ = lower_32_bits(src->node.start);
> + *cs++ = upper_32_bits(src->node.start);
Reminds me that we didn't fix the earlier routines to handle more than
32k pages either. Please add a test case :)
-Chris
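
One way the 32k-page limit could be handled is to split a large copy
into several blits. The sketch below is plain userspace C, not the
driver's code: split_copy(), struct blt_chunk and MAX_BLT_PAGES are
hypothetical names, and the limit of 32767 pages per command is an
assumption based on the row count being packed into a 16-bit field.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT    12
#define MAX_BLT_PAGES INT16_MAX	/* assumed per-command row limit */

struct blt_chunk {
	uint64_t offset;	/* byte offset into both objects */
	uint64_t length;	/* bytes copied by this command */
};

/*
 * Hypothetical: split [0, size) into chunks of at most MAX_BLT_PAGES
 * pages. Fills at most max_chunks entries of out[] and returns the
 * number of chunks the whole copy needs.
 */
static unsigned int split_copy(uint64_t size, struct blt_chunk *out,
			       unsigned int max_chunks)
{
	const uint64_t max_len = (uint64_t)MAX_BLT_PAGES << PAGE_SHIFT;
	uint64_t offset = 0;
	unsigned int n = 0;

	/* The copy is page-granular, as in the patch. */
	assert(!(size & ((1u << PAGE_SHIFT) - 1)));

	while (offset < size) {
		uint64_t len = size - offset;

		if (len > max_len)
			len = max_len;

		if (n < max_chunks) {
			out[n].offset = offset;
			out[n].length = len;
		}
		n++;
		offset += len;
	}

	return n;
}
```

A selftest along the lines requested above would then copy an object
larger than MAX_BLT_PAGES pages and compare the contents on both sides
of each chunk boundary.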