[Intel-gfx] [PATCH 5/8] drm/i915: Wait for writes through the GTT to land before reading back
Joonas Lahtinen
joonas.lahtinen at linux.intel.com
Thu Jun 9 12:54:44 UTC 2016
On Thu, 2016-06-09 at 12:29 +0100, Chris Wilson wrote:
> If we quickly switch from writing through the GTT to a read of the
> physical page directly with the CPU (e.g. performing relocations through
> the GTT and then running the command parser), we can observe that the
> writes are not visible to the CPU. It is not a coherency problem, as
> extensive investigations with clflush have demonstrated, but a mere
> timing issue - we have to wait for the GTT to complete its write before
> we start our read from the CPU.
>
> The issue can be illustrated in userspace with:
>
> gtt = gem_mmap__gtt(fd, handle, 0, OBJECT_SIZE, PROT_READ | PROT_WRITE);
> cpu = gem_mmap__cpu(fd, handle, 0, OBJECT_SIZE, PROT_READ | PROT_WRITE);
> gem_set_domain(fd, handle, I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
>
> for (i = 0; i < OBJECT_SIZE / 64; i++) {
>         int x = 16*i + (i%16);
>         gtt[x] = i;
>         clflush(&cpu[x], sizeof(cpu[x]));
>         assert(cpu[x] == i);
> }
>
> Experimenting with that shows that this behaviour is indeed limited to
> recent Atom-class hardware.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 18b4a684ddde..ffe3d3e9d69d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2898,20 +2898,30 @@ i915_gem_clflush_object(struct drm_i915_gem_object *obj,
> static void
> i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj)
> {
> + struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
> uint32_t old_write_domain;
>
> if (obj->base.write_domain != I915_GEM_DOMAIN_GTT)
> return;
>
> /* No actual flushing is required for the GTT write domain. Writes
> - * to it immediately go to main memory as far as we know, so there's
> + * to it "immediately" go to main memory as far as we know, so there's
> * no chipset flush. It also doesn't land in render cache.
> *
> * However, we do have to enforce the order so that all writes through
> * the GTT land before any writes to the device, such as updates to
> * the GATT itself.
> + *
> + * We also have to wait a bit for the writes to land from the GTT.
> + * An uncached read (i.e. mmio) seems to be ideal for the round-trip
> + * timing. This issue has only been observed when switching quickly
> + * between GTT writes and CPU reads from inside the kernel on recent hw,
> + * and it appears to only affect discrete GTT blocks (i.e. on LLC
> + * system agents we cannot reproduce this behaviour).

This screams for a Tested-by: tag before merging...

> */
> wmb();
> + if (INTEL_INFO(dev_priv)->gen >= 6 && !HAS_LLC(dev_priv))

Use INTEL_GEN() here.

With this fixed and the Testcase: label added,
Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>

> + POSTING_READ(RING_ACTHD(dev_priv->engine[RCS].mmio_base));
>
> old_write_domain = obj->base.write_domain;
> obj->base.write_domain = 0;
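
For reference, the condition guarding the posting read can be sketched in plain C as below. This is a userspace model under stated assumptions, not the driver code: struct fake_device_info and needs_gtt_posting_read() are hypothetical names, and the gen field merely stands in for whatever INTEL_GEN() would report for the device.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the device info the driver consults.
 * Only the two properties used by the check are modelled.
 */
struct fake_device_info {
	int gen;       /* hardware generation, as INTEL_GEN() would report */
	bool has_llc;  /* true when the GPU shares the CPU last-level cache */
};

/*
 * Hypothetical helper mirroring the condition in the patch: the extra
 * uncached read is only wanted on gen6+ parts without an LLC, which is
 * where the delayed GTT writes were observed.
 */
static bool needs_gtt_posting_read(const struct fake_device_info *info)
{
	return info->gen >= 6 && !info->has_llc;
}
```

In the driver itself the wmb() stays unconditional; only the mmio read is gated by this check.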
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation