[Intel-gfx] [PATCH igt] igt/prime_vgem: Split out the fine-grain coherency check
Chris Wilson
chris at chris-wilson.co.uk
Tue Sep 19 12:21:08 UTC 2017
Quoting Chris Wilson (2017-09-07 15:30:55)
> We don't expect every machine to be able to pass the WC/GTT coherency
> check, see
>
> kernel commit 3b5724d702ef24ee41ca008a1fab1cf94f3d31b5
> Author: Chris Wilson <chris at chris-wilson.co.uk>
> Date: Thu Aug 18 17:16:49 2016 +0100
>
> drm/i915: Wait for writes through the GTT to land before reading back
>
> If we quickly switch from writing through the GTT to a read of the
> physical page directly with the CPU (e.g. performing relocations through
> the GTT and then running the command parser), we can observe that the
> writes are not visible to the CPU. It is not a coherency problem, as
> extensive investigations with clflush have demonstrated, but a mere
> timing issue - we have to wait for the GTT to complete it's write before
> we start our read from the CPU.
>
> The issue can be illustrated in userspace with:
>
> gtt = gem_mmap__gtt(fd, handle, 0, OBJECT_SIZE, PROT_READ | PROT_WRITE);
> cpu = gem_mmap__cpu(fd, handle, 0, OBJECT_SIZE, PROT_READ | PROT_WRITE);
> gem_set_domain(fd, handle, I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
>
> for (i = 0; i < OBJECT_SIZE / 64; i++) {
> int x = 16*i + (i%16);
> gtt[x] = i;
> clflush(&cpu[x], sizeof(cpu[x]));
> assert(cpu[x] == i);
> }
>
> Experimenting with that shows that this behaviour is indeed limited to
> recent Atom-class hardware.
>
> so split out the interleave coherency check from the basic
> interopability check.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102577
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Ping?
> ---
> tests/prime_vgem.c | 36 ++++++++++++++++++++++++++++++++----
> 1 file changed, 32 insertions(+), 4 deletions(-)
>
> diff --git a/tests/prime_vgem.c b/tests/prime_vgem.c
> index 95557ef9..15de7993 100644
> --- a/tests/prime_vgem.c
> +++ b/tests/prime_vgem.c
> @@ -232,18 +232,43 @@ static void test_gtt(int vgem, int i915)
> igt_assert_eq(ptr[1024*i], ~i);
> munmap(ptr, scratch.size);
>
> + gem_close(i915, handle);
> + gem_close(vgem, scratch.handle);
> +}
> +
> +static void test_gtt_interleaved(int vgem, int i915)
> +{
> + struct vgem_bo scratch;
> + uint32_t handle;
> + uint32_t *ptr, *gtt;
> + int dmabuf, i;
> +
> + scratch.width = 1024;
> + scratch.height = 1024;
> + scratch.bpp = 32;
> + vgem_create(vgem, &scratch);
> +
> + dmabuf = prime_handle_to_fd(vgem, scratch.handle);
> + handle = prime_fd_to_handle(i915, dmabuf);
> + close(dmabuf);
> +
> + /* This assumes that GTT is perfectedly coherent. On certain machines,
> + * it is possible for a direct acces to bypass the GTT indirection.
> + *
> + * This test may fail. It tells us how far userspace can trust
> + * concurrent dmabuf/i915 access.
> + */
> ptr = vgem_mmap(vgem, &scratch, PROT_WRITE);
> gtt = gem_mmap__gtt(i915, handle, scratch.size, PROT_WRITE);
> -#if defined(__x86_64__)
> for (i = 0; i < 1024; i++) {
> gtt[1024*i] = i;
> - __builtin_ia32_sfence();
> + /* The read from WC should act as a flush for the GTT wcb */
> igt_assert_eq(ptr[1024*i], i);
> +
> ptr[1024*i] = ~i;
> - __builtin_ia32_sfence();
> + /* The read from GTT should act as a flush for the WC wcb */
> igt_assert_eq(gtt[1024*i], ~i);
> }
> -#endif
> munmap(gtt, scratch.size);
> munmap(ptr, scratch.size);
>
> @@ -753,6 +778,9 @@ igt_main
> igt_subtest("basic-gtt")
> test_gtt(vgem, i915);
>
> + igt_subtest("coherency-gtt")
> + test_gtt_interleaved(vgem, i915);
> +
> for (e = intel_execution_engines; e->name; e++) {
> igt_subtest_f("%ssync-%s",
> e->exec_id == 0 ? "basic-" : "",
> --
> 2.14.1
>
More information about the Intel-gfx
mailing list