[Intel-gfx] [i-g-t PATCH v6 1/4] lib: add igt_dummyload

Tue Nov 15 11:08:30 UTC 2016

On 15 November 2016 at 11:59, Tomeu Vizoso <tomeu at tomeuvizoso.net> wrote:
> On 14 November 2016 at 19:24, Abdiel Janulgue
> <abdiel.janulgue at linux.intel.com> wrote:
>> A lot of igt testcases need some GPU workload to make sure a race
>> window is big enough. Unfortunately having a fixed amount of
>> workload leads to spurious test failures or overtly long runtimes
>> on some fast/slow platforms. This library contains functionality
>> to submit GPU workloads that should consume exactly a specific
>> amount of time.
>>
>> v2 : Add recursive batch feature from Chris
>> v3 : Drop auto-tuned stuff. Add bo dependecy to recursive batch
>>      by adding a dummy reloc to the bo as suggested by Ville.
>> v4:  Fix dependency reloc as write instead of read (Ville).
>>      Fix wrong handling of batchbuffer start on ILK causing
>>      test failure
>> v5:  Convert kms_busy to use this api
>> v6:  Add this library to docs
>>
>> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Signed-off-by: Abdiel Janulgue <abdiel.janulgue at linux.intel.com>
>> ---
>>  .../intel-gpu-tools/intel-gpu-tools-docs.xml       |   1 +
>>  lib/Makefile.sources                               |   2 +
>>  lib/igt.h                                          |   1 +
>>  lib/igt_dummyload.c                                | 276 +++++++++++++++++++++
>>  lib/igt_dummyload.h                                |  42 ++++
>>  5 files changed, 322 insertions(+)
>>  create mode 100644 lib/igt_dummyload.c
>>  create mode 100644 lib/igt_dummyload.h
>>
>> diff --git a/docs/reference/intel-gpu-tools/intel-gpu-tools-docs.xml b/docs/reference/intel-gpu-tools/intel-gpu-tools-docs.xml
>> index c862f2a..55902ab 100644
>> --- a/docs/reference/intel-gpu-tools/intel-gpu-tools-docs.xml
>> +++ b/docs/reference/intel-gpu-tools/intel-gpu-tools-docs.xml
>> @@ -32,6 +32,7 @@
>>      <xi:include href="xml/intel_io.xml"/>
>>      <xi:include href="xml/igt_vc4.xml"/>
>>      <xi:include href="xml/igt_vgem.xml"/>
>> +    <xi:include href="xml/igt_dummyload.xml"/>
>>    </chapter>
>>    <xi:include href="xml/igt_test_programs.xml"/>
>>
>> diff --git a/lib/Makefile.sources b/lib/Makefile.sources
>> index e8e277b..7fc5ec2 100644
>> --- a/lib/Makefile.sources
>> +++ b/lib/Makefile.sources
>> @@ -75,6 +75,8 @@ lib_source_list =             \
>>         igt_draw.h              \
>>         igt_pm.c                \
>>         igt_pm.h                \
>> +       igt_dummyload.c         \
>> +       igt_dummyload.h         \
>>         uwildmat/uwildmat.h     \
>>         uwildmat/uwildmat.c     \
>>         $(NULL)
>> diff --git a/lib/igt.h b/lib/igt.h
>> index d751f24..a0028d5 100644
>> --- a/lib/igt.h
>> +++ b/lib/igt.h
>> @@ -32,6 +32,7 @@
>>  #include "igt_core.h"
>>  #include "igt_debugfs.h"
>>  #include "igt_draw.h"
>> +#include "igt_dummyload.h"
>>  #include "igt_fb.h"
>>  #include "igt_gt.h"
>>  #include "igt_kms.h"
>> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
>> new file mode 100644
>> index 0000000..b934fd5
>> --- /dev/null
>> +++ b/lib/igt_dummyload.c
>> @@ -0,0 +1,276 @@
>> +/*
>> + * Copyright © 2016 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + */
>> +
>> +#include "igt.h"
>> +#include "igt_dummyload.h"
>> +#include <time.h>
>> +#include <signal.h>
>> +#include <sys/syscall.h>
>> +
>> +/**
>> + * SECTION:igt_dummyload
>> + * @short_description: Library for submitting GPU workloads
>> + * @title: Dummyload
>> + * @include: igt.h
>> + *
>> + * A lot of igt testcases need some GPU workload to make sure a race window is
>> + * big enough. Unfortunately having a fixed amount of workload leads to
>> + * spurious test failures or overtly long runtimes on some fast/slow platforms.
>
> s/overtly/overly
>
>> + * This library contains functionality to submit GPU workloads that should
>> + * consume exactly a specific amount of time.
>> + */
>> +
>> +#define NSEC_PER_SEC 1000000000L
>
> Time to put this in a header in lib? With this series applied we would
> have 7 separate definitions.
>
>> +#define gettid() syscall(__NR_gettid)
>> +#define sigev_notify_thread_id _sigev_un._tid
>
> Guess this should be fine as well to be in a single place because IGT
> knows what it's doing regarding TIDs?
>
>> +#define LOCAL_I915_EXEC_BSD_SHIFT      (13)
>> +#define LOCAL_I915_EXEC_BSD_MASK       (3 << LOCAL_I915_EXEC_BSD_SHIFT)
>> +
>> +#define ENGINE_MASK  (I915_EXEC_RING_MASK | LOCAL_I915_EXEC_BSD_MASK)
>> +
>> +static void
>> +fill_object(struct drm_i915_gem_exec_object2 *obj, uint32_t gem_handle,
>> +           struct drm_i915_gem_relocation_entry *relocs, uint32_t count)
>> +{
>> +       memset(obj, 0, sizeof(*obj));
>> +       obj->handle = gem_handle;
>> +       obj->relocation_count = count;
>> +       obj->relocs_ptr = (uintptr_t)relocs;
>> +}
>> +
>> +static void
>> +fill_reloc(struct drm_i915_gem_relocation_entry *reloc,
>> +          uint32_t gem_handle, uint32_t offset,
>> +          uint32_t read_domains, uint32_t write_domains)
>> +{
>> +       reloc->target_handle = gem_handle;
>> +       reloc->delta = 0;
>> +       reloc->offset = offset * sizeof(uint32_t);
>> +       reloc->presumed_offset = 0;
>> +       reloc->read_domains = read_domains;
>> +       reloc->write_domain = write_domains;
>> +}
>> +
>> +
>> +static uint32_t *batch;
>
> Maybe add a comment explaining why a global is needed?
>
>> +
>> +static uint32_t emit_recursive_batch(int fd, int engine, unsigned dep_handle)
>> +{
>> +       const int gen = intel_gen(intel_get_drm_devid(fd));
>> +       struct drm_i915_gem_exec_object2 obj[2];
>> +       struct drm_i915_gem_relocation_entry relocs[2];
>> +       struct drm_i915_gem_execbuffer2 execbuf;
>> +       unsigned engines[16];
>> +       unsigned nengine, handle;
>> +       int i = 0, reloc_count = 0, buf_count = 0;
>> +
>> +       buf_count = 0;
>> +       nengine = 0;
>> +       if (engine < 0) {
>> +               for_each_engine(fd, engine)
>> +                       if (engine)
>> +                               engines[nengine++] = engine;
>> +       } else {
>> +               igt_require(gem_has_ring(fd, engine));
>
> There's gem_require_ring.
>
>> +               engines[nengine++] = engine;
>> +       }
>> +       igt_require(nengine);
>> +
>> +       memset(&execbuf, 0, sizeof(execbuf));
>> +       memset(obj, 0, sizeof(obj));
>> +       memset(relocs, 0, sizeof(relocs));
>> +
>> +       execbuf.buffers_ptr = (uintptr_t) obj;
>> +       handle = gem_create(fd, 4096);
>
> May be worth defining a constant so we don't repeat the same hardcoded
> value so many times?
>
>> +       batch = gem_mmap__gtt(fd, handle, 4096, PROT_WRITE);
>
> Would it make sense to assert that batch is empty just in case?
>
>> +       gem_set_domain(fd, handle,
>> +                       I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
>> +
>> +       if (nengine == 1 && dep_handle > 0) {
>
> What if there's a dependency and no specific engine is passed? If
> that's invalid, then an assert might be good.
>
>> +               /* dummy write to dependency */
>> +               fill_object(&obj[buf_count], dep_handle, NULL, 0);
>> +               buf_count++;
>> +
>> +               fill_reloc(&relocs[reloc_count], dep_handle, i,
>> +                          I915_GEM_DOMAIN_RENDER,
>> +                          I915_GEM_DOMAIN_RENDER);
>> +               batch[i++] = 0; /* reloc */
>> +               reloc_count++;
>> +               batch[i++] = MI_NOOP;
>> +       }
>> +
>> +       if (gen >= 8) {
>> +               batch[i++] = MI_BATCH_BUFFER_START | 1 << 8 | 1;
>> +               /* recurse */
>> +               fill_reloc(&relocs[reloc_count], handle, i,
>> +                          I915_GEM_DOMAIN_COMMAND, 0);
>> +               batch[i++] = 0;
>> +               batch[i++] = 0;
>> +       } else if (gen >= 6) {
>> +               batch[i++] = MI_BATCH_BUFFER_START | 1 << 8;
>> +               /* recurse */
>> +               fill_reloc(&relocs[reloc_count], handle, i,
>> +                          I915_GEM_DOMAIN_COMMAND, 0);
>> +               batch[i++] = 0;
>> +       } else {
>> +               batch[i++] = MI_BATCH_BUFFER_START | 2 << 6 |
>> +                       ((gen < 4) ? 1 : 0);
>> +               /* recurse */
>> +               fill_reloc(&relocs[reloc_count], handle, i,
>> +                          I915_GEM_DOMAIN_COMMAND, 0);
>> +               batch[i++] = 0;
>> +               if (gen < 4)
>> +                       relocs[reloc_count].delta = 1;
>> +       }
>> +       reloc_count++;
>> +
>> +       fill_object(&obj[buf_count], handle, relocs, reloc_count);
>> +       buf_count++;
>> +
>> +       for (i = 0; i < nengine; i++) {
>> +               execbuf.flags &= ~ENGINE_MASK;
>> +               execbuf.flags = engines[i];
>> +               execbuf.buffer_count = buf_count;
>> +               gem_execbuf(fd, &execbuf);
>> +       }
>> +
>> +       return handle;
>> +}
>> +
>> +static void sigiter(int sig, siginfo_t *info, void *arg)
>> +{
>> +       *batch = MI_BATCH_BUFFER_END;
>> +       __sync_synchronize();
>> +}
>> +
>> +static timer_t setup_batch_exit_timer(int64_t ns)
>> +{
>> +       timer_t timer;
>> +       struct sigevent sev;
>> +       struct sigaction act;
>> +       struct itimerspec its;
>> +
>> +       memset(&sev, 0, sizeof(sev));
>> +       sev.sigev_notify = SIGEV_SIGNAL | SIGEV_THREAD_ID;
>> +       sev.sigev_notify_thread_id = gettid();
>> +       sev.sigev_signo = SIGRTMIN + 1;
>> +       igt_assert(timer_create(CLOCK_MONOTONIC, &sev, &timer) == 0);
>> +       igt_assert(timer > 0);
>> +
>> +       memset(&act, 0, sizeof(act));
>> +       act.sa_sigaction = sigiter;
>> +       act.sa_flags = SA_SIGINFO;
>> +       igt_assert(sigaction(SIGRTMIN + 1, &act, NULL) == 0);
>> +
>> +       memset(&its, 0, sizeof(its));
>> +       its.it_value.tv_sec = ns / NSEC_PER_SEC;
>> +       its.it_value.tv_nsec = ns % NSEC_PER_SEC;
>> +       igt_assert(timer_settime(timer, 0, &its, NULL) == 0);
>> +
>> +       return timer;
>> +}
>> +
>> +/**
>> + * igt_spin_batch:
>> + * @fd: open i915 drm file descriptor
>> + * @ns: amount of time in nanoseconds the batch executes after terminating.
>> + *      If value is less than 0, execute batch forever.
>> + * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
>> + *          than 0, execute on all available rings.
>> + * @dep_handle: handle to a buffer object dependency. If greater than 0, add a
>> + *              relocation entry to this buffer within the batch.
>> + *
>> + * Start a recursive batch on a ring that terminates after an exact amount
>> + * of time has elapsed. Immediately returns a #igt_spin_t that contains the
>> + * batch's handle that can be waited upon. The returned structure must be passed to
>> + * igt_post_spin_batch() for post-processing.
>> + *
>> + * Returns:
>> + * Structure with helper internal state for igt_post_spin_batch().
>> + */
>> +igt_spin_t igt_spin_batch(int fd, int64_t ns, int engine, unsigned dep_handle)
>> +{
>> +       timer_t timer;
>> +       uint32_t handle = emit_recursive_batch(fd, engine, dep_handle);
>> +       int64_t wait_timeout = 0;
>> +       igt_assert_eq(gem_wait(fd, handle, &wait_timeout), -ETIME);
>> +
>> +       if (ns < 1) {
>> +               if (ns == 0) {
>
> This block can be one level higher, and then you can have a condition
> for ns < 0, which will then match the function docs.
>
>> +                       *batch = MI_BATCH_BUFFER_END;
>> +                       __sync_synchronize();
>> +                       return (igt_spin_t){ handle, batch, 0};
>> +               }
>> +               return (igt_spin_t){ handle, batch, 0 };
>> +       }
>> +       timer = setup_batch_exit_timer(ns);
>> +
>> +       return (igt_spin_t){ handle, batch, timer };
>> +}
>> +
>> +/**
>> + * igt_post_spin_batch:
>> + * @fd: open i915 drm file descriptor
>> + * @arg: spin batch state from igt_spin_batch()
>> + *
>> + * This function does the necessary post-processing after starting a recursive
>> + * batch with igt_spin_batch().
>> + */
>> +void igt_post_spin_batch(int fd, igt_spin_t arg)
>> +{
>> +       if (arg.handle == 0)
>> +               return;
>> +
>> +       if (arg.timer > 0)
>> +               timer_delete(arg.timer);
>> +
>> +       gem_close(fd, arg.handle);
>> +       munmap(arg.batch, 4096);
>> +}
>> +
>> +
>> +/**
>> + * igt_spin_batch_wait:
>> + * @fd: open i915 drm file descriptor
>> + * @ns: amount of time in nanoseconds the batch executes after terminating.
>> + *      If value is less than 0, execute batch forever.
>> + * @engine: ring to execute batch OR'd with execbuf flags. If value is less
>> + *          than 0, execute on all available rings.
>> + * @dep_handle: handle to a buffer object dependency. If greater than 0, include
>> + *              this buffer on the wait dependency
>> + *
>> + * This is similar to igt_spin_batch(), but waits on the recursive batch to finish
>> + * instead of returning right away. The function also does the necessary
>> + * post-processing automatically if set to timeout.
>> + */
>> +void igt_spin_batch_wait(int fd, int64_t ns, int engine, unsigned dep_handle)
>> +{
>> +       igt_spin_t spin = igt_spin_batch(fd, ns, engine, dep_handle);
>> +       int64_t wait_timeout = ns + (0.5 * NSEC_PER_SEC);
>> +       igt_assert_eq(gem_wait(fd, spin.handle, &wait_timeout), 0);
>> +
>> +       igt_post_spin_batch(fd, spin);
>> +}
>> diff --git a/lib/igt_dummyload.h b/lib/igt_dummyload.h
>> new file mode 100644
>> index 0000000..79ead2c
>> --- /dev/null
>> +++ b/lib/igt_dummyload.h
>> @@ -0,0 +1,42 @@
>> +/*
>> + * Copyright © 2016 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + */
>> +
>> +#ifndef __IGT_DUMMYLOAD_H__
>> +#define __IGT_DUMMYLOAD_H__
>> +
>> +typedef struct igt_spin {
>> +       unsigned handle;
>> +       uint32_t *batch;
>> +       timer_t timer;
>> +} igt_spin_t;
>> +
>> +
>> +igt_spin_t igt_spin_batch(int fd, int64_t ns, int engine, unsigned dep_handle);
>> +
>> +void igt_post_spin_batch(int fd, igt_spin_t arg);
>> +
>> +void igt_spin_batch_wait(int fd, int64_t ns, int engine, unsigned dep_handle);
>
> Sorry about the bikeshedding at v6, but just in case you like the
> idea: what about calling these functions igt_spin_batch_new() and
> igt_spin_batch_free() as in igt_pipe_crc_new and igt_pipe_crc_free? Or
> _alloc and _free, as with intel_batchbuffer.

Forgot to suggest returning a pointer to a new igt_spin_t struct in
_new() and freeing it in _free(), also because that's what will be
less surprising to people reading the code.

Regards,

Tomeu

> I'm saying this because the first time I read igt_post_spin_batch I
> thought it was about posting the batchbuffer, and all the
> post-processing seems to just be regular cleanup.
>
> Also, are you completely sure that always the point where the
> batchbuffer is created is also the point where it has to be started?
> Otherwise, a separate _start function would give that flexibility and
> the API may be a bit more explicit.
>
> Regards,
>
> Tomeu
>
>> +
>> +
>> +#endif /* __IGT_DUMMYLOAD_H__ */
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx