[Intel-gfx] [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
Janusz Krzysztofik
janusz.krzysztofik at linux.intel.com
Thu Feb 20 15:57:24 UTC 2020
Hi Chris,
On Monday, December 2, 2019 3:59:19 PM CET Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-12-02 14:42:58)
> > Hi Chris,
> >
> > I have a few questions rather than comments. I hope they are worth spending
> > your time.
> >
> > On Wednesday, November 13, 2019 1:52:40 PM CET Chris Wilson wrote:
> > > I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
> > > ringbuffer for logical ring contects. This directly affects the number
> >
> > s/contects/contexts/
> >
> > > of batches userspace can submit before blocking waiting for space.
> > >
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Have you got this patch still queued somewhere? As UMD has accepted the
solution and are ready with changes on their side, I think we need to merge it
soon, and the kernel side as well.
Thanks,
Janusz
> > > ---
> > > tests/Makefile.sources | 3 +
> > > tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
> > > tests/meson.build | 1 +
> > > 3 files changed, 300 insertions(+)
> > > create mode 100644 tests/i915/gem_ctx_ringsize.c
> > >
> > > diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> > > index e17d43155..801fc52f3 100644
> > > --- a/tests/Makefile.sources
> > > +++ b/tests/Makefile.sources
> > > @@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
> > > TESTS_progs += gem_ctx_persistence
> > > gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
> > >
> > > +TESTS_progs += gem_ctx_ringsize
> > > +gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
> > > +
> > > TESTS_progs += gem_ctx_shared
> > > gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
> > >
> > > diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
> > > new file mode 100644
> > > index 000000000..1450e8f0d
> > > --- /dev/null
> > > +++ b/tests/i915/gem_ctx_ringsize.c
> > > @@ -0,0 +1,296 @@
> > > +/*
> > > + * Copyright © 2019 Intel Corporation
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person obtaining a
> > > + * copy of this software and associated documentation files (the "Software"),
> > > + * to deal in the Software without restriction, including without limitation
> > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > > + * and/or sell copies of the Software, and to permit persons to whom the
> > > + * Software is furnished to do so, subject to the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice (including the next
> > > + * paragraph) shall be included in all copies or substantial portions of the
> > > + * Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > > + * IN THE SOFTWARE.
> > > + */
> > > +
> > > +#include <errno.h>
> > > +#include <fcntl.h>
> > > +#include <inttypes.h>
> > > +#include <sys/ioctl.h>
> > > +#include <sys/types.h>
> > > +#include <unistd.h>
> > > +
> > > +#include "drmtest.h" /* gem_quiescent_gpu()! */
> > > +#include "i915/gem_context.h"
> > > +#include "i915/gem_engine_topology.h"
> > > +#include "ioctl_wrappers.h" /* gem_wait()! */
> > > +#include "sw_sync.h"
> > > +
> > > +#define I915_CONTEXT_PARAM_RINGSIZE 0xc
> >
> > How are we going to handle symbol redefinition conflict which arises as soon
> > as this symbol is also included from kernel headers (e.g. via
> > "i915/gem_engine_topology.h")?
>
> Final version we copy the headers form the kernel. Conflicts remind us
> when we forget.
>
> >
> > > +
> > > +static bool has_ringsize(int i915)
> > > +{
> > > + struct drm_i915_gem_context_param p = {
> > > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > + };
> > > +
> > > + return __gem_context_get_param(i915, &p) == 0;
> > > +}
> > > +
> > > +static void test_idempotent(int i915)
> > > +{
> > > + struct drm_i915_gem_context_param p = {
> > > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > + };
> > > + uint32_t saved;
> > > +
> > > + /*
> > > + * Simple test to verify that we are able to read back the same
> > > + * value as we set.
> > > + */
> > > +
> > > + gem_context_get_param(i915, &p);
> > > + saved = p.value;
> > > +
> > > + for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
> >
> > I've noticed you are using two different notations for those minimum/maximum
> > constants. I think that may be confusing. How about defining and using
> > macros?
>
> A range in pages...
>
> > > + p.value = x;
> > > + gem_context_set_param(i915, &p);
> > > + gem_context_get_param(i915, &p);
> > > + igt_assert_eq_u32(p.value, x);
> > > + }
> > > +
> > > + p.value = saved;
> > > + gem_context_set_param(i915, &p);
> > > +}
> > > +
> > > +static void test_invalid(int i915)
> > > +{
> > > + struct drm_i915_gem_context_param p = {
> > > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > + };
> > > + uint64_t invalid[] = {
> > > + 0, 1, 4095, 4097, 8191, 8193,
> > > + /* upper limit may be HW dependent, atm it is 512KiB */
> > > + (512 << 10) - 1, (512 << 10) + 1,
> >
> > Here is an example of that different notation mentioned above.
>
> And here written in KiB to match comments.
>
> >
> > > + -1, -1u
> > > + };
> > > + uint32_t saved;
> > > +
> > > + gem_context_get_param(i915, &p);
> > > + saved = p.value;
> > > +
> > > + for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
> > > + p.value = invalid[i];
> > > + igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
> > > + gem_context_get_param(i915, &p);
> > > + igt_assert_eq_u64(p.value, saved);
> > > + }
> > > +}
> > > +
> > > +static int create_ext_ioctl(int i915,
> > > + struct drm_i915_gem_context_create_ext *arg)
> > > +{
> > > + int err;
> > > +
> > > + err = 0;
> > > + if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
> > > + err = -errno;
> > > + igt_assume(err);
> > > + }
> > > +
> > > + errno = 0;
> > > + return err;
> > > +}
> >
> > This helper looks like pretty standard for me. Why there are no library
> > functions for such generic operations?
>
> Because no one has written that yet.
>
> >
> > > +
> > > +static void test_create(int i915)
> > > +{
> > > + struct drm_i915_gem_context_create_ext_setparam p = {
> > > + .base = {
> > > + .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > > + .next_extension = 0, /* end of chain */
> > > + },
> > > + .param = {
> > > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > + .value = 512 << 10,
> > > + }
> > > + };
> > > + struct drm_i915_gem_context_create_ext create = {
> > > + .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > > + .extensions = to_user_pointer(&p),
> > > + };
> > > +
> > > + igt_assert_eq(create_ext_ioctl(i915, &create), 0);
> > > +
> > > + p.param.ctx_id = create.ctx_id;
> > > + p.param.value = 0;
> > > + gem_context_get_param(i915, &p.param);
> > > + igt_assert_eq(p.param.value, 512 << 10);
> > > +
> > > + gem_context_destroy(i915, create.ctx_id);
> > > +}
> > > +
> > > +static void test_clone(int i915)
> > > +{
> > > + struct drm_i915_gem_context_create_ext_setparam p = {
> > > + .base = {
> > > + .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > > + .next_extension = 0, /* end of chain */
> > > + },
> > > + .param = {
> > > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > + .value = 512 << 10,
> > > + }
> > > + };
> > > + struct drm_i915_gem_context_create_ext create = {
> > > + .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > > + .extensions = to_user_pointer(&p),
> > > + };
> > > +
> > > + igt_assert_eq(create_ext_ioctl(i915, &create), 0);
> > > +
> > > + p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
> > > + I915_CONTEXT_CLONE_ENGINES, 0);
> > > + igt_assert_neq(p.param.ctx_id, create.ctx_id);
> > > + gem_context_destroy(i915, create.ctx_id);
> > > +
> > > + p.param.value = 0;
> > > + gem_context_get_param(i915, &p.param);
> > > + igt_assert_eq(p.param.value, 512 << 10);
> > > +
> > > + gem_context_destroy(i915, p.param.ctx_id);
> > > +}
> > > +
> > > +static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
> > > +{
> > > + int err;
> > > +
> > > + err = 0;
> > > + if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
> > > + err = -errno;
> > > +
> > > + errno = 0;
> > > + return err;
> > > +}
> >
> > The above helper looks pretty the same as lib/ioctlwrappers.c:__gem_execbuf().
> > Does igt_assume(err) found in the latter matter so much that you use your own
> > version?
>
> It's very, very different from that one.
>
> > > +
> > > +static uint32_t __batch_create(int i915, uint32_t offset)
> >
> > This is always called with offset = 0, do we expect other values to be used
> > later?
>
> Why not.
>
> > > +{
> > > + const uint32_t bbe = 0xa << 23;
> > > + uint32_t handle;
> > > +
> > > + handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
> >
> > Why don't we rely on the driver making the alignment for us?
>
> I'm used to being inside the kernel where it's expected to be correct.
>
> > > + gem_write(i915, handle, offset, &bbe, sizeof(bbe));
> > > +
> > > + return handle;
> > > +}
> > > +
> > > +static uint32_t batch_create(int i915)
> > > +{
> > > + return __batch_create(i915, 0);
> > > +}
> > > +
> > > +static unsigned int measure_inflight(int i915, unsigned int engine)
> > > +{
> > > + IGT_CORK_FENCE(cork);
> > > + struct drm_i915_gem_exec_object2 obj = {
> > > + .handle = batch_create(i915)
> > > + };
> > > + struct drm_i915_gem_execbuffer2 execbuf = {
> > > + .buffers_ptr = to_user_pointer(&obj),
> > > + .buffer_count = 1,
> > > + .flags = engine | I915_EXEC_FENCE_IN,
> > > + .rsvd2 = igt_cork_plug(&cork, i915),
> > > + };
> > > + unsigned int count;
> > > +
> > > + fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
> > > +
> > > + gem_execbuf(i915, &execbuf);
> > > + for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
> > > + ;
> >
> > Shouldn't we check if the reason for the failure is what we expect, i.e.,
> > -EWOULDBLOCK (or -EINTR)? And why don't we put a time constraint on that loop
> > in case O_NONBLOCK handling is not supported (yet)?
>
> Sure. The idea is that O_NONBLOCK is supported, otherwise we don't
> have fast and precise feedback.
>
> > > +static void test_resize(int i915,
> > > + const struct intel_execution_engine2 *e,
> > > + unsigned int flags)
> > > +#define IDLE (1 << 0)
> > > +{
> > > + struct drm_i915_gem_context_param p = {
> > > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > + };
> > > + unsigned int prev[2] = {};
> > > + uint32_t saved;
> > > +
> > > + gem_context_get_param(i915, &p);
> > > + saved = p.value;
> > > +
> > > + gem_quiescent_gpu(i915);
> > > + for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
> > > + unsigned int count;
> > > +
> > > + gem_context_set_param(i915, &p);
> > > +
> > > + count = measure_inflight(i915, e->flags);
> > > + igt_info("%s: %llx -> %d\n", e->name, p.value, count);
> > > + igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
> >
> > Where does this formula come from? Why not just count == 2 * prev[1] ?
> > What results should we expect in "active" vs. "idle" mode?
>
> I've explained somewhere why it is not 2*prev... And there's a small
> amount of imprecision (+-1 request). In test_resize is the comment:
>
> /*
> * The ringsize directly affects the number of batches we can have
> * inflight -- when we run out of room in the ring, the client is
> * blocked (or if O_NONBLOCK is specified, -EWOULDBLOCK is reported).
> * The kernel throttles the client when they enter the last 4KiB page,
> * so as we double the size of the ring, we nearly double the number
> * of requests we can fit as 2^n-1: i.e 0, 1, 3, 7, 15, 31 pages.
> */
>
> -Chris
>
More information about the Intel-gfx
mailing list