[Intel-gfx] [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE

Janusz Krzysztofik janusz.krzysztofik at linux.intel.com
Thu Feb 20 15:57:24 UTC 2020


Hi Chris,

On Monday, December 2, 2019 3:59:19 PM CET Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-12-02 14:42:58)
> > Hi Chris,
> > 
> > I have a few questions rather than comments.  I hope they are worth spending 
> > your time.
> > 
> > On Wednesday, November 13, 2019 1:52:40 PM CET Chris Wilson wrote:
> > > I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
> > > ringbuffer for logical ring contects. This directly affects the number
> > 
> > s/contects/contexts/
> > 
> > > of batches userspace can submit before blocking waiting for space.
> > > 
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>

Have you got this patch still queued somewhere?  As UMD has accepted the 
solution and are ready with changes on their side, I think we need to merge it 
soon, and the kernel side as well.

Thanks,
Janusz


> > > ---
> > >  tests/Makefile.sources        |   3 +
> > >  tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
> > >  tests/meson.build             |   1 +
> > >  3 files changed, 300 insertions(+)
> > >  create mode 100644 tests/i915/gem_ctx_ringsize.c
> > > 
> > > diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> > > index e17d43155..801fc52f3 100644
> > > --- a/tests/Makefile.sources
> > > +++ b/tests/Makefile.sources
> > > @@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
> > >  TESTS_progs += gem_ctx_persistence
> > >  gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
> > >  
> > > +TESTS_progs += gem_ctx_ringsize
> > > +gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
> > > +
> > >  TESTS_progs += gem_ctx_shared
> > >  gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
> > >  
> > > diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
> > > new file mode 100644
> > > index 000000000..1450e8f0d
> > > --- /dev/null
> > > +++ b/tests/i915/gem_ctx_ringsize.c
> > > @@ -0,0 +1,296 @@
> > > +/*
> > > + * Copyright © 2019 Intel Corporation
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person obtaining a
> > > + * copy of this software and associated documentation files (the "Software"),
> > > + * to deal in the Software without restriction, including without limitation
> > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > > + * and/or sell copies of the Software, and to permit persons to whom the
> > > + * Software is furnished to do so, subject to the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice (including the next
> > > + * paragraph) shall be included in all copies or substantial portions of the
> > > + * Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > > + * IN THE SOFTWARE.
> > > + */
> > > +
> > > +#include <errno.h>
> > > +#include <fcntl.h>
> > > +#include <inttypes.h>
> > > +#include <sys/ioctl.h>
> > > +#include <sys/types.h>
> > > +#include <unistd.h>
> > > +
> > > +#include "drmtest.h" /* gem_quiescent_gpu()! */
> > > +#include "i915/gem_context.h"
> > > +#include "i915/gem_engine_topology.h"
> > > +#include "ioctl_wrappers.h" /* gem_wait()! */
> > > +#include "sw_sync.h"
> > > +
> > > +#define I915_CONTEXT_PARAM_RINGSIZE 0xc
> > 
> > How are we going to handle symbol redefinition conflict which arises as soon 
> > as this symbol is also included from kernel headers (e.g. via 
> > "i915/gem_engine_topology.h")?
> 
> Final version we copy the headers form the kernel. Conflicts remind us
> when we forget.
> 
> > 
> > > +
> > > +static bool has_ringsize(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +
> > > +     return __gem_context_get_param(i915, &p) == 0;
> > > +}
> > > +
> > > +static void test_idempotent(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +     uint32_t saved;
> > > +
> > > +     /*
> > > +      * Simple test to verify that we are able to read back the same
> > > +      * value as we set.
> > > +      */
> > > +
> > > +     gem_context_get_param(i915, &p);
> > > +     saved = p.value;
> > > +
> > > +     for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
> > 
> > I've noticed you are using two different notations for those minimum/maximum 
> > constants.  I think that may be confusing.  How about defining and using 
> > macros?  
> 
> A range in pages...
>  
> > > +             p.value = x;
> > > +             gem_context_set_param(i915, &p);
> > > +             gem_context_get_param(i915, &p);
> > > +             igt_assert_eq_u32(p.value, x);
> > > +     }
> > > +
> > > +     p.value = saved;
> > > +     gem_context_set_param(i915, &p);
> > > +}
> > > +
> > > +static void test_invalid(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +     uint64_t invalid[] = {
> > > +             0, 1, 4095, 4097, 8191, 8193,
> > > +             /* upper limit may be HW dependent, atm it is 512KiB */
> > > +             (512 << 10) - 1, (512 << 10) + 1,
> > 
> > Here is an example of that different notation mentioned above.
> 
> And here written in KiB to match comments.
> 
> > 
> > > +             -1, -1u
> > > +     };
> > > +     uint32_t saved;
> > > +
> > > +     gem_context_get_param(i915, &p);
> > > +     saved = p.value;
> > > +
> > > +     for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
> > > +             p.value = invalid[i];
> > > +             igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
> > > +             gem_context_get_param(i915, &p);
> > > +             igt_assert_eq_u64(p.value, saved);
> > > +     }
> > > +}
> > > +
> > > +static int create_ext_ioctl(int i915,
> > > +                         struct drm_i915_gem_context_create_ext *arg)
> > > +{
> > > +     int err;
> > > +
> > > +     err = 0;
> > > +     if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
> > > +             err = -errno;
> > > +             igt_assume(err);
> > > +     }
> > > +
> > > +     errno = 0;
> > > +     return err;
> > > +}
> > 
> > This helper looks like pretty standard for me.  Why there are no library 
> > functions for such generic operations?
> 
> Because no one has written that yet.
> 
> > 
> > > +
> > > +static void test_create(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > > +             .base = {
> > > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > > +                     .next_extension = 0, /* end of chain */
> > > +             },
> > > +             .param = {
> > > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +                     .value = 512 << 10,
> > > +             }
> > > +     };
> > > +     struct drm_i915_gem_context_create_ext create = {
> > > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > > +             .extensions = to_user_pointer(&p),
> > > +     };
> > > +
> > > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > > +
> > > +     p.param.ctx_id = create.ctx_id;
> > > +     p.param.value = 0;
> > > +     gem_context_get_param(i915, &p.param);
> > > +     igt_assert_eq(p.param.value, 512 << 10);
> > > +
> > > +     gem_context_destroy(i915, create.ctx_id);
> > > +}
> > > +
> > > +static void test_clone(int i915)
> > > +{
> > > +     struct drm_i915_gem_context_create_ext_setparam p = {
> > > +             .base = {
> > > +                     .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > > +                     .next_extension = 0, /* end of chain */
> > > +             },
> > > +             .param = {
> > > +                     .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +                     .value = 512 << 10,
> > > +             }
> > > +     };
> > > +     struct drm_i915_gem_context_create_ext create = {
> > > +             .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > > +             .extensions = to_user_pointer(&p),
> > > +     };
> > > +
> > > +     igt_assert_eq(create_ext_ioctl(i915, &create),  0);
> > > +
> > > +     p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
> > > +                                        I915_CONTEXT_CLONE_ENGINES, 0);
> > > +     igt_assert_neq(p.param.ctx_id, create.ctx_id);
> > > +     gem_context_destroy(i915, create.ctx_id);
> > > +
> > > +     p.param.value = 0;
> > > +     gem_context_get_param(i915, &p.param);
> > > +     igt_assert_eq(p.param.value, 512 << 10);
> > > +
> > > +     gem_context_destroy(i915, p.param.ctx_id);
> > > +}
> > > +
> > > +static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
> > > +{
> > > +     int err;
> > > +
> > > +     err = 0;
> > > +     if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
> > > +             err = -errno;
> > > +
> > > +     errno = 0;
> > > +     return err;
> > > +}
> > 
> > The above helper looks pretty the same as lib/ioctlwrappers.c:__gem_execbuf().  
> > Does igt_assume(err) found in the latter matter so much that you use your own 
> > version?
> 
> It's very, very different from that one.
> 
> > > +
> > > +static uint32_t __batch_create(int i915, uint32_t offset)
> > 
> > This is always called with offset = 0, do we expect other values to be used 
> > later?
> 
> Why not.
>  
> > > +{
> > > +     const uint32_t bbe = 0xa << 23;
> > > +     uint32_t handle;
> > > +
> > > +     handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
> > 
> > Why don't we rely on the driver making the alignment for us?
> 
> I'm used to being inside the kernel where it's expected to be correct.
> 
> > > +     gem_write(i915, handle, offset, &bbe, sizeof(bbe));
> > > +
> > > +     return handle;
> > > +}
> > > +
> > > +static uint32_t batch_create(int i915)
> > > +{
> > > +     return __batch_create(i915, 0);
> > > +}
> > > +
> > > +static unsigned int measure_inflight(int i915, unsigned int engine)
> > > +{
> > > +     IGT_CORK_FENCE(cork);
> > > +     struct drm_i915_gem_exec_object2 obj = {
> > > +             .handle = batch_create(i915)
> > > +     };
> > > +     struct drm_i915_gem_execbuffer2 execbuf = {
> > > +             .buffers_ptr = to_user_pointer(&obj),
> > > +             .buffer_count = 1,
> > > +             .flags = engine | I915_EXEC_FENCE_IN,
> > > +             .rsvd2 = igt_cork_plug(&cork, i915),
> > > +     };
> > > +     unsigned int count;
> > > +
> > > +     fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
> > > +
> > > +     gem_execbuf(i915, &execbuf);
> > > +     for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
> > > +             ;
> > 
> > Shouldn't we check if the reason for the failure is what we expect, i.e., 
> > -EWOULDBLOCK (or -EINTR)?  And why don't we put a time constraint on that loop 
> > in case O_NONBLOCK handling is not supported (yet)?
> 
> Sure. The idea is that O_NONBLOCK is supported, otherwise we don't
> have fast and precise feedback.
> 
> > > +static void test_resize(int i915,
> > > +                     const struct intel_execution_engine2 *e,
> > > +                     unsigned int flags)
> > > +#define IDLE (1 << 0)
> > > +{
> > > +     struct drm_i915_gem_context_param p = {
> > > +             .param = I915_CONTEXT_PARAM_RINGSIZE,
> > > +     };
> > > +     unsigned int prev[2] = {};
> > > +     uint32_t saved;
> > > +
> > > +     gem_context_get_param(i915, &p);
> > > +     saved = p.value;
> > > +
> > > +     gem_quiescent_gpu(i915);
> > > +     for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
> > > +             unsigned int count;
> > > +
> > > +             gem_context_set_param(i915, &p);
> > > +
> > > +             count = measure_inflight(i915, e->flags);
> > > +             igt_info("%s: %llx -> %d\n", e->name, p.value, count);
> > > +             igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
> > 
> > Where does this formula come from?  Why not just count == 2 * prev[1] ?
> > What results should we expect in "active" vs. "idle" mode?
> 
> I've explained somewhere why it is not 2*prev... And there's a small
> amount of imprecision (+-1 request). In test_resize is the comment:
> 
>         /*
>          * The ringsize directly affects the number of batches we can have
>          * inflight -- when we run out of room in the ring, the client is
>          * blocked (or if O_NONBLOCK is specified, -EWOULDBLOCK is reported).
>          * The kernel throttles the client when they enter the last 4KiB page,
>          * so as we double the size of the ring, we nearly double the number
>          * of requests we can fit as 2^n-1: i.e 0, 1, 3, 7, 15, 31 pages.
>          */
> 
> -Chris
> 






More information about the Intel-gfx mailing list