[Intel-gfx] [igt-dev] [PATCH i-g-t 9/9] i915: Exercise I915_CONTEXT_PARAM_RINGSIZE
Chris Wilson
chris at chris-wilson.co.uk
Mon Dec 2 14:59:19 UTC 2019
Quoting Janusz Krzysztofik (2019-12-02 14:42:58)
> Hi Chris,
>
> I have a few questions rather than comments. I hope they are worth spending
> your time.
>
> On Wednesday, November 13, 2019 1:52:40 PM CET Chris Wilson wrote:
> > I915_CONTEXT_PARAM_RINGSIZE specifies how large to create the command
> > ringbuffer for logical ring contects. This directly affects the number
>
> s/contects/contexts/
>
> > of batches userspace can submit before blocking waiting for space.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> > tests/Makefile.sources | 3 +
> > tests/i915/gem_ctx_ringsize.c | 296 ++++++++++++++++++++++++++++++++++
> > tests/meson.build | 1 +
> > 3 files changed, 300 insertions(+)
> > create mode 100644 tests/i915/gem_ctx_ringsize.c
> >
> > diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> > index e17d43155..801fc52f3 100644
> > --- a/tests/Makefile.sources
> > +++ b/tests/Makefile.sources
> > @@ -163,6 +163,9 @@ gem_ctx_param_SOURCES = i915/gem_ctx_param.c
> > TESTS_progs += gem_ctx_persistence
> > gem_ctx_persistence_SOURCES = i915/gem_ctx_persistence.c
> >
> > +TESTS_progs += gem_ctx_ringsize
> > +gem_ctx_ringsize_SOURCES = i915/gem_ctx_ringsize.c
> > +
> > TESTS_progs += gem_ctx_shared
> > gem_ctx_shared_SOURCES = i915/gem_ctx_shared.c
> >
> > diff --git a/tests/i915/gem_ctx_ringsize.c b/tests/i915/gem_ctx_ringsize.c
> > new file mode 100644
> > index 000000000..1450e8f0d
> > --- /dev/null
> > +++ b/tests/i915/gem_ctx_ringsize.c
> > @@ -0,0 +1,296 @@
> > +/*
> > + * Copyright © 2019 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the next
> > + * paragraph) shall be included in all copies or substantial portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> > + * IN THE SOFTWARE.
> > + */
> > +
> > +#include <errno.h>
> > +#include <fcntl.h>
> > +#include <inttypes.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/types.h>
> > +#include <unistd.h>
> > +
> > +#include "drmtest.h" /* gem_quiescent_gpu()! */
> > +#include "i915/gem_context.h"
> > +#include "i915/gem_engine_topology.h"
> > +#include "ioctl_wrappers.h" /* gem_wait()! */
> > +#include "sw_sync.h"
> > +
> > +#define I915_CONTEXT_PARAM_RINGSIZE 0xc
>
> How are we going to handle the symbol redefinition conflict that arises as
> soon as this symbol is also pulled in from kernel headers (e.g. via
> "i915/gem_engine_topology.h")?
In the final version we copy the headers from the kernel. Conflicts remind
us when we forget.
>
> > +
> > +static bool has_ringsize(int i915)
> > +{
> > + struct drm_i915_gem_context_param p = {
> > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > + };
> > +
> > + return __gem_context_get_param(i915, &p) == 0;
> > +}
> > +
> > +static void test_idempotent(int i915)
> > +{
> > + struct drm_i915_gem_context_param p = {
> > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > + };
> > + uint32_t saved;
> > +
> > + /*
> > + * Simple test to verify that we are able to read back the same
> > + * value as we set.
> > + */
> > +
> > + gem_context_get_param(i915, &p);
> > + saved = p.value;
> > +
> > + for (uint32_t x = 1 << 12; x <= 128 << 12; x <<= 1) {
>
> I've noticed you are using two different notations for those minimum/maximum
> constants. I think that may be confusing. How about defining and using
> macros?
A range in pages...
> > + p.value = x;
> > + gem_context_set_param(i915, &p);
> > + gem_context_get_param(i915, &p);
> > + igt_assert_eq_u32(p.value, x);
> > + }
> > +
> > + p.value = saved;
> > + gem_context_set_param(i915, &p);
> > +}
> > +
> > +static void test_invalid(int i915)
> > +{
> > + struct drm_i915_gem_context_param p = {
> > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > + };
> > + uint64_t invalid[] = {
> > + 0, 1, 4095, 4097, 8191, 8193,
> > + /* upper limit may be HW dependent, atm it is 512KiB */
> > + (512 << 10) - 1, (512 << 10) + 1,
>
> Here is an example of that different notation mentioned above.
And here written in KiB to match comments.
>
> > + -1, -1u
> > + };
> > + uint32_t saved;
> > +
> > + gem_context_get_param(i915, &p);
> > + saved = p.value;
> > +
> > + for (int i = 0; i < ARRAY_SIZE(invalid); i++) {
> > + p.value = invalid[i];
> > + igt_assert_eq(__gem_context_set_param(i915, &p), -EINVAL);
> > + gem_context_get_param(i915, &p);
> > + igt_assert_eq_u64(p.value, saved);
> > + }
> > +}
> > +
> > +static int create_ext_ioctl(int i915,
> > + struct drm_i915_gem_context_create_ext *arg)
> > +{
> > + int err;
> > +
> > + err = 0;
> > + if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
> > + err = -errno;
> > + igt_assume(err);
> > + }
> > +
> > + errno = 0;
> > + return err;
> > +}
>
> This helper looks pretty standard to me. Why are there no library
> functions for such generic operations?
Because no one has written that yet.
>
> > +
> > +static void test_create(int i915)
> > +{
> > + struct drm_i915_gem_context_create_ext_setparam p = {
> > + .base = {
> > + .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > + .next_extension = 0, /* end of chain */
> > + },
> > + .param = {
> > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > + .value = 512 << 10,
> > + }
> > + };
> > + struct drm_i915_gem_context_create_ext create = {
> > + .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > + .extensions = to_user_pointer(&p),
> > + };
> > +
> > + igt_assert_eq(create_ext_ioctl(i915, &create), 0);
> > +
> > + p.param.ctx_id = create.ctx_id;
> > + p.param.value = 0;
> > + gem_context_get_param(i915, &p.param);
> > + igt_assert_eq(p.param.value, 512 << 10);
> > +
> > + gem_context_destroy(i915, create.ctx_id);
> > +}
> > +
> > +static void test_clone(int i915)
> > +{
> > + struct drm_i915_gem_context_create_ext_setparam p = {
> > + .base = {
> > + .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> > + .next_extension = 0, /* end of chain */
> > + },
> > + .param = {
> > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > + .value = 512 << 10,
> > + }
> > + };
> > + struct drm_i915_gem_context_create_ext create = {
> > + .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
> > + .extensions = to_user_pointer(&p),
> > + };
> > +
> > + igt_assert_eq(create_ext_ioctl(i915, &create), 0);
> > +
> > + p.param.ctx_id = gem_context_clone(i915, create.ctx_id,
> > + I915_CONTEXT_CLONE_ENGINES, 0);
> > + igt_assert_neq(p.param.ctx_id, create.ctx_id);
> > + gem_context_destroy(i915, create.ctx_id);
> > +
> > + p.param.value = 0;
> > + gem_context_get_param(i915, &p.param);
> > + igt_assert_eq(p.param.value, 512 << 10);
> > +
> > + gem_context_destroy(i915, p.param.ctx_id);
> > +}
> > +
> > +static int __execbuf(int i915, struct drm_i915_gem_execbuffer2 *execbuf)
> > +{
> > + int err;
> > +
> > + err = 0;
> > + if (ioctl(i915, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
> > + err = -errno;
> > +
> > + errno = 0;
> > + return err;
> > +}
>
> The above helper looks pretty much the same as
> lib/ioctl_wrappers.c:__gem_execbuf(). Does igt_assume(err) found in the
> latter matter so much that you use your own version?
It's very, very different from that one.
> > +
> > +static uint32_t __batch_create(int i915, uint32_t offset)
>
> This is always called with offset = 0, do we expect other values to be used
> later?
Why not.
> > +{
> > + const uint32_t bbe = 0xa << 23;
> > + uint32_t handle;
> > +
> > + handle = gem_create(i915, ALIGN(offset + sizeof(bbe), 4096));
>
> Why don't we rely on the driver making the alignment for us?
I'm used to being inside the kernel where it's expected to be correct.
> > + gem_write(i915, handle, offset, &bbe, sizeof(bbe));
> > +
> > + return handle;
> > +}
> > +
> > +static uint32_t batch_create(int i915)
> > +{
> > + return __batch_create(i915, 0);
> > +}
> > +
> > +static unsigned int measure_inflight(int i915, unsigned int engine)
> > +{
> > + IGT_CORK_FENCE(cork);
> > + struct drm_i915_gem_exec_object2 obj = {
> > + .handle = batch_create(i915)
> > + };
> > + struct drm_i915_gem_execbuffer2 execbuf = {
> > + .buffers_ptr = to_user_pointer(&obj),
> > + .buffer_count = 1,
> > + .flags = engine | I915_EXEC_FENCE_IN,
> > + .rsvd2 = igt_cork_plug(&cork, i915),
> > + };
> > + unsigned int count;
> > +
> > + fcntl(i915, F_SETFL, fcntl(i915, F_GETFL) | O_NONBLOCK);
> > +
> > + gem_execbuf(i915, &execbuf);
> > + for (count = 1; __execbuf(i915, &execbuf) == 0; count++)
> > + ;
>
> Shouldn't we check if the reason for the failure is what we expect, i.e.,
> -EWOULDBLOCK (or -EINTR)? And why don't we put a time constraint on that loop
> in case O_NONBLOCK handling is not supported (yet)?
Sure. The idea is that O_NONBLOCK is supported, otherwise we don't
have fast and precise feedback.
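
For illustration only, the pattern discussed here (count submissions on a non-blocking fd, stop on the first error, record which errno stopped the loop, and bound the loop in time) can be sketched outside of i915. This is a minimal sketch that uses a plain pipe as a stand-in for the DRM fd; count_until_full(), set_nonblock() and elapsed_s() are hypothetical names, not part of IGT or of the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: seconds elapsed since 'start' (monotonic clock). */
static double elapsed_s(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (now.tv_sec - start->tv_sec) +
	       (now.tv_nsec - start->tv_nsec) * 1e-9;
}

/* Same fcntl() idiom as measure_inflight(), with error reporting added. */
static int set_nonblock(int fd)
{
	int fl = fcntl(fd, F_GETFL);

	if (fl < 0)
		return -errno;
	return fcntl(fd, F_SETFL, fl | O_NONBLOCK) ? -errno : 0;
}

/*
 * Count how many fixed-size writes a non-blocking fd accepts before it
 * refuses more. On return, *err is 0 if the time limit expired while
 * writes were still being accepted, otherwise the errno that ended the
 * loop. The pipe write stands in for the execbuf ioctl (an assumption
 * made purely so the sketch runs anywhere).
 */
static unsigned int count_until_full(int fd, size_t chunk,
				     double timeout_s, int *err)
{
	char buf[512]; /* <= PIPE_BUF, so each write is all-or-nothing */
	struct timespec start;
	unsigned int count = 0;

	if (chunk > sizeof(buf))
		chunk = sizeof(buf);
	memset(buf, 0, sizeof(buf));

	clock_gettime(CLOCK_MONOTONIC, &start);
	*err = 0;
	while (elapsed_s(&start) < timeout_s) {
		if (write(fd, buf, chunk) < 0) {
			if (errno == EINTR)
				continue; /* interrupted, not full: retry */
			*err = errno;
			break;
		}
		count++;
	}

	return count;
}
```

A caller would then check that *err is EWOULDBLOCK (EAGAIN on Linux) before trusting the count, and treat *err == 0 as "O_NONBLOCK had no effect within the time limit".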
> > +static void test_resize(int i915,
> > + const struct intel_execution_engine2 *e,
> > + unsigned int flags)
> > +#define IDLE (1 << 0)
> > +{
> > + struct drm_i915_gem_context_param p = {
> > + .param = I915_CONTEXT_PARAM_RINGSIZE,
> > + };
> > + unsigned int prev[2] = {};
> > + uint32_t saved;
> > +
> > + gem_context_get_param(i915, &p);
> > + saved = p.value;
> > +
> > + gem_quiescent_gpu(i915);
> > + for (p.value = 1 << 12; p.value <= 128 << 12; p.value <<= 1) {
> > + unsigned int count;
> > +
> > + gem_context_set_param(i915, &p);
> > +
> > + count = measure_inflight(i915, e->flags);
> > + igt_info("%s: %llx -> %d\n", e->name, p.value, count);
> > + igt_assert(count > 3 * (prev[1] - prev[0]) / 4 + prev[1]);
>
> Where does this formula come from? Why not just count == 2 * prev[1] ?
> What results should we expect in "active" vs. "idle" mode?
The comment in test_resize explains why it is not 2*prev, and there is a
small amount of imprecision (+-1 request):
/*
* The ringsize directly affects the number of batches we can have
* inflight -- when we run out of room in the ring, the client is
* blocked (or if O_NONBLOCK is specified, -EWOULDBLOCK is reported).
* The kernel throttles the client when they enter the last 4KiB page,
* so as we double the size of the ring, we nearly double the number
* of requests we can fit as 2^n-1: i.e. 0, 1, 3, 7, 15, 31 pages.
*/
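
As a worked example of that arithmetic: if the counts follow 2^n-1, each step is 2*prev+1 rather than 2*prev (1 -> 3 -> 7 -> 15), so an exact "count == 2 * prev[1]" check would fail, while the 3/4 rule only requires each increment to be larger than three quarters of the previous one. The sketch below replays the assertion on plain numbers. The hunk quoted above does not show how prev[] is updated, so the shift prev[0] = prev[1]; prev[1] = count is an assumption, and growth_ok() and first_failure() are hypothetical names.

```c
#include <stdbool.h>

/*
 * The growth check from test_resize(), with the same truncating integer
 * arithmetic: the new count must exceed the previous count by more than
 * 3/4 of the previous increment.
 */
static bool growth_ok(unsigned int prev0, unsigned int prev1,
		      unsigned int count)
{
	return count > 3 * (prev1 - prev0) / 4 + prev1;
}

/*
 * Replay a sequence of measured counts through the check and return the
 * index of the first count that fails, or -1 if all of them pass.
 * ASSUMPTION: prev[] is shifted after every measurement; the quoted hunk
 * does not show this step.
 */
static int first_failure(const unsigned int *counts, int n)
{
	unsigned int prev[2] = { 0, 0 };

	for (int i = 0; i < n; i++) {
		if (!growth_ok(prev[0], prev[1], counts[i]))
			return i;
		prev[0] = prev[1];
		prev[1] = counts[i];
	}

	return -1;
}
```

With the idealised counts 1, 3, 7, 15, 31 the thresholds are 0, 1, 4, 10, 21, so every step passes with room for the +-1 request of imprecision. A sequence that stops growing, such as 1, 3, 3, is rejected at the third entry.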
-Chris