[Intel-gfx] [RFC i-g-t] igt/gem_ringfill: Adds full ringbuffer preemption test
Chris Wilson
chris at chris-wilson.co.uk
Tue Aug 15 22:35:46 UTC 2017
Quoting Antonio Argenziano (2017-08-15 22:44:21)
> Sending as RFC to get feedback on what would be the correct behaviour of
> the driver and, therefore, if the test is valid.
It's not a preemption specific bug. It is fair to say that any client
blocking any other client over a non-contended resource is an issue.
Skip to end for a really easy way to demonstrate this.
> We do a wait while holding the mutex if we are adding a request and figure
> out that there is no more space in the ring buffer.
> Is that considered a bug?
Yes, but it is one of many priority inversion problems we have because
we have a BKL. Timeframe for fixing it is a few more years.
> +static void wait_batch(int fd, uint32_t handle)
> +{
> + int64_t timeout = 1ull * NSEC_PER_SEC; //1 sec
> +
> + if(gem_wait(fd, handle, &timeout) != 0) {
> + //Force reset and fail the test
> + igt_force_gpu_reset(fd);
Just terminate the spin batches.
> + igt_assert_f(0, "[0x%x] batch did not complete!", handle);
> + }
> +}
> +
> +/*
> + * This test checks that is possible for a high priority request to trigger a
> + * preemption if another context has filled its ringbuffer.
> + * The aim of the test is to make sure that high priority requests cannot be
> + * stalled by low priority tasks.
> + * */
> +static void preempt_while_ringbuffer_full(int fd, uint32_t engine)
> +{
> + uint32_t hp_ctx, lp_ctx;
> + uint32_t hp_batch;
> + igt_spin_t *lp_batch;
> +
> + struct drm_i915_gem_exec_object2 obj[2];
> + struct drm_i915_gem_relocation_entry reloc[1024];
That's a bit excessive for this pi test, no ?
> + struct drm_i915_gem_execbuffer2 execbuf;
> + const unsigned timeout = 10;
> +
> + lp_ctx = gem_context_create(fd);
> + ctx_set_priority(fd, lp_ctx, -MAX_PRIO);
> +
> + hp_ctx = gem_context_create(fd);
> + ctx_set_priority(fd, hp_ctx, MAX_PRIO);
> +
> + igt_require(setup_execbuf(fd, &execbuf, obj, reloc, engine) == 0);
> + execbuf.rsvd1 = lp_ctx;
> +
> + /*Stall execution and fill ring*/
> + lp_batch = igt_spin_batch_new(fd, lp_ctx, engine, 0);
> + igt_fork(child_no, 1) {
> + fill_ring(fd, &execbuf, 0, timeout);
> + }
> +
> + /*We don't know when the ring will be full so keep sending in a loop*/
Yes we do. See measure_ring_size.
static void bind_to_cpu(int cpu)
{
const int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
struct sched_param rt = {.sched_priority = 99 };
cpu_set_t allowed;
igt_assert(sched_setscheduler(getpid(), SCHED_RR | SCHED_RESET_ON_FORK, &rt) == 0);
CPU_ZERO(&allowed);
CPU_SET(cpu % ncpus, &allowed);
igt_assert(sched_setaffinity(getpid(), sizeof(cpu_set_t), &allowed) == 0);
}
setup timer
execbuf.rsvd1 = ctx_lo;
while (__raw_gem_execbuf(fd, &execbuf) == 0)
;
/* steal cpu */
bind_to_cpu(0);
igt_fork(child, 1) {
/* this child is forced to wait for parent to sleep */
execbuf.rsvd1 = ctx_hi;
setup timer;
*success = __raw_gem_execbuf(fd, &execbuf) == 0;
}
setup 2*timer
__raw_gem_execbuf(fd, &execbuf); /* sleeps under mutex, releasing child
*/
igt_terminate_spin_batches();
igt_waitchildren();
igt_assert(*success);
Timer can be safely 10ms.
Similarly:
race set-domain (pretty much any GEM ioctl ends up in set-domain) vs
spin-batch, when successful then try any set-domain ioctl from a second
client and observe that it too is blocked on the first client hogging.
end:
For the purpose of testing, just create a debugfs that acquires
struct_mutex on opening. Then test every ioctl and trap from a second
client.
-Chris
More information about the Intel-gfx
mailing list