[Intel-gfx] [RFC i-g-t] igt/gem_ringfill: Adds full ringbuffer preemption test
Antonio Argenziano
antonio.argenziano at intel.com
Thu Aug 17 15:14:04 UTC 2017
On 15/08/17 15:35, Chris Wilson wrote:
> Quoting Antonio Argenziano (2017-08-15 22:44:21)
>> Sending as RFC to get feedback on what the correct behaviour of the
>> driver should be and, therefore, whether the test is valid.
>
> It's not a preemption specific bug. It is fair to say that any client
> blocking any other client over a non-contended resource is an issue.
> Skip to end for a really easy way to demonstrate this.
Ok, I'll push a patch then.
>
>> When adding a request, if we find there is no more space in the ring
>> buffer, we wait while holding the mutex.
>> Is that considered a bug?
>
> Yes, but it is one of many priority inversion problems we have because
> we have a BKL. Timeframe for fixing it is a few more years.
>
>> +static void wait_batch(int fd, uint32_t handle)
>> +{
>> +	int64_t timeout = 1ull * NSEC_PER_SEC; // 1 sec
>> +
>> +	if (gem_wait(fd, handle, &timeout) != 0) {
>> +		// Force reset and fail the test
>> +		igt_force_gpu_reset(fd);
>
> Just terminate the spin batches.
>
>> +		igt_assert_f(0, "[0x%x] batch did not complete!", handle);
>> +	}
>> +}
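Will do. I guess something along these lines would be enough (untested
sketch, just swapping the forced reset for the igt_terminate_spin_batches()
helper you use in the sketch below):

	static void wait_batch(int fd, uint32_t handle)
	{
		int64_t timeout = 1ull * NSEC_PER_SEC; // 1 sec

		if (gem_wait(fd, handle, &timeout) != 0) {
			// End the spinners so the GPU can drain, then fail.
			igt_terminate_spin_batches();
			igt_assert_f(0, "[0x%x] batch did not complete!", handle);
		}
	}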
>> +
>> +/*
>> + * This test checks that it is possible for a high priority request to
>> + * trigger a preemption if another context has filled its ringbuffer.
>> + * The aim of the test is to make sure that high priority requests cannot be
>> + * stalled by low priority tasks.
>> + */
>> +static void preempt_while_ringbuffer_full(int fd, uint32_t engine)
>> +{
>> +	uint32_t hp_ctx, lp_ctx;
>> +	uint32_t hp_batch;
>> +	igt_spin_t *lp_batch;
>> +
>> +	struct drm_i915_gem_exec_object2 obj[2];
>> +	struct drm_i915_gem_relocation_entry reloc[1024];
>
> That's a bit excessive for this priority inversion test, no?
Just wanted to reuse the utility functions in the test with minimal changes.
>
>> +	struct drm_i915_gem_execbuffer2 execbuf;
>> +	const unsigned timeout = 10;
>> +
>> +	lp_ctx = gem_context_create(fd);
>> +	ctx_set_priority(fd, lp_ctx, -MAX_PRIO);
>> +
>> +	hp_ctx = gem_context_create(fd);
>> +	ctx_set_priority(fd, hp_ctx, MAX_PRIO);
>> +
>> +	igt_require(setup_execbuf(fd, &execbuf, obj, reloc, engine) == 0);
>> +	execbuf.rsvd1 = lp_ctx;
>> +
>> +	/* Stall execution and fill ring */
>> +	lp_batch = igt_spin_batch_new(fd, lp_ctx, engine, 0);
>> +	igt_fork(child_no, 1) {
>> +		fill_ring(fd, &execbuf, 0, timeout);
>> +	}
>> +
>> +	/* We don't know when the ring will be full so keep sending in a loop */
>
> Yes we do. See measure_ring_size.
>
> static void bind_to_cpu(int cpu)
> {
> 	const int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
> 	struct sched_param rt = { .sched_priority = 99 };
> 	cpu_set_t allowed;
>
> 	igt_assert(sched_setscheduler(getpid(), SCHED_RR | SCHED_RESET_ON_FORK, &rt) == 0);
>
> 	CPU_ZERO(&allowed);
> 	CPU_SET(cpu % ncpus, &allowed);
> 	igt_assert(sched_setaffinity(getpid(), sizeof(cpu_set_t), &allowed) == 0);
> }
>
> setup timer
> execbuf.rsvd1 = ctx_lo;
> while (__raw_gem_execbuf(fd, &execbuf) == 0)
> 	;
>
> /* steal cpu */
> bind_to_cpu(0);
> igt_fork(child, 1) {
> 	/* this child is forced to wait for parent to sleep */
> 	execbuf.rsvd1 = ctx_hi;
> 	setup timer;
> 	*success = __raw_gem_execbuf(fd, &execbuf) == 0;
> }
> setup 2*timer
> __raw_gem_execbuf(fd, &execbuf); /* sleeps under mutex, releasing child */
>
> igt_terminate_spin_batches();
> igt_waitchildren();
>
> igt_assert(*success);
>
> Timer can be safely 10ms.
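Just to check I'm reading the sketch right: "setup timer" would be something
like the below, i.e. a SIGALRM armed without SA_RESTART so that the blocking
execbuf ioctl comes back with -EINTR instead of sleeping forever?
(setup_timer()/alarm_handler() are just my guesses at the names)

	// needs <signal.h> and <sys/time.h>
	static void alarm_handler(int sig)
	{
		// Empty on purpose; we only need the blocking ioctl to be
		// interrupted so that it returns -EINTR.
	}

	static void setup_timer(int timeout_ms)
	{
		struct sigaction sa = { .sa_handler = alarm_handler };
		struct itimerval itv = {
			.it_value.tv_sec = timeout_ms / 1000,
			.it_value.tv_usec = (timeout_ms % 1000) * 1000,
		};

		sigaction(SIGALRM, &sa, NULL);      // no SA_RESTART
		setitimer(ITIMER_REAL, &itv, NULL); // one-shot alarm
	}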
Isn't this a bit too complicated? Wouldn't a "keep poking at it for a
while" approach just do the same while being more readable?
-Antonio
>
> Similarly:
>
> race set-domain (pretty much any GEM ioctl ends up in set-domain) against a
> spin-batch; once the first client is blocked, try any set-domain ioctl from
> a second client and observe that it too is blocked on the first client
> hogging the mutex.
>
> end:
> For the purpose of testing, just create a debugfs that acquires
> struct_mutex on opening. Then test every ioctl and trap from a second
> client.
> -Chris
>