[Intel-gfx] [PATCH i-g-t 7/9] tests/perf_pmu: Tests for i915 PMU API

Chris Wilson chris at chris-wilson.co.uk
Tue Oct 10 13:46:23 UTC 2017


Quoting Tvrtko Ursulin (2017-10-10 14:38:10)
> 
> On 10/10/2017 13:37, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-10-10 10:30:06)
> >> +static void
> >> +event_wait(int gem_fd, const struct intel_execution_engine2 *e)
> >> +{
> >> +       struct drm_i915_gem_exec_object2 obj = { };
> >> +       struct drm_i915_gem_execbuffer2 eb = { };
> >> +       data_t data;
> >> +       igt_display_t *display = &data.display;
> >> +       const uint32_t DERRMR = 0x44050;
> >> +       unsigned int valid_tests = 0;
> >> +       uint32_t batch[8], *b;
> >> +       igt_output_t *output;
> >> +       uint32_t bb_handle;
> >> +       uint32_t reg;
> >> +       enum pipe p;
> >> +       int fd;
> >> +
> >> +       igt_require(intel_gen(intel_get_drm_devid(gem_fd)) >= 6);
> >> +       igt_require(intel_register_access_init(intel_get_pci_device(),
> >> +                                              false, gem_fd) == 0);
> >> +
> >> +       /**
> >> +        * We will use the display to render event forwarind so need to
> >> +        * program the DERRMR register and restore it at exit.
> > 
> > DERRMR is always masked until we need it. If you really wanted to
> > preserve the old value, SRM, LRM around the test. Not that fussed, just
> > a general dislike of direct poking of registers.
> 
> Thought I can get away with it since the handle is drm master and also 
> there doesn't seem to be a patch in i915 which touches this register.

Oh, we used to set it around pageflips. Good point about drmMaster.
That's worth a igt_require(igt_set_master()) to document the requirement
that we need it here for SECURE dispatch.

I'm actually a bit concerned that DERRMR isn't ~0u by default. (Though
maybe they only assert the bits that are connected to events).

> >> +        *
> >> +        * We will emit a MI_WAIT_FOR_EVENT listening for vblank events,
> >> +        * have a background helper to indirectly enable vblank irqs, and
> >> +        * listen to the recorded time spent in engine wait state as reported
> >> +        * by the PMU.
> >> +        */
> >> +       reg = intel_register_read(DERRMR);
> >> +
> >> +       kmstest_set_vt_graphics_mode();
> >> +       igt_display_init(&data.display, gem_fd);
> >> +
> >> +       bb_handle = gem_create(gem_fd, 4096);
> >> +
> >> +       b = batch;
> >> +       *b++ = MI_LOAD_REGISTER_IMM;
> >> +       *b++ = DERRMR;
> >> +       *b++ = reg & ~((1 << 3) | (1 << 11) | (1 << 21));
> >> +       *b++ = MI_WAIT_FOR_EVENT | MI_WAIT_FOR_PIPE_A_VBLANK;
> >> +       *b++ = MI_LOAD_REGISTER_IMM;
> >> +       *b++ = DERRMR;
> >> +       *b++ = reg;
> >> +       *b++ = MI_BATCH_BUFFER_END;
> > 
> >> +static int chain_nop(int gem_fd, unsigned long sz, int in_fence, bool sync)
> >> +{
> >> +       struct drm_i915_gem_exec_object2 obj = {};
> >> +       struct drm_i915_gem_execbuffer2 eb =
> >> +               { .buffer_count = 1, .buffers_ptr = (uintptr_t)&obj};
> >> +       const uint32_t bbe = 0xa << 23;
> >> +
> >> +       sz = ALIGN(sz, sizeof(uint32_t));
> >> +
> >> +       obj.handle = gem_create(gem_fd, sz);
> >> +       gem_write(gem_fd, obj.handle, sz - sizeof(bbe), &bbe, sizeof(bbe));
> >> +
> >> +       eb.flags = I915_EXEC_RENDER | I915_EXEC_FENCE_OUT;
> >> +
> >> +       if (in_fence >= 0) {
> >> +               eb.flags |= I915_EXEC_FENCE_IN;
> >> +               eb.rsvd2 = in_fence;
> >> +       }
> >> +
> >> +       gem_execbuf_wr(gem_fd, &eb);
> > 
> > On the same ctx/engine, this shouldn't be generating interrupts between
> > the batches. The fence should be resolved to an i915 request and then we
> > see that the requests are naturally ordered so the fence is elided.
> >> +
> >> +       if (sync)
> >> +               gem_sync(gem_fd, obj.handle);
> > 
> > So it looks like this will remain the only interrupt generator.
> > 
> > If we exported the out-fence and then polled that, that is currently
> > hooked up to an interrupt only path.
> 
> So you think I'm counting ctx switches and other stuff? Maybe we should 
> have separate PMU counters for all the different interrupts. :)

Yeah, each request will be generating a new lite-restore. Hmm, how about
if we guarded that with i915_gem_request_started(). Interesting...
 
> What do you mean by exporting the out fence? Looping it through via 
> something external to hide the source? Not sure how to do it. Dup the 
> fd? Or create a swfence and merge the out fence to it? (Talking from 
> memory here.)

We can just use the poll((struct pfd){fd, POLLIN}, 1, -1). Instead of
"exporting the out_fence" read "having exported the out_fence", i.e. the
sync_file currently forces use of the interrupt (so long as it is not
already completed!).
-Chris


More information about the Intel-gfx mailing list