[Intel-gfx] [PATCH 03/12] drm/i915: Framework for capturing command stream based OA reports and ctx id info.

Kamble, Sagar A sagar.a.kamble at intel.com
Wed Aug 2 04:44:30 UTC 2017


Hi Chis,

Understood the need to handle request reordering. 
Are you suggesting following paths:
1. cs samples list for stream be read based on the order of submission from submit timestamps/OA capture timestamps.
2. put the commands to capture during eb_submit and patch the offset in vma where data is to be captured, populate cs sample list during __i915_gem_request_submit

For preemption, it would then simplify by just discarding the cs sample and relying on corresponding next  __i915_gem_request_submit.

Thanks
Sagar

-----Original Message-----
From: Chris Wilson [mailto:chris at chris-wilson.co.uk] 
Sent: Monday, July 31, 2017 3:42 PM
To: Kamble, Sagar A <sagar.a.kamble at intel.com>; intel-gfx at lists.freedesktop.org
Cc: Sourab Gupta <sourab.gupta at intel.com>
Subject: Re: [Intel-gfx] [PATCH 03/12] drm/i915: Framework for capturing command stream based OA reports and ctx id info.

Quoting Chris Wilson (2017-07-31 09:34:30)
> Quoting Sagar Arun Kamble (2017-07-31 08:59:36)
> > +/**
> > + * i915_perf_stream_emit_sample_capture - Insert the commands to 
> > +capture perf
> > + * metrics into the GPU command stream
> > + * @stream: An i915-perf stream opened for GPU metrics
> > + * @request: request in whose context the metrics are being collected.
> > + * @preallocate: allocate space in ring for related sample.
> > + */
> > +static void i915_perf_stream_emit_sample_capture(
> > +                                       struct i915_perf_stream *stream,
> > +                                       struct drm_i915_gem_request *request,
> > +                                       bool preallocate) {
> > +       struct reservation_object *resv = stream->cs_buffer.vma->resv;
> > +       struct i915_perf_cs_sample *sample;
> > +       unsigned long flags;
> > +       int ret;
> > +
> > +       sample = kzalloc(sizeof(*sample), GFP_KERNEL);
> > +       if (sample == NULL) {
> > +               DRM_ERROR("Perf sample alloc failed\n");
> > +               return;
> > +       }
> > +
> > +       sample->request = i915_gem_request_get(request);
> > +       sample->ctx_id = request->ctx->hw_id;
> > +
> > +       insert_perf_sample(stream, sample);
> > +
> > +       if (stream->sample_flags & SAMPLE_OA_REPORT) {
> > +               ret = i915_emit_oa_report_capture(request,
> > +                                                 preallocate,
> > +                                                 sample->offset);
> > +               if (ret)
> > +                       goto err_unref;
> > +       }
> 
> This is incorrect as the requests may be reordered. You either need to 
> declare the linear ordering of requests through the sample buffer, or 
> we have to delay setting sample->offset until execution, and even then 
> we need to disable preemption when using SAMPLE_OA_REPORT.

Thinking about it, you do need to serialise based on stream->vma, or else where a stream->vma per capture context.
-Chris


More information about the Intel-gfx mailing list