[Intel-gfx] [PATCH v16 08/17] drm/i915/perf: Add OA unit support for Gen 8+

Lionel Landwerlin lionel.g.landwerlin at intel.com
Mon Jun 12 17:33:42 UTC 2017


On 12/06/17 18:23, Matthew Auld wrote:
> On 5 June 2017 at 15:48, Lionel Landwerlin
> <lionel.g.landwerlin at intel.com> wrote:
>> From: Robert Bragg <robert at sixbynine.org>
>>
>> Enables access to OA unit metrics for BDW, CHV, SKL and BXT which all
>> share (more-or-less) the same OA unit design.
>>
>> Of particular note in comparison to Haswell: some OA unit HW config
>> state has become per-context state and as a consequence it is somewhat
>> more complicated to manage synchronous state changes from the cpu while
>> there's no guarantee of what context (if any) is currently actively
>> running on the gpu.
>>
>> The periodic sampling frequency which can be particularly useful for
>> system-wide analysis (as opposed to command stream synchronised
>> MI_REPORT_PERF_COUNT commands) is perhaps the most surprising state to
>> have become per-context save and restored (while the OABUFFER
>> destination is still a shared, system-wide resource).
>>
>> This support for gen8+ takes care to consider a number of timing
>> challenges involved in synchronously updating per-context state
>> primarily by programming all config state from the cpu and updating all
>> current and saved contexts synchronously while the OA unit is still
>> disabled.
>>
>> The driver intentionally avoids depending on command streamer
>> programming to update OA state considering the lack of synchronization
>> between the automatic loading of OACTXCONTROL state (that includes the
>> periodic sampling state and enable state) on context restore and the
>> parsing of any general purpose BB the driver can control. I.e. this
>> implementation is careful to avoid the possibility of a context restore
>> temporarily enabling any out-of-date periodic sampling state. In
>> addition to the risk of transiently-out-of-date state being loaded
>> automatically; there are also internal HW latencies involved in the
>> loading of MUX configurations which would be difficult to account for
>> from the command streamer (and we only want to enable the unit when once
>> the MUX configuration is complete).
>>
>> Since the Gen8+ OA unit design no longer supports clock gating the unit
>> off for a single given context (which effectively stopped any progress
>> of counters while any other context was running) and instead supports
>> tagging OA reports with a context ID for filtering on the CPU, it means
>> we can no longer hide the system-wide progress of counters from a
>> non-privileged application only interested in metrics for its own
>> context. Although we could theoretically try and subtract the progress
>> of other contexts before forwarding reports via read() we aren't in a
>> position to filter reports captured via MI_REPORT_PERF_COUNT commands.
>> As a result, for Gen8+, we always require the
>> dev.i915.perf_stream_paranoid to be unset for any access to OA metrics
>> if not root.
>>
>> v5: Drain submitted requests when enabling metric set to ensure no
>>      lite-restore erases the context image we just updated (Lionel)
>>
>> v6: In addition to drain, switch to kernel context & update all
>>      context in place (Chris)
>>
>> v7: Add missing mutex_unlock() if switching to kernel context fails
>>      (Matthew)
>>
>> v8: Simplify OA period/flex-eu-counters programming by using the
>>      batchbuffer instead of modifying ctx-image (Lionel)
>>
>> v9: Back to updating the context image (due to erroneous testing,
>>      batchbuffer programming the OA unit doesn't actually work)
>>      (Lionel)
>>      Pin context before updating context image (Chris)
>>      Drop MMIO programming now that we switch to a kernel context with
>>      right values in initial context image (Chris)
>>
>> v10: Just pin_map the contexts we want to modify or let the
>>       configuration happen on first use (Chris)
>>
>> v11: Update kernel context OA config through the batchbuffer rather
>>       than on the fly ctx-image update (Lionel)
>>
>> v12: Rework OA context registers update again by swithing away from
>>       user contexts and reconfiguring the kernel context through the
>>       batchbuffer and updating all the other contexts' context image.
>>       Also take care to lock slice/subslice configuration when OA is
>>       on. (Lionel)
>>
>> v13: Request rpcs updates on all engine when updating the OA config
>>       (Lionel)
>>
>> v14: Drop any kind of rpcs management now that we monitor sseu
>>       configuration changes in a later patch (Lionel)
>>       Remove usleep after programming the NOA configs on Gen8+, this
>>       doesn't seem to be needed (Lionel)
>>
>> v15: Respect coding style for block comments (Chris)
>>
>> Signed-off-by: Robert Bragg <robert at sixbynine.org>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>> Reviewed-by: Matthew Auld <matthew.auld at intel.com> \o/
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h  |   46 +-
>>   drivers/gpu/drm/i915/i915_perf.c | 1032 +++++++++++++++++++++++++++++++++++---
>>   drivers/gpu/drm/i915/i915_reg.h  |   22 +
>>   drivers/gpu/drm/i915/intel_lrc.c |    2 +
>>   drivers/gpu/drm/i915/intel_lrc.h |    2 +
>>   include/uapi/drm/i915_drm.h      |   19 +-
>>   6 files changed, 1028 insertions(+), 95 deletions(-)
> <SNIP>
>
>> +
>> +static int gen8_switch_to_updated_kernel_context(struct drm_i915_private *dev_priv)
>> +{
>> +       struct intel_engine_cs *engine = dev_priv->engine[RCS];
>> +       struct i915_gem_timeline *timeline;
>> +       struct drm_i915_gem_request *req;
>> +       int ret;
>> +
>> +       lockdep_assert_held(&dev_priv->drm.struct_mutex);
>> +
>> +       i915_gem_retire_requests(dev_priv);
>> +
>> +       req = i915_gem_request_alloc(engine, dev_priv->kernel_context);
>> +       if (IS_ERR(req))
>> +               return PTR_ERR(req);
>> +
>> +       ret = gen8_emit_oa_config(req);
>> +       if (ret)
>> +               return ret;
> If we do request_alloc, aren't we always supposed to follow up with an
> add_request, otherwise the kernel isn't tracking it?
>
Again you're right! Thanks for spotting this.


More information about the Intel-gfx mailing list