[Intel-gfx] [PATCH v16 08/17] drm/i915/perf: Add OA unit support for Gen 8+
Matthew Auld
matthew.william.auld at gmail.com
Mon Jun 12 14:02:02 UTC 2017
On 5 June 2017 at 15:48, Lionel Landwerlin
<lionel.g.landwerlin at intel.com> wrote:
> From: Robert Bragg <robert at sixbynine.org>
>
> Enables access to OA unit metrics for BDW, CHV, SKL and BXT which all
> share (more-or-less) the same OA unit design.
>
> Of particular note in comparison to Haswell: some OA unit HW config
> state has become per-context state and as a consequence it is somewhat
> more complicated to manage synchronous state changes from the cpu while
> there's no guarantee of what context (if any) is currently actively
> running on the gpu.
>
> The periodic sampling frequency which can be particularly useful for
> system-wide analysis (as opposed to command stream synchronised
> MI_REPORT_PERF_COUNT commands) is perhaps the most surprising state to
> have become per-context save and restored (while the OABUFFER
> destination is still a shared, system-wide resource).
>
> This support for gen8+ takes care to consider a number of timing
> challenges involved in synchronously updating per-context state
> primarily by programming all config state from the cpu and updating all
> current and saved contexts synchronously while the OA unit is still
> disabled.
>
> The driver intentionally avoids depending on command streamer
> programming to update OA state considering the lack of synchronization
> between the automatic loading of OACTXCONTROL state (that includes the
> periodic sampling state and enable state) on context restore and the
> parsing of any general purpose BB the driver can control. I.e. this
> implementation is careful to avoid the possibility of a context restore
> temporarily enabling any out-of-date periodic sampling state. In
> addition to the risk of transiently-out-of-date state being loaded
> automatically; there are also internal HW latencies involved in the
> loading of MUX configurations which would be difficult to account for
> from the command streamer (and we only want to enable the unit when once
> the MUX configuration is complete).
>
> Since the Gen8+ OA unit design no longer supports clock gating the unit
> off for a single given context (which effectively stopped any progress
> of counters while any other context was running) and instead supports
> tagging OA reports with a context ID for filtering on the CPU, it means
> we can no longer hide the system-wide progress of counters from a
> non-privileged application only interested in metrics for its own
> context. Although we could theoretically try and subtract the progress
> of other contexts before forwarding reports via read() we aren't in a
> position to filter reports captured via MI_REPORT_PERF_COUNT commands.
> As a result, for Gen8+, we always require the
> dev.i915.perf_stream_paranoid to be unset for any access to OA metrics
> if not root.
>
> v5: Drain submitted requests when enabling metric set to ensure no
> lite-restore erases the context image we just updated (Lionel)
>
> v6: In addition to drain, switch to kernel context & update all
> context in place (Chris)
>
> v7: Add missing mutex_unlock() if switching to kernel context fails
> (Matthew)
>
> v8: Simplify OA period/flex-eu-counters programming by using the
> batchbuffer instead of modifying ctx-image (Lionel)
>
> v9: Back to updating the context image (due to erroneous testing,
> batchbuffer programming the OA unit doesn't actually work)
> (Lionel)
> Pin context before updating context image (Chris)
> Drop MMIO programming now that we switch to a kernel context with
> right values in initial context image (Chris)
>
> v10: Just pin_map the contexts we want to modify or let the
> configuration happen on first use (Chris)
>
> v11: Update kernel context OA config through the batchbuffer rather
> than on the fly ctx-image update (Lionel)
>
> v12: Rework OA context registers update again by swithing away from
> user contexts and reconfiguring the kernel context through the
> batchbuffer and updating all the other contexts' context image.
> Also take care to lock slice/subslice configuration when OA is
> on. (Lionel)
>
> v13: Request rpcs updates on all engine when updating the OA config
> (Lionel)
>
> v14: Drop any kind of rpcs management now that we monitor sseu
> configuration changes in a later patch (Lionel)
> Remove usleep after programming the NOA configs on Gen8+, this
> doesn't seem to be needed (Lionel)
>
> v15: Respect coding style for block comments (Chris)
>
> Signed-off-by: Robert Bragg <robert at sixbynine.org>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
> Reviewed-by: Matthew Auld <matthew.auld at intel.com> \o/
> ---
<SNIP>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index 52b3a1fd4059..eb7bb51c0a4a 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -62,6 +62,8 @@ enum {
> INTEL_CONTEXT_SCHEDULE_OUT,
> };
>
> +struct sseu_dev_info;
> +
Seems bogus, no?
More information about the Intel-gfx
mailing list