<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Nov 8, 2016 at 6:19 AM, sourab gupta <span dir="ltr"><<a href="mailto:sourab.gupta@intel.com" target="_blank">sourab.gupta@intel.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Mon, 2016-11-07 at 11:49 -0800, Robert Bragg wrote:<br>
> The maximum OA sampling frequency is now configurable via a<br>
> dev.i915.oa_max_sample_rate sysctl parameter.<br>
><br>
> Following the precedent set by perf's similar<br>
> kernel.perf_event_max_sample_<wbr>rate the default maximum rate is 100000Hz<br>
><br>
> Signed-off-by: Robert Bragg <<a href="mailto:robert@sixbynine.org">robert@sixbynine.org</a>><br>
> ---<br>
> drivers/gpu/drm/i915/i915_<wbr>perf.c | 61 ++++++++++++++++++++++++++++++<wbr>++--------<br>
> 1 file changed, 50 insertions(+), 11 deletions(-)<br>
><br>
> diff --git a/drivers/gpu/drm/i915/i915_<wbr>perf.c b/drivers/gpu/drm/i915/i915_<wbr>perf.c<br>
> index e51c1d8..1a87fe9 100644<br>
> --- a/drivers/gpu/drm/i915/i915_<wbr>perf.c<br>
> +++ b/drivers/gpu/drm/i915/i915_<wbr>perf.c<br>
> @@ -82,6 +82,21 @@ static u32 i915_perf_stream_paranoid = true;<br>
> #define INVALID_CTX_ID 0xffffffff<br>
><br>
><br>
> +/* For sysctl proc_dointvec_minmax of i915_oa_max_sample_rate<br>
> + *<br>
> + * 160ns is the smallest sampling period we can theoretically program the OA<br>
> + * unit with on Haswell, corresponding to 6.25MHz.<br>
> + */<br>
> +static int oa_sample_rate_hard_limit = 6250000;<br>
</span>There's no check for 'oa_sample_rate_hard_limit' anywhere below.<br></blockquote><div><br></div><div>It's in the struct ctl_table oa_table[] declaration of the "oa_max_sample_rate" paramater, assigned to .extra2 which is referenced by the proc_dointvec_minmax validation handler for the parameter.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div class="h5"><br>
> +<br>
> +/* Theoretically we can program the OA unit to sample every 160ns but don't<br>
> + * allow that by default unless root...<br>
> + *<br>
> + * The default threshold of 100000Hz is based on perf's similar<br>
> + * kernel.perf_event_max_sample_<wbr>rate sysctl parameter.<br>
> + */<br>
> +static u32 i915_oa_max_sample_rate = 100000;<br>
> +<br>
> /* XXX: beware if future OA HW adds new report formats that the current<br>
> * code assumes all reports have a power-of-two size and ~(size - 1) can<br>
> * be used as a mask to align the OA tail pointer.<br>
> @@ -1314,6 +1329,7 @@ static int read_properties_unlocked(<wbr>struct drm_i915_private *dev_priv,<br>
> }<br>
><br>
> for (i = 0; i < n_props; i++) {<br>
> + u64 oa_period, oa_freq_hz;<br>
> u64 id, value;<br>
> int ret;<br>
><br>
> @@ -1359,21 +1375,35 @@ static int read_properties_unlocked(<wbr>struct drm_i915_private *dev_priv,<br>
> return -EINVAL;<br>
> }<br>
><br>
> - /* NB: The exponent represents a period as follows:<br>
> - *<br>
> - * 80ns * 2^(period_exponent + 1)<br>
> - *<br>
> - * Theoretically we can program the OA unit to sample<br>
> + /* Theoretically we can program the OA unit to sample<br>
> * every 160ns but don't allow that by default unless<br>
> * root.<br>
> *<br>
> - * Referring to perf's<br>
> - * kernel.perf_event_max_sample_<wbr>rate for a precedent<br>
> - * (100000 by default); with an OA exponent of 6 we get<br>
> - * a period of 10.240 microseconds -just under 100000Hz<br>
> + * On Haswell the period is derived from the exponent<br>
> + * as:<br>
> + *<br>
> + * period = 80ns * 2^(exponent + 1)<br>
> + */<br>
> + BUILD_BUG_ON(sizeof(oa_period) != 8);<br>
> + oa_period = 80ull * (2ull << value);<br>
</div></div>I assume now that there'll be a platform specific check for 80ull, while<br>
programming oa_period, for subquent Gen8+ platforms, which should be<br>
fine.<br></blockquote><div><br></div><div>Yeah, this code will need adapting for gen9+. I guess we'll change it to work in terms of ((2<<exp) * NSEC_PER_SEC) / timestamp_frequency.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="HOEnZb"><div class="h5"><br>
> +<br>
> + /* This check is primarily to ensure that oa_period <=<br>
> + * UINT32_MAX (before passing to do_div which only<br>
> + * accepts a u32 denominator), but we can also skip<br>
> + * checking anything < 1Hz which implicitly can't be<br>
> + * limited via an integer oa_max_sample_rate.<br>
> */<br>
> - if (value < 6 && !capable(CAP_SYS_ADMIN)) {<br>
> - DRM_ERROR("Minimum OA sampling exponent is 6 without root privileges\n");<br>
> + if (oa_period <= NSEC_PER_SEC) {<br>
> + u64 tmp = NSEC_PER_SEC;<br>
> + do_div(tmp, oa_period);<br>
> + oa_freq_hz = tmp;<br>
> + } else<br>
> + oa_freq_hz = 0;<br>
> +<br>
> + if (oa_freq_hz > i915_oa_max_sample_rate &&<br>
> + !capable(CAP_SYS_ADMIN)) {<br>
> + DRM_ERROR("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without root privileges\n",<br>
> + i915_oa_max_sample_rate);<br>
> return -EACCES;<br>
> }<br>
><br>
> @@ -1481,6 +1511,15 @@ static struct ctl_table oa_table[] = {<br>
> .extra1 = &zero,<br>
> .extra2 = &one,<br>
> },<br>
> + {<br>
> + .procname = "oa_max_sample_rate",<br>
> + .data = &i915_oa_max_sample_rate,<br>
> + .maxlen = sizeof(i915_oa_max_sample_<wbr>rate),<br>
> + .mode = 0644,<br>
> + .proc_handler = proc_dointvec_minmax,<br>
> + .extra1 = &zero,<br>
> + .extra2 = &oa_sample_rate_hard_limit,<br>
> + },<br>
> {}<br>
> };<br>
><br>
<br>
<br>
</div></div></blockquote></div><br></div></div>