[Intel-gfx] [RFC] drm/i915: Add a new modparam for customized ring multiplier

Rogozhkin, Dmitry V dmitry.v.rogozhkin at intel.com
Tue Dec 26 17:39:08 UTC 2017


>> To clarify, the HW will flip between the two GT/IA requests rather than stick to the highest?

Yes, it will flip on Gen9. On Gen8 there was some mechanism (HW) which flattened that. But it was removed/substituted in Gen9. In Gen10 it was tuned  to close the mentioned issue.

>> Do you know anything about the opposite position. I heard a suggestion that simply increasing the ringfreq universally caused thermal throttling in some other workloads. Do you have any knowledge of those?

Initially we tried to just increase GT multiplier to x3 and stepped into the throttling. Thus, we introduced parameter to be able to mitigate all that depending on the SKU and user needs. I definitely asked what will be if GT request will be bigger than IA request. But that was a year ago and I don't remember the answer. Let me ask again. I will mail back in few days.

>> You are thinking of plugging into intel_pstate to make it smarter for ia freq transitions?

Yep. This seems a correct step to give some automatic support instead of parameter/hardcoded multiplier.

Dmitry.

-----Original Message-----
From: Chris Wilson [mailto:chris at chris-wilson.co.uk] 
Sent: Tuesday, December 26, 2017 8:59 AM
To: Rogozhkin, Dmitry V <dmitry.v.rogozhkin at intel.com>; Li, Yaodong <yaodong.li at intel.com>; intel-gfx at lists.freedesktop.org
Cc: Gong, Zhipeng <zhipeng.gong at intel.com>; Widawsky, Benjamin <benjamin.widawsky at intel.com>; Mateo Lozano, Oscar <oscar.mateo at intel.com>; Kamble, Sagar A <sagar.a.kamble at intel.com>; Li, Yaodong <yaodong.li at intel.com>
Subject: RE: [RFC] drm/i915: Add a new modparam for customized ring multiplier

Quoting Rogozhkin, Dmitry V (2017-12-26 16:39:23)
> Clarification on the issue. Consider that you have a massive load on GT and just tiny one on IA. If GT will program the RING frequency to be lower than IA frequency, then you will fall into the situation when RING frequency constantly transits from GT to IA level and back. Each transition of a RING frequency is a full system stall. If you will have "good" transition rate with few transitions per few milliseconds you will lose ~10% of performance. That's the case for media workloads when you easily can step into this since 1) media utilizes few GPU engines and with few parallel workloads you can make sure that at least 1 engine is _always_ doing something, 2) media BB are relatively small, so you have regular wakeups of the IA to manage requests. This will affect Gen9 platforms due to HW design change (we've spot this in SKL). This will not happen in Gen8 (old HW design). This will be fixed in Gen10+ (CNL+).

To clarify, the HW will flip between the two GT/IA requests rather than stick to the highest? Iirc, the expectation was that we were setting a requested minimum frequency for the ring/ia based off the gpu freq.

> On SKL we ran into this with the GPU frequency pinned to 700MHz, CPU to 2GHz. Multipliers were x2 for GT, x1 for IA.

Basically, with the GPU clocked to mid frequency, memory throughput is insufficient to keep the fixed functions occupied, and you need to increase the ring frequency. Is there ever a case where we don't need max ring frequency? (Perhaps we still need to set low frequency for GT
idle?) I guess media is more susceptible to this as that workload should be sustainable at reduced clocks, GL et al are much more likely to keep the clocks ramped all the way up.

Do you know anything about the opposite position. I heard a suggestion that simply increasing the ringfreq universally caused thermal throttling in some other workloads. Do you have any knowledge of those?
 
> So, effectively, what we need to do is to make sure that RING frequency request from GT is _not_ below the request from IA. If IA requests 2GHz, we can't request 1.4GHz, we need request at least 2GHz. Multiplier patch was intended to do exactly that, but manually. Can  we somehow automate that managing IA frequency requests to the RING?

You are thinking of plugging into intel_pstate to make it smarter for ia freq transitions? That seems possible, certainly. I'm not sure if the ring frequency is actually poked from anywhere else in the kernel, would be interesting to find out.
-Chris


More information about the Intel-gfx mailing list