[Intel-gfx] [PATCH 15/17] drm/i915: Increase GuC log buffer size to reduce flush interrupts

Fri Jul 15 15:07:10 UTC 2016

On 15/07/16 15:42, Goel, Akash wrote:
> On 7/15/2016 5:27 PM, Tvrtko Ursulin wrote:
>>
>> On 10/07/16 14:41, akash.goel at intel.com wrote:
>>> From: Akash Goel <akash.goel at intel.com>
>>>
>>> In cases where GuC generates logs at a very high rate, the rate of
>>> flush interrupts is correspondingly very high.
>>> So far a total of 8 pages were allocated for storing both ISR & DPC logs.
>>> As per the half-full draining protocol followed by GuC, doubling
>>> the number of pages cuts the frequency of flush interrupts down
>>> to almost half, which then helps in reducing the logging overhead.
>>> So now allocate 8 pages apiece for ISR & DPC logs.
>>>
>>> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Signed-off-by: Akash Goel <akash.goel at intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/intel_guc_fwif.h | 8 ++++----
>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>> index 1de6928..7521ed5 100644
>>> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
>>> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
>>> @@ -104,9 +104,9 @@
>>>   #define   GUC_LOG_ALLOC_IN_MEGABYTE    (1 << 3)
>>>   #define   GUC_LOG_CRASH_PAGES        1
>>>   #define   GUC_LOG_CRASH_SHIFT        4
>>> -#define   GUC_LOG_DPC_PAGES        3
>>> +#define   GUC_LOG_DPC_PAGES        7
>>>   #define   GUC_LOG_DPC_SHIFT        6
>>> -#define   GUC_LOG_ISR_PAGES        3
>>> +#define   GUC_LOG_ISR_PAGES        7
>>>   #define   GUC_LOG_ISR_SHIFT        9
>>>   #define   GUC_LOG_BUF_ADDR_SHIFT    12
>>>
>>> @@ -436,9 +436,9 @@ enum guc_log_buffer_type {
>>>    *        |   Crash dump state header     |
>>>    * Page1  +-------------------------------+
>>>    *        |           ISR logs            |
>>> - * Page5  +-------------------------------+
>>> - *        |           DPC logs            |
>>>    * Page9  +-------------------------------+
>>> + *        |           DPC logs            |
>>> + * Page17 +-------------------------------+
>>>    *        |         Crash Dump logs       |
>>>    *        +-------------------------------+
>>>    *
>>>
>>
>> I don't mind - but does it help? And how much and for what? Haven't you
>> later found that the uncached reads were the main issue?
> This change, along with the kthread patch, helped reduce the overflow
> counts and even eliminated them for some benchmarks.
>
> The impending optimization for uncached reads should bring further
> improvements, but in my view, notwithstanding the improvement in the
> overflow count, it is still a better configuration to work with, as the
> flush interrupt frequency is cut down to half and I am not able to see
> any apparent downsides to it.

I was primarily thinking of going with the minimal and simplest set of
patches needed to implement the feature.

The logic was that apparently none of the smart and complex optimisations
managed to solve the dropped interrupt issue, until the slowness of the
uncached read was discovered to be the real/main issue.

So it seems that speeding up the uncached read is something that
definitely needs to be implemented. (Whether or not it will be possible
to use SSE instructions to do the read, I don't know.)

Assuming it is possible, then the question is whether there is a need for
all the other optimisations. I.e. do we need the kthread with rtprio, or
would a simple worker be enough? Do we need the new i915 param for
tweaking the relay sub-buffers? Do we need the increase of the log
buffer size? The extra patch to do smarter reads?

If the dropped interrupts do not occur even without any of these extra
patches applied, then we could afford not to bother with them for now.
That would make the series shorter, the review easier, and get the
feature in quicker.

Or maybe we do need all the advanced stuff, I don't know, I am just 
asking the question and would like to see some data.

Regards,

Tvrtko