[Intel-gfx] [PATCH 04/11] drm/i915: Support for GuC interrupts

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Tue Jun 28 13:44:54 UTC 2016


On 28/06/16 12:12, Goel, Akash wrote:
>
>
> On 6/28/2016 3:33 PM, Tvrtko Ursulin wrote:
>>
>> On 27/06/16 13:16, akash.goel at intel.com wrote:
>>> From: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>>
>>> There are certain types of interrupts which Host can receive from GuC.
>>> GuC ukernel sends an interrupt to Host for certain events, for
>>> example to retrieve/consume the logs generated by the ukernel.
>>> This patch adds support to receive interrupts from GuC but currently
>>> enables & partially handles only the interrupt sent by GuC ukernel.
>>> Future patches will add support for handling other interrupt types.
>>>
>>> v2: Use common low level routines for PM IER/IIR programming (Chris)
>>>      Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
>>>      Replace disabling of wake ref asserts with rpm get/put (Chris)
>>>
>>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>> Signed-off-by: Akash Goel <akash.goel at intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_drv.h            |  1 +
>>>   drivers/gpu/drm/i915/i915_guc_submission.c |  5 ++
>>>   drivers/gpu/drm/i915/i915_irq.c            | 95 ++++++++++++++++++++++++++++--
>>>   drivers/gpu/drm/i915/i915_reg.h            | 11 ++++
>>>   drivers/gpu/drm/i915/intel_drv.h           |  3 +
>>>   drivers/gpu/drm/i915/intel_guc.h           |  5 ++
>>>   drivers/gpu/drm/i915/intel_guc_loader.c    |  4 ++
>>>   7 files changed, 120 insertions(+), 4 deletions(-)
>>> [...]
>>> +static void gen9_guc2host_events_work(struct work_struct *work)
>>> +{
>>> +    struct drm_i915_private *dev_priv =
>>> +        container_of(work, struct drm_i915_private, guc.events_work);
>>> +
>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>> +    /* Speed up work cancelation during disabling guc interrupts. */
>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>> +        spin_unlock_irq(&dev_priv->irq_lock);
>>> +        return;
>>> +    }
>>> +
>>> +    /* Though this work item gets synced during rpm suspend, we still
>>> +     * need an rpm get/put to avoid the warning, as it could get
>>> +     * executed in a window where the rpm ref count has dropped to
>>> +     * zero but rpm suspend has not kicked in. Generally the device is
>>> +     * expected to be active at this time, so the get/put should be
>>> +     * really quick.
>>> +     */
>>> +    intel_runtime_pm_get(dev_priv);
>>> +
>>> +    gen6_enable_pm_irq(dev_priv, GEN9_GUC_TO_HOST_INT_EVENT);
>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>> +
>>> +    /* TODO: Handle the events for which GuC interrupted host */
>>> +
>>> +    intel_runtime_pm_put(dev_priv);
>>> +}
>>>
>>>   /**
>>>    * ivybridge_parity_work - Workqueue called when a parity error interrupt
>>> @@ -1371,11 +1435,13 @@ static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv,
>>>               DRM_ERROR("The master control interrupt lied (GT3)!\n");
>>>       }
>>>
>>> -    if (master_ctl & GEN8_GT_PM_IRQ) {
>>> +    if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
>>>           gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2));
>>> -        if (gt_iir[2] & dev_priv->pm_rps_events) {
>>> +        if (gt_iir[2] & (dev_priv->pm_rps_events |
>>> +                 dev_priv->guc_events)) {
>>>               I915_WRITE_FW(GEN8_GT_IIR(2),
>>> -                      gt_iir[2] & dev_priv->pm_rps_events);
>>> +                      gt_iir[2] & (dev_priv->pm_rps_events |
>>> +                           dev_priv->guc_events));
>>>               ret = IRQ_HANDLED;
>>>           } else
>>>               DRM_ERROR("The master control interrupt lied (PM)!\n");
>>> @@ -1407,6 +1473,9 @@ static void gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
>>>
>>>       if (gt_iir[2] & dev_priv->pm_rps_events)
>>>           gen6_rps_irq_handler(dev_priv, gt_iir[2]);
>>> +
>>> +    if (gt_iir[2] & dev_priv->guc_events)
>>> +        gen9_guc_irq_handler(dev_priv, gt_iir[2]);
>>>   }
>>>
>>>   static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
>>> @@ -1653,6 +1722,20 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir)
>>>       }
>>>   }
>>>
>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
>>> +{
>>> +    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
>>> +        spin_lock(&dev_priv->irq_lock);
>>> +        if (dev_priv->guc.interrupts_enabled) {
>>
>> So it is expected that interrupts will always be enabled when
>> i915.guc_log_level is set, correct?
>>
> Yes, currently the interrupt should be enabled only when guc_log_level > 0.
>
> But we need to disable/enable the interrupt upon suspend/resume and
> across GPU reset.
> So the interrupt may not always be in an enabled state when guc_log_level > 0.
>
>> Also, do you need to check against dev_priv->guc.interrupts_enabled at
>> all then? Or, from the opposite angle, would you instead need to log the
>> fact that an unexpected interrupt was received here?
>
> I think this check is needed to avoid a race when disabling the interrupt.
> Please refer to the sequence in the interrupt disabling function (same as
> for rps disabling): there we first set the interrupts_enabled flag to
> false, then wait for the work item to finish execution, and only then
> program the IMR register.

Right, I see now that it is a copy-paste of the existing sequence. In this
case I won't question it further. :)
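
For reference, the disable sequence being described would look something
like this (just a sketch, modeled on the existing gen6 rps disable path;
gen9_disable_guc_interrupts is an assumed name, while guc.events_work and
guc.interrupts_enabled are the fields from this patch):

    static void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
    {
        spin_lock_irq(&dev_priv->irq_lock);
        /* Both the irq handler and the work item check this flag, so
         * clearing it first stops them racing with the disable. */
        dev_priv->guc.interrupts_enabled = false;
        spin_unlock_irq(&dev_priv->irq_lock);

        /* Wait for an already queued work item to finish execution. */
        cancel_work_sync(&dev_priv->guc.events_work);

        /* Only now mask the interrupt in the IMR. */
        spin_lock_irq(&dev_priv->irq_lock);
        gen6_disable_pm_irq(dev_priv, GEN9_GUC_TO_HOST_INT_EVENT);
        spin_unlock_irq(&dev_priv->irq_lock);
    }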

>>
>>> +            /* Process all the GuC to Host events in bottom half */
>>> +            gen6_disable_pm_irq(dev_priv,
>>> +                GEN9_GUC_TO_HOST_INT_EVENT);
>>
>> Why is it important to disable the interrupt here? Not for the
>> queue_work, I think.
>
> We want to (and can) handle only one interrupt at a time; until the
> queued work item has executed we can't process the next interrupt, so it
> is better to keep the interrupt masked.
> Sorry, this is my understanding.

So it is queued in hardware and will get asserted when unmasked?

>
>>
>> Also, is it safe with regards to potentially losing the interrupt?
>>
> Particularly for the FLUSH_LOG_BUFFER case, GuC won't send a new flush
> interrupt unless it gets an acknowledgement (flush signal) for the
> previous one from the Host.

Ah, so the scenario from my previous comment is actually impossible? I
mean the need to mask?

Possibly just put a comment up there explaining that.
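
Something along these lines in gen9_guc_irq_handler(), perhaps (the
wording is only a suggestion, based on your explanation above):

    if (dev_priv->guc.interrupts_enabled) {
        /* Process all the GuC to Host events in the bottom half. Mask
         * the event until the work item has run: GuC will not send a
         * new flush interrupt until the Host has acknowledged (flushed)
         * the previous one, so no interrupt is lost by masking here.
         */
        gen6_disable_pm_irq(dev_priv, GEN9_GUC_TO_HOST_INT_EVENT);
        queue_work(dev_priv->wq, &dev_priv->guc.events_work);
    }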

>
>>> +            queue_work(dev_priv->wq, &dev_priv->guc.events_work);
>>
>> Because dev_priv->wq is a one-at-a-time, in-order wq, if something else
>> is running on it and taking time, could that also cause an interrupt to
>> be dropped, or the flush signal to be sent late to the GuC, and so some
>> logs to be lost?
>
> It's the driver's private workqueue, and the turbo work item, which also
> needs to be executed without much delay, is queued on it too.
> But yes, the flush work item can get substantially delayed if there are
> other work items queued before it, especially mm.retire_work (though that
> generally executes only every ~1 second).
>
> Best would be if the log buffer (44KB of data) could be sampled in IRQ
> context (or tasklet context) itself.

I was just trying to understand whether you perhaps need a dedicated wq. I
don't have a feel at all for how much data GuC logging generates per
second. If the interrupt is low frequency even with a lot of command
submission happening, it could be fine as it is.
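
If a dedicated queue did turn out to be needed, it would be a small
change, something like this (a sketch only; the guc.log_wq field is made
up for illustration):

    /* At GuC init time: */
    dev_priv->guc.log_wq = alloc_ordered_workqueue("i915-guc_log",
                                                   WQ_HIGHPRI | WQ_FREEZABLE);
    if (!dev_priv->guc.log_wq)
        return -ENOMEM;

    /* The irq handler would then queue onto it instead of dev_priv->wq: */
    queue_work(dev_priv->guc.log_wq, &dev_priv->guc.events_work);

    /* ... and destroy_workqueue(dev_priv->guc.log_wq) at fini time. */

An ordered queue keeps the one-at-a-time semantics the flush handshake
relies on, without sharing the queue with retire_work and the turbo work
item.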

Regards,

Tvrtko

