Fwd: [PATCH v1] drm/i915/guc: Always disable interrupt ahead of synchronize_irq
Dong, Zhanjun
zhanjun.dong at intel.com
Mon Feb 3 23:37:10 UTC 2025
Just found my previous response click on "reply", not the "reply all",
so add Cc list.
Regards,
Zhanjun Dong
-------- Forwarded Message --------
Subject: Re: [PATCH v1] drm/i915/guc: Always disable interrupt ahead of
synchronize_irq
Date: Mon, 27 Jan 2025 17:17:33 -0500
From: Dong, Zhanjun <zhanjun.dong at intel.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
Please see my comments below.
Daniele raised a good point on GuC interrupt, I will dig deeper for more
info.
Regards,
Zhanjun Dong
On 2025-01-27 12:12 p.m., Daniele Ceraolo Spurio wrote:
>
>
>
> On 1/23/2025 8:23 AM, Zhanjun Dong wrote:
>> The purpose of synchronize_irq is to wait for any pending IRQ handlers
>> for the
>> interrupt to complete, if synchronize_irq called before interrupt
>> disabled, an
>> tiny timing window created, where no more pending IRQ, but interrupt not
>> disabled yet. Meanwhile, if the interrupt event happened in this
>> timing window,
>> an unexpected IRQ handling will be triggered.
>>
>> Fixed by always disable interrupt ahead of synchronize_irq.
>>
>> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13454
>> Fixes: 26705e20752a ("drm/i915: Support for GuC interrupts")
>> Fixes: 54c52a841250 ("drm/i915/guc: Correctly handle GuC interrupts on
>> Gen11")
>> Fixes: 2ae096872a2c ("drm/i915/pxp: Implement PXP irq handler")
>> Fixes: 3e7abf814193 ("drm/i915: Extract GT render power state
>> management")
>>
>> Signed-off-by: Zhanjun Dong <zhanjun.dong at intel.com>
>>
>> ---
>> Cc: Alan Previn <alan.previn.teres.alexis at intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
>> Cc: Andi Shyti <andi.shyti at intel.com>
>> ---
>> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +--
>> drivers/gpu/drm/i915/gt/uc/intel_guc.c | 4 ++--
>> drivers/gpu/drm/i915/pxp/intel_pxp_irq.c | 2 +-
>> 3 files changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/
>> i915/gt/intel_rps.c
>> index fa304ea088e4..0fe7a8d7f460 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
>> @@ -244,8 +244,8 @@ static void rps_disable_interrupts(struct
>> intel_rps *rps)
>> gen6_gt_pm_disable_irq(gt, GEN6_PM_RPS_EVENTS);
>> spin_unlock_irq(gt->irq_lock);
>> + rps_reset_interrupts(rps);
>> intel_synchronize_irq(gt->i915);
>
> I don't think this is an issue, because we set the irq mask in
> gen6_gt_pm_disable_irq, so there is no chance of getting any new
> interrupts after that. Not saying that we shouldn't do the re-order, but
> we don't need a fixes tag for this.
Good point, the interrupt already disabled by gen6_gt_pm_disable_irq,
the rps_reset_interrupts is just clear interrupt bits, so no risk here,
therefor no issue at all.
Anyway, re-order is still a good practice, the interrupt bits set/and
clear is all about the interrupt bits, these 2 actions could be consider
logically tight relationship, the intel_synchronize_irq considered to be
the next step.
So fixes tag should be removed. As not an issue fix, this work could be
done separately.
>
>> -
>> /*
>> * Now that we will not be generating any more work, flush any
>> * outstanding tasks. As we are called on the RPS idle path,
>> @@ -254,7 +254,6 @@ static void rps_disable_interrupts(struct
>> intel_rps *rps)
>> */
>> cancel_work_sync(&rps->work);
>> - rps_reset_interrupts(rps);
>> GT_TRACE(gt, "interrupts:off\n");
>> }
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/
>> i915/gt/uc/intel_guc.c
>> index 5949ff0b0161..3e7b2c6cdca4 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
>> @@ -116,9 +116,9 @@ static void gen9_disable_guc_interrupts(struct
>> intel_guc *guc)
>> gen6_gt_pm_disable_irq(gt, gt->pm_guc_events);
>> spin_unlock_irq(gt->irq_lock);
>> - intel_synchronize_irq(gt->i915);
>> gen9_reset_guc_interrupts(guc);
>> + intel_synchronize_irq(gt->i915);
>
> Same as above with gen6_gt_pm_disable_irq
>
>> }
>> static bool __gen11_reset_guc_interrupts(struct intel_gt *gt)
>> @@ -154,9 +154,9 @@ static void gen11_disable_guc_interrupts(struct
>> intel_guc *guc)
>> struct intel_gt *gt = guc_to_gt(guc);
>> guc->interrupts.enabled = false;
>> - intel_synchronize_irq(gt->i915);
>> gen11_reset_guc_interrupts(guc);
>> + intel_synchronize_irq(gt->i915);
>
> No early disabling here, but I don't think this change helps either.
> AFAICS gen11_reset_guc_interrupts() only calls gen11_gt_reset_one_iir(),
> which just clears the IIR bits for the GuC. There are no changes in
> interrupt enable/mask status, so interrupts can still be generated. The
> way interrupts are stopped for gen11+ is by setting guc-
> >interrupts.enabled, because that's checked from both guc_irq_handler()
> and intel_guc_to_host_event_handler(), so any new interrupts generated
> after we set that should be immediately dropped.
Good point! Because there are no changes in interrupt enable/mask
status, guc interrupt won't stops, therefor synchronize irq won't make
changes.
Let me dig deeper and get back to you once I have more info.
>
>> }
>> static void guc_dead_worker_func(struct work_struct *w)
>> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c b/drivers/gpu/
>> drm/i915/pxp/intel_pxp_irq.c
>> index d81750b9bdda..b82a667e7ac0 100644
>> --- a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
>> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
>> @@ -101,9 +101,9 @@ void intel_pxp_irq_disable(struct intel_pxp *pxp)
>> __pxp_set_interrupts(gt, 0);
>> spin_unlock_irq(gt->irq_lock);
>> - intel_synchronize_irq(gt->i915);
>> pxp_irq_reset(gt);
>> + intel_synchronize_irq(gt->i915);
>
> Again not a bug here, __pxp_set_interrupts is doing the interrupts
> disabling and that is already happening before intel_synchronize_irq().
Same to above rps case, not a bug.
>
> Daniele
>
>> flush_work(&pxp->session_work);
>> }
>
More information about the Intel-gfx
mailing list