[RFC 7/9] drm/xe/gt_tlb_invalidation_ggtt: Call xe_force_wake_put if xe_force_wake_get succeds

Rodrigo Vivi rodrigo.vivi at intel.com
Fri Sep 6 16:29:11 UTC 2024


On Fri, Sep 06, 2024 at 01:21:41AM +0530, Ghimiray, Himal Prasad wrote:
> 
> 
> On 06-09-2024 01:07, Rodrigo Vivi wrote:
> > On Fri, Aug 30, 2024 at 10:53:24AM +0530, Himal Prasad Ghimiray wrote:
> > > A failure in xe_force_wake_get() no longer increments the domain's
> > > refcount, so xe_force_wake_put() should not be called in such cases
> > > 
> > > Cc: Matthew Brost <matthew.brost at intel.com>
> > > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> > > Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 9 ++++++---
> > >   1 file changed, 6 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > index cca9cf536f76..3f86ab704c4f 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > @@ -259,11 +259,11 @@ static int xe_gt_tlb_invalidation_guc(struct xe_gt *gt,
> > >   int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
> > >   {
> > >   	struct xe_device *xe = gt_to_xe(gt);
> > > +	int ret;
> > >   	if (xe_guc_ct_enabled(&gt->uc.guc.ct) &&
> > >   	    gt->uc.guc.submission_state.enabled) {
> > >   		struct xe_gt_tlb_invalidation_fence fence;
> > > -		int ret;
> > >   		xe_gt_tlb_invalidation_fence_init(gt, &fence, true);
> > >   		ret = xe_gt_tlb_invalidation_guc(gt, &fence);
> > > @@ -277,7 +277,9 @@ int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
> > >   		if (IS_SRIOV_VF(xe))
> > >   			return 0;
> > > -		xe_gt_WARN_ON(gt, xe_force_wake_get(gt_to_fw(gt), XE_FW_GT));
> > > +		ret =  xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
> > > +		xe_gt_WARN_ON(gt, ret);
> > > +
> > >   		if (xe->info.platform == XE_PVC || GRAPHICS_VER(xe) >= 20) {
> > >   			xe_mmio_write32(gt, PVC_GUC_TLB_INV_DESC1,
> > >   					PVC_GUC_TLB_INV_DESC1_INVALIDATE);
> > > @@ -287,7 +289,8 @@ int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
> > >   			xe_mmio_write32(gt, GUC_TLB_INV_CR,
> > >   					GUC_TLB_INV_CR_INVALIDATE);
> > >   		}
> > > -		xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
> > > +		if (!ret)
> > > +			xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
> > 
> > looking all these cases now I honestly prefer the other way around.
> > 
> > If we called the get, we call the put.
> > get always increase the reference and put does the clean-up.
> > 
> > fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
> > 
> > xe_force_wake_put(gt_to_fw(gt), fw_ref);
> > 
> > so, the fw_ref is a mask of the woken up cases which require
> > the ref drop and sleep call.
> 
> Hi Rodrigo,
> 
> Thanks for the input. AFAIU using this approach creates issue in the
> subsequent force_wake_get/put in callee function. Which I have tried to
> explain in cover letter.
> 
> [1] subsequent forcewake call by callee function assumes domains are
> already awake, which might not be true. This shows perfectly balanced
> xe_force_wake_get/_put can also cause problem.
> 
> [1] func_a() {
> 	XE_WARN(xe_force_wake_get()) <---> fails but increments refcount
> 
> 	func_b();
> 
> 	XE_WARN(xe_force_wake_put());<---> decrements refcounts
>  }
> 
>     func_b() {
> 	if(xe_force_wake_get()) <---> succeeds due to refcount of caller
> 		return;
> 
> 	does mmio_operations(); <---> Domain might not be awake
> 
> 	xe_force_wake_put(); <---> decrement refcount
>  }

Well, to be honest, this is what bugs me in this whole series.

If func_a failed, why would function b succeed? It that's the
case should we include more redundancy and retries so the
func_a would succeed like the func_b is expected in your
scenario?

But other then that, I'm afraid that you didn't fully understand
my idea. Sorry for not being clear.

My thought is, you do what you are doing in this series.
If the get doesn't succeed you drop the ref count and call the
disable.

The return of the get is just for the domains that have succeeded.
then the put returns only the ones that had succeeded.
The function B will then try to wake-up whatever had failed in
func_a.

Something like:


func_a() {
	fw_ref = xe_force_wake_get(ALL_DOMAINS) <---> fails GT-domain but return a mask with all the domains except GT.

	XE_WARN(!fw_ref);

	func_b();

	XE_WARN(xe_force_wake_put(fw_ref));<---> decrements refcounts of the domains which were actually woken up.
}

   func_b() {
        fw_ref = xe_force_wake_get(GT_DOMAIN);
	if(fw_ref & GT_DOMAIN) <---> likely fail anyway since func_a has failed, but it at least tries it out because you have handled it in your series...
		return;

	does mmio_operations(); <---> Domain might not be awake

	xe_force_wake_put(fw_ref); <---> decrement refcount of the domains you woked up.
}

does it make sense now?

> 
> BR
> Himal
> 
> > 
> > >   	}
> > >   	return 0;
> > > -- 
> > > 2.34.1
> > > 


More information about the Intel-xe mailing list