[RFC 7/9] drm/xe/gt_tlb_invalidation_ggtt: Call xe_force_wake_put if xe_force_wake_get succeds

Rodrigo Vivi rodrigo.vivi at intel.com
Tue Sep 10 17:39:07 UTC 2024


On Tue, Sep 10, 2024 at 08:07:01PM +0530, Nilawar, Badal wrote:
> 
> 
> On 09-09-2024 14:59, Ghimiray, Himal Prasad wrote:
> > 
> > 
> > On 06-09-2024 21:59, Rodrigo Vivi wrote:
> > > On Fri, Sep 06, 2024 at 01:21:41AM +0530, Ghimiray, Himal Prasad wrote:
> > > > 
> > > > 
> > > > On 06-09-2024 01:07, Rodrigo Vivi wrote:
> > > > > On Fri, Aug 30, 2024 at 10:53:24AM +0530, Himal Prasad Ghimiray wrote:
> > > > > > A failure in xe_force_wake_get() no longer increments the domain's
> > > > > > refcount, so xe_force_wake_put() should not be called in such cases
> > > > > > 
> > > > > > Cc: Matthew Brost <matthew.brost at intel.com>
> > > > > > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > > > > Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> > > > > > Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> > > > > > ---
> > > > > >    drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 9 ++++++---
> > > > > >    1 file changed, 6 insertions(+), 3 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > > > > b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > > > > index cca9cf536f76..3f86ab704c4f 100644
> > > > > > --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > > > > +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > > > > > @@ -259,11 +259,11 @@ static int
> > > > > > xe_gt_tlb_invalidation_guc(struct xe_gt *gt,
> > > > > >    int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
> > > > > >    {
> > > > > >        struct xe_device *xe = gt_to_xe(gt);
> > > > > > +    int ret;
> > > > > >        if (xe_guc_ct_enabled(&gt->uc.guc.ct) &&
> > > > > >            gt->uc.guc.submission_state.enabled) {
> > > > > >            struct xe_gt_tlb_invalidation_fence fence;
> > > > > > -        int ret;
> > > > > >            xe_gt_tlb_invalidation_fence_init(gt, &fence, true);
> > > > > >            ret = xe_gt_tlb_invalidation_guc(gt, &fence);
> > > > > > @@ -277,7 +277,9 @@ int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
> > > > > >            if (IS_SRIOV_VF(xe))
> > > > > >                return 0;
> > > > > > -        xe_gt_WARN_ON(gt, xe_force_wake_get(gt_to_fw(gt), XE_FW_GT));
> > > > > > +        ret =  xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
> > > > > > +        xe_gt_WARN_ON(gt, ret);
> > > > > > +
> > > > > >            if (xe->info.platform == XE_PVC ||
> > > > > > GRAPHICS_VER(xe) >= 20) {
> > > > > >                xe_mmio_write32(gt, PVC_GUC_TLB_INV_DESC1,
> > > > > >                        PVC_GUC_TLB_INV_DESC1_INVALIDATE);
> > > > > > @@ -287,7 +289,8 @@ int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
> > > > > >                xe_mmio_write32(gt, GUC_TLB_INV_CR,
> > > > > >                        GUC_TLB_INV_CR_INVALIDATE);
> > > > > >            }
> > > > > > -        xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
> > > > > > +        if (!ret)
> > > > > > +            xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
> > > > > 
> > > > > looking all these cases now I honestly prefer the other way around.
> > > > > 
> > > > > If we called the get, we call the put.
> > > > > get always increase the reference and put does the clean-up.
> > > > > 
> > > > > fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
> > > > > 
> > > > > xe_force_wake_put(gt_to_fw(gt), fw_ref);
> > > > > 
> > > > > so, the fw_ref is a mask of the woken up cases which require
> > > > > the ref drop and sleep call.
> > > > 
> > > > Hi Rodrigo,
> > > > 
> > > > Thanks for the input. AFAIU using this approach creates issue in the
> > > > subsequent force_wake_get/put in callee function. Which I have tried to
> > > > explain in cover letter.
> > > > 
> > > > [1] subsequent forcewake call by callee function assumes domains are
> > > > already awake, which might not be true. This shows perfectly balanced
> > > > xe_force_wake_get/_put can also cause problem.
> > > > 
> > > > [1] func_a() {
> > > >     XE_WARN(xe_force_wake_get()) <---> fails but increments refcount
> > > > 
> > > >     func_b();
> > > > 
> > > >     XE_WARN(xe_force_wake_put());<---> decrements refcounts
> > > >   }
> > > > 
> > > >      func_b() {
> > > >     if(xe_force_wake_get()) <---> succeeds due to refcount of caller
> > > >         return;
> > > > 
> > > >     does mmio_operations(); <---> Domain might not be awake
> > > > 
> > > >     xe_force_wake_put(); <---> decrement refcount
> > > >   }
> > > 
> > > Well, to be honest, this is what bugs me in this whole series.
> > > 
> > > If func_a failed, why would function b succeed? It that's the
> > > case should we include more redundancy and retries so the
> > > func_a would succeed like the func_b is expected in your
> > > scenario?
> > 
> > 
> > Hi Rodrigo,
> > 
> > This is current behavior, which patch [1] resolves. I misunderstood your
> > comment as dropping of that patch and simply balancing all _gets with
> > respective _puts.
> > 
> > 
> > > 
> > > But other then that, I'm afraid that you didn't fully understand
> > > my idea. Sorry for not being clear.
> > > 
> > > My thought is, you do what you are doing in this series.
> > > If the get doesn't succeed you drop the ref count and call the
> > > disable.
> > 
> > 
> > OK. IMO, just reducing refcount is better for failing domain and not to
> > disable it explicitly
> > 
> > 
> > > 
> > > The return of the get is just for the domains that have succeeded.
> > > then the put returns only the ones that had succeeded.
> > > The function B will then try to wake-up whatever had failed in
> > > func_a.
> > 
> > I assumw with this, the return of xe_force_wake_get will return the
> > mask, hence the caller will need to verify whether the returned mask is
> > correct or failed.
> > 
> > 
> > > 
> > > Something like:
> > > 
> > > 
> > > func_a() {
> > >     fw_ref = xe_force_wake_get(ALL_DOMAINS) <---> fails GT-domain
> > > but return a mask with all the domains except GT.
> > > 
> > >     XE_WARN(!fw_ref);
> > 
> > 
> > XE_WARN(!fw_ref); will work for all individual domains but not  ALL_DOMAINS
> > 
> > XE_WARN(fw_ref != ALL_DOMAINS); <-- If user wants to continue -->
> > 
> > if (fw_ref != ALL_DOMAINS)  <--If user wants to return on failure -->
> >      xe_force_wake_put(fw_ref); <-- ensure to put awake domain -->
> > 
> >      return;
> > }
> > 
> > 
> > > 
> > >     func_b();
> > > 
> > >     XE_WARN(xe_force_wake_put(fw_ref));<---> decrements refcounts of
> > > the domains which were actually woken up.
> > 
> > Makes sense.
> > 
> > > }
> > > 
> > >     func_b() {
> > >          fw_ref = xe_force_wake_get(GT_DOMAIN);
> > >     if(fw_ref & GT_DOMAIN) <---> likely fail anyway since func_a has
> > > failed, but it at least tries it out because you have handled it in
> > > your series...
> > >         return;
> > > 
> > >     does mmio_operations(); <---> Domain might not be awake
> > > 
> > >     xe_force_wake_put(fw_ref); <---> decrement refcount of the
> > > domains you woked up.
> > > }
> > > 
> > > does it make sense now?
> > 
> > 
> > Yes, this is indeed a much better approach for FORCEWAKE_ALL. Thank you
> > for the suggestion. To summarize, rather than disabling the successfully
> > awakened domain in the event of a failure, we will use forcewake_put to
> > handle the disabling of them and user will decide when to call it.
> 
> This way of implementing looks ok to me. Only concern is what if the
> func_b() calls xe_force_wake_assert_held(), this will raise the assert as it
> will not find expected domain awake. This doesn't align the idea of
> continuing in case of ack failure. IMO user decide to continue even after
> set ack failure by assuming domain woken up but ack didn't arrive in  time.

yeap, and then we fix this case.
If the assert is in place is because the _get wasn't properly handled.

> 
> Regards,
> Badal
> > 
> > 
> > > 
> > > > 
> > > > BR
> > > > Himal
> > > > 
> > > > > 
> > > > > >        }
> > > > > >        return 0;
> > > > > > -- 
> > > > > > 2.34.1
> > > > > > 


More information about the Intel-xe mailing list