[RFC 7/9] drm/xe/gt_tlb_invalidation_ggtt: Call xe_force_wake_put if xe_force_wake_get succeds

Tue Sep 10 14:37:01 UTC 2024

On 09-09-2024 14:59, Ghimiray, Himal Prasad wrote:
> 
> 
> On 06-09-2024 21:59, Rodrigo Vivi wrote:
>> On Fri, Sep 06, 2024 at 01:21:41AM +0530, Ghimiray, Himal Prasad wrote:
>>>
>>>
>>> On 06-09-2024 01:07, Rodrigo Vivi wrote:
>>>> On Fri, Aug 30, 2024 at 10:53:24AM +0530, Himal Prasad Ghimiray wrote:
>>>>> A failure in xe_force_wake_get() no longer increments the domain's
>>>>> refcount, so xe_force_wake_put() should not be called in such cases
>>>>>
>>>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>>>> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
>>>>> ---
>>>>>    drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 9 ++++++---
>>>>>    1 file changed, 6 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c 
>>>>> b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>>>>> index cca9cf536f76..3f86ab704c4f 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
>>>>> @@ -259,11 +259,11 @@ static int xe_gt_tlb_invalidation_guc(struct 
>>>>> xe_gt *gt,
>>>>>    int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
>>>>>    {
>>>>>        struct xe_device *xe = gt_to_xe(gt);
>>>>> +    int ret;
>>>>>        if (xe_guc_ct_enabled(&gt->uc.guc.ct) &&
>>>>>            gt->uc.guc.submission_state.enabled) {
>>>>>            struct xe_gt_tlb_invalidation_fence fence;
>>>>> -        int ret;
>>>>>            xe_gt_tlb_invalidation_fence_init(gt, &fence, true);
>>>>>            ret = xe_gt_tlb_invalidation_guc(gt, &fence);
>>>>> @@ -277,7 +277,9 @@ int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
>>>>>            if (IS_SRIOV_VF(xe))
>>>>>                return 0;
>>>>> -        xe_gt_WARN_ON(gt, xe_force_wake_get(gt_to_fw(gt), XE_FW_GT));
>>>>> +        ret =  xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
>>>>> +        xe_gt_WARN_ON(gt, ret);
>>>>> +
>>>>>            if (xe->info.platform == XE_PVC || GRAPHICS_VER(xe) >= 
>>>>> 20) {
>>>>>                xe_mmio_write32(gt, PVC_GUC_TLB_INV_DESC1,
>>>>>                        PVC_GUC_TLB_INV_DESC1_INVALIDATE);
>>>>> @@ -287,7 +289,8 @@ int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt)
>>>>>                xe_mmio_write32(gt, GUC_TLB_INV_CR,
>>>>>                        GUC_TLB_INV_CR_INVALIDATE);
>>>>>            }
>>>>> -        xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
>>>>> +        if (!ret)
>>>>> +            xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
>>>>
>>>> looking all these cases now I honestly prefer the other way around.
>>>>
>>>> If we called the get, we call the put.
>>>> get always increase the reference and put does the clean-up.
>>>>
>>>> fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
>>>>
>>>> xe_force_wake_put(gt_to_fw(gt), fw_ref);
>>>>
>>>> so, the fw_ref is a mask of the woken up cases which require
>>>> the ref drop and sleep call.
>>>
>>> Hi Rodrigo,
>>>
>>> Thanks for the input. AFAIU using this approach creates issue in the
>>> subsequent force_wake_get/put in callee function. Which I have tried to
>>> explain in cover letter.
>>>
>>> [1] subsequent forcewake call by callee function assumes domains are
>>> already awake, which might not be true. This shows perfectly balanced
>>> xe_force_wake_get/_put can also cause problem.
>>>
>>> [1] func_a() {
>>>     XE_WARN(xe_force_wake_get()) <---> fails but increments refcount
>>>
>>>     func_b();
>>>
>>>     XE_WARN(xe_force_wake_put());<---> decrements refcounts
>>>   }
>>>
>>>      func_b() {
>>>     if(xe_force_wake_get()) <---> succeeds due to refcount of caller
>>>         return;
>>>
>>>     does mmio_operations(); <---> Domain might not be awake
>>>
>>>     xe_force_wake_put(); <---> decrement refcount
>>>   }
>>
>> Well, to be honest, this is what bugs me in this whole series.
>>
>> If func_a failed, why would function b succeed? It that's the
>> case should we include more redundancy and retries so the
>> func_a would succeed like the func_b is expected in your
>> scenario?
> 
> 
> Hi Rodrigo,
> 
> This is current behavior, which patch [1] resolves. I misunderstood your
> comment as dropping of that patch and simply balancing all _gets with
> respective _puts.
> 
> 
>>
>> But other then that, I'm afraid that you didn't fully understand
>> my idea. Sorry for not being clear.
>>
>> My thought is, you do what you are doing in this series.
>> If the get doesn't succeed you drop the ref count and call the
>> disable.
> 
> 
> OK. IMO, just reducing refcount is better for failing domain and not to
> disable it explicitly
> 
> 
>>
>> The return of the get is just for the domains that have succeeded.
>> then the put returns only the ones that had succeeded.
>> The function B will then try to wake-up whatever had failed in
>> func_a.
> 
> I assumw with this, the return of xe_force_wake_get will return the 
> mask, hence the caller will need to verify whether the returned mask is 
> correct or failed.
> 
> 
>>
>> Something like:
>>
>>
>> func_a() {
>>     fw_ref = xe_force_wake_get(ALL_DOMAINS) <---> fails GT-domain but 
>> return a mask with all the domains except GT.
>>
>>     XE_WARN(!fw_ref);
> 
> 
> XE_WARN(!fw_ref); will work for all individual domains but not  ALL_DOMAINS
> 
> XE_WARN(fw_ref != ALL_DOMAINS); <-- If user wants to continue -->
> 
> if (fw_ref != ALL_DOMAINS)  <--If user wants to return on failure -->
>      xe_force_wake_put(fw_ref); <-- ensure to put awake domain -->
> 
>      return;
> }
> 
> 
>>
>>     func_b();
>>
>>     XE_WARN(xe_force_wake_put(fw_ref));<---> decrements refcounts of 
>> the domains which were actually woken up.
> 
> Makes sense.
> 
>> }
>>
>>     func_b() {
>>          fw_ref = xe_force_wake_get(GT_DOMAIN);
>>     if(fw_ref & GT_DOMAIN) <---> likely fail anyway since func_a has 
>> failed, but it at least tries it out because you have handled it in 
>> your series...
>>         return;
>>
>>     does mmio_operations(); <---> Domain might not be awake
>>
>>     xe_force_wake_put(fw_ref); <---> decrement refcount of the domains 
>> you woked up.
>> }
>>
>> does it make sense now?
> 
> 
> Yes, this is indeed a much better approach for FORCEWAKE_ALL. Thank you 
> for the suggestion. To summarize, rather than disabling the successfully 
> awakened domain in the event of a failure, we will use forcewake_put to 
> handle the disabling of them and user will decide when to call it.

This way of implementing looks ok to me. Only concern is what if the 
func_b() calls xe_force_wake_assert_held(), this will raise the assert 
as it will not find expected domain awake. This doesn't align the idea 
of continuing in case of ack failure. IMO user decide to continue even 
after set ack failure by assuming domain woken up but ack didn't arrive 
in  time.

Regards,
Badal
> 
> 
>>
>>>
>>> BR
>>> Himal
>>>
>>>>
>>>>>        }
>>>>>        return 0;
>>>>> -- 
>>>>> 2.34.1
>>>>>