[PATCH v2] drm/xe/client: Check return value of xe_force_wake_get

Ghimiray, Himal Prasad himal.prasad.ghimiray at intel.com
Mon Jun 3 11:34:51 UTC 2024



On 03-06-2024 16:28, Nirmoy Das wrote:
> 
> On 6/3/2024 11:07 AM, Ghimiray, Himal Prasad wrote:
>>
>> On 03-06-2024 14:01, Nirmoy Das wrote:
>>> xe_force_wake_get() can return error so check it's return value
>>> before reading gpu_timestamp value.
>>>
>>> v2: set HWE to NULL instead of setting timestamp to 0(Lucas)
>>>      Add a warn on for xe_force_wake_put(Himal)
>>>
>>> Fixes: 188ced1e0ff8 ("drm/xe/client: Print runtime to fdinfo")
>>> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
>>> ---
>>>   drivers/gpu/drm/xe/xe_drm_client.c | 8 ++++++--
>>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_drm_client.c 
>>> b/drivers/gpu/drm/xe/xe_drm_client.c
>>> index 4a19b771e3a0..4bdb909eabfc 100644
>>> --- a/drivers/gpu/drm/xe/xe_drm_client.c
>>> +++ b/drivers/gpu/drm/xe/xe_drm_client.c
>>> @@ -264,9 +264,13 @@ static void show_run_ticks(struct drm_printer 
>>> *p, struct drm_file *file)
>>>           if (!hwe)
>>>               continue;
>>>   -        xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
>>> +        if (xe_force_wake_get(gt_to_fw(gt), XE_FW_GT)) {
>>
>> Since xe_force_wake_get will increase the ref count irrespective of 
>> acknowledgment from forcewake ack register fails or succeeds.
> 
> General pattern is to skip xe_force_wake_put() though at some places we 
> do call xe_force_wake_put() when the caller is unabortable.
> 
> I don't know the exact reason but I  guess failure to wake a domain is 
> serious enough so we want to  keep already woken domains  in wake state 
> for system stability.

If you break without xe_force_Wake_put, subsequent force_wake_get will 
assume domain is awake(which might not be true, because there was no 
acknowledgment)

Please consider below scenario:

function a:
xe_force_wake_get ---> increased domain ref from 0 to 1 -----> wake 
domain ---> timedout the acknowledgment -> failed and returned without 
decreasing the ref count

read/write operations  -> not called
xe_force_wake_put -> not called

function b : called after function a
xe_force_wake_get -----> domain ref increased to 2 ----> exit success
read/write operations -> executed but domain was not awake
xe_force_wake_put ----> domain ref decreased to 1 again.

BR
Himal


> 
> 
> Regards,
> 
> Nirmoy
> 
> 
>>
>> It is required to decrement the ref count taken by xe_force_wake_get 
>> by calling xe_force_wake_put before break.
>>
>> something like :
>>
>> if (xe_force_wake_get(gt_to_fw(gt), XE_FW_GT))
>>
>> hwe = NULL;
>>
>> else
>>
>> gpu_timestamp = xe_hw_engine_read_timestamp(hwe);
>>
>> XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT));
>>
>>
>>> +            hwe = NULL;
>>> +            break;
>>> +        }
>>> +
>>>           gpu_timestamp = xe_hw_engine_read_timestamp(hwe);
>>> -        xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
>>> +        XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT));
>>>           break;
>>>       }


More information about the Intel-xe mailing list