[PATCH] drm/amd/pm: Send message when resp status is 0xFC
Lazar, Lijo
lijo.lazar at amd.com
Fri Feb 25 07:43:14 UTC 2022
On 2/25/2022 1:02 PM, Quan, Evan wrote:
> [AMD Official Use Only]
>
>
>
>> -----Original Message-----
>> From: Lazar, Lijo <Lijo.Lazar at amd.com>
>> Sent: Friday, February 25, 2022 2:03 PM
>> To: Quan, Evan <Evan.Quan at amd.com>; amd-gfx at lists.freedesktop.org
>> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander
>> <Alexander.Deucher at amd.com>; Wang, Yang(Kevin)
>> <KevinYang.Wang at amd.com>
>> Subject: Re: [PATCH] drm/amd/pm: Send message when resp status is 0xFC
>>
>>
>>
>> On 2/25/2022 11:25 AM, Quan, Evan wrote:
>>> [AMD Official Use Only]
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Lazar, Lijo <Lijo.Lazar at amd.com>
>>>> Sent: Friday, February 25, 2022 1:47 PM
>>>> To: Quan, Evan <Evan.Quan at amd.com>; amd-gfx at lists.freedesktop.org
>>>> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander
>>>> <Alexander.Deucher at amd.com>; Wang, Yang(Kevin)
>>>> <KevinYang.Wang at amd.com>
>>>> Subject: RE: [PATCH] drm/amd/pm: Send message when resp status is
>>>> 0xFC
>>>>
>>>> [AMD Official Use Only]
>>>>
>>>>> That is the caller can perform something like issuing the same
>>>>> message again without prerequisites check on PMFW busy
>>>>
>>>> This patch expects this method. Caller may try to resend message
>>>> again. As part of message sending, driver first checks response
>>>> register. Current logic blocks sending any message if it sees 0xFC in
>>>> response register, this patch is to address that.
>>> [Quan, Evan] Yes, I know. But the caller here could be another one. I mean
>> there may be another caller stepped in.
>>>
>>
>> That shouldn't cause an issue to the second caller if it got message mutex.
>> The second caller also should be able to send message if PMFW got free by
>> that time. The first caller can retry when it gets back the message mutex. FW
>> doesn't maintain any state for 0xFC response. Any other message may be
>> sent after that. If driver keeps the state based on two callers, that is a logic
>> problem in driver. I don't think we have any flow like that.
> [Quan, Evan] Yeah, but there may be some case that messages issued by the two callers have dependence.
> That means the message issued by the 2nd caller should be only after the 1st one.
> The one i can think of is "EnableAllSmuFeatures" message should be after "SetAllowedFeatures" message.
> Although that should not cause any problem, I'm not sure whether there is other similar case.
>
> What I suggest is something like below. We just do it again in smu_cmn_send_smc_msg_with_param() on PMFW busy.
>
> int smu_cmn_send_smc_msg_with_param(struct smu_context *smu,
> enum smu_message_type msg,
> uint32_t param,
> uint32_t *read_arg)
> {
> ...
> ...
> mutex_lock(&smu->message_lock);
> reg = __smu_cmn_poll_stat(smu);
> res = __smu_cmn_reg2errno(smu, reg);
> if (reg == SMU_RESP_NONE ||
> reg == SMU_RESP_BUSY_OTHER ||
> res == -EREMOTEIO) {
> __smu_cmn_reg_print_error(smu, reg, index, param, msg);
> goto Out;
> }
> +retry:
> __smu_cmn_send_msg(smu, (uint16_t) index, param);
> reg = __smu_cmn_poll_stat(smu);
> res = __smu_cmn_reg2errno(smu, reg);
> + if (reg == SMU_RESP_BUSY_OTHER) {
> + mdelay(1);
> + goto retry;
> + }
I suppose the retry option should be left to caller. Regardless of retry
or not, the patch is still valid.
Example situation -
rocm-smi is trying to get metrics and another app is trying set
performance profile. If metrics fetch fail and even retry of metrics
fetch fail after some loops, rocm-smi is free to fetch the metrics again
after say 5s. That also shouldn't prevent the second app to send
performance profile message and that app also may retry the same later.
Thanks,
Lijo
> ...
> ...
> }
>
> BR
> Evan
>>
>> Basically, 0xFC is not valid pre-condition check for sending any message. As
>> per PMFW team - it only means that PMFW was busy when a previous
>> message was sent and PMFW won't change the response status when it
>> becomes free.
>>
>> Thanks,
>> Lijo
>>
>>> BR
>>> Evan
>>>>
>>>> Thanks,
>>>> Lijo
>>>>
>>>> -----Original Message-----
>>>> From: Quan, Evan <Evan.Quan at amd.com>
>>>> Sent: Friday, February 25, 2022 11:07 AM
>>>> To: Lazar, Lijo <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org
>>>> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander
>>>> <Alexander.Deucher at amd.com>; Wang, Yang(Kevin)
>>>> <KevinYang.Wang at amd.com>
>>>> Subject: RE: [PATCH] drm/amd/pm: Send message when resp status is
>>>> 0xFC
>>>>
>>>> [AMD Official Use Only]
>>>>
>>>> This may introduce some problems for two callers scenarios. That is
>>>> the 2nd one will still proceed even if the 1st one was already blocked.
>>>> Maybe the logics here should be performed by the caller. That is the
>>>> caller can perform something like issuing the same message again
>>>> without prerequisites check on PMFW busy.
>>>> Or we can just update the smu_cmn_send_smc_msg APIs to give it
>>>> another try on PMFW busy.
>>>>
>>>> BR
>>>> Evan
>>>>> -----Original Message-----
>>>>> From: Lazar, Lijo <Lijo.Lazar at amd.com>
>>>>> Sent: Friday, February 25, 2022 12:22 PM
>>>>> To: amd-gfx at lists.freedesktop.org
>>>>> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander
>>>>> <Alexander.Deucher at amd.com>; Wang, Yang(Kevin)
>>>>> <KevinYang.Wang at amd.com>; Quan, Evan <Evan.Quan at amd.com>
>>>>> Subject: [PATCH] drm/amd/pm: Send message when resp status is 0xFC
>>>>>
>>>>> When PMFW is really busy, it will respond with 0xFC. However, it
>>>>> doesn't change the response register state when it becomes free.
>>>>> Driver should retry and proceed to send message if the response
>>>>> status is
>>>> 0xFC.
>>>>>
>>>>> Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 --
>>>>> 1 file changed, 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
>>>>> b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
>>>>> index 590a6ed12d54..92161b9d8c1a 100644
>>>>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
>>>>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
>>>>> @@ -297,7 +297,6 @@ int smu_cmn_send_msg_without_waiting(struct
>>>>> smu_context *smu,
>>>>> reg = __smu_cmn_poll_stat(smu);
>>>>> res = __smu_cmn_reg2errno(smu, reg);
>>>>> if (reg == SMU_RESP_NONE ||
>>>>> - reg == SMU_RESP_BUSY_OTHER ||
>>>>> res == -EREMOTEIO)
>>>>> goto Out;
>>>>> __smu_cmn_send_msg(smu, msg_index, param); @@ -391,7 +390,6
>>>> @@ int
>>>>> smu_cmn_send_smc_msg_with_param(struct
>>>>> smu_context *smu,
>>>>> reg = __smu_cmn_poll_stat(smu);
>>>>> res = __smu_cmn_reg2errno(smu, reg);
>>>>> if (reg == SMU_RESP_NONE ||
>>>>> - reg == SMU_RESP_BUSY_OTHER ||
>>>>> res == -EREMOTEIO) {
>>>>> __smu_cmn_reg_print_error(smu, reg, index, param, msg);
>>>>> goto Out;
>>>>> --
>>>>> 2.25.1
More information about the amd-gfx
mailing list