[Freedreno] [PATCH v2 5/7] arm64: dts: qcom: sc7280: Update gpu register list

Akhil P Oommen quic_akhilpo at quicinc.com
Thu Jul 21 16:04:35 UTC 2022


On 7/20/2022 11:34 AM, Akhil P Oommen wrote:
> On 7/19/2022 3:26 PM, Rajendra Nayak wrote:
>>
>>
>> On 7/19/2022 12:49 PM, Stephen Boyd wrote:
>>> Quoting Akhil P Oommen (2022-07-18 23:37:16)
>>>> On 7/19/2022 11:19 AM, Stephen Boyd wrote:
>>>>> Quoting Akhil P Oommen (2022-07-18 21:07:05)
>>>>>> On 7/14/2022 11:10 AM, Akhil P Oommen wrote:
>>>>>>> IIUC, qcom gdsc driver doesn't ensure hardware is collapsed 
>>>>>>> since they
>>>>>>> are vote-able switches. Ideally, we should ensure that the hw has
>>>>>>> collapsed for gpu recovery because there could be transient 
>>>>>>> votes from
>>>>>>> other subsystems like hypervisor using their vote register.
>>>>>>>
>>>>>>> I am not sure how complex the plumbing to gpucc driver would be 
>>>>>>> to allow
>>>>>>> gpu driver to check hw status. OTOH, with this patch, gpu driver 
>>>>>>> does a
>>>>>>> read operation on a gpucc register which is in always-on domain. 
>>>>>>> That
>>>>>>> means we don't need to vote any resource to access this register.
>>>
>>> Reading between the lines here, you're saying that you have to read the
>>> gdsc register to make sure that the gdsc is in some state? Can you
>>> clarify exactly what you're doing? And how do you know that something
>>> else in the kernel can't cause the register to change after it is read?
>>> It certainly seems like we can't be certain because there is voting
>>> involved.
> From gpu driver, cx_gdscr.bit[31] (power off status) register can be 
> polled to ensure that it *collapsed at least once*. We don't need to 
> care if something turns ON gdsc after that.
>
>>
>> yes, this looks like the best case effort to get the gpu to recover, but
>> the kernel driver really has no control to make sure this condition can
>> always be met (because it depends on other entities like hyp, 
>> trustzone etc right?)
>> Why not just put a worst case polling delay?
>
> I didn't get you entirely. Where do you mean to keep the polling delay?
>>
>>>
>>>>>>>
>>>>>>> Stephen/Rajendra/Taniya, any suggestion?
>>>>> Why can't you assert a gpu reset signal with the reset APIs? This 
>>>>> series
>>>>> seems to jump through a bunch of hoops to get the gdsc and power 
>>>>> domain
>>>>> to "reset" when I don't know why any of that is necessary. Can't we
>>>>> simply assert a reset to the hardware after recovery completes so the
>>>>> device is back into a good known POR (power on reset) state?
>>>> That is because there is no register interface to reset GPU CX domain.
>>>> The recommended sequence from HW design folks is to collapse both 
>>>> cx and
>>>> gx gdsc to properly reset gpu/gmu.
>>>>
>>>
>>> Ok. One knee jerk reaction is to treat the gdsc as a reset then and
>>> possibly mux that request along with any power domain on/off so that if
>>> the reset is requested and the power domain is off nothing happens.
>>> Otherwise if the power domain is on then it manually sequences and
>>> controls the two gdscs so that the GPU is reset and then restores the
>>> enable state of the power domain.
> It would be fatal to asynchronously pull the plug on CX gdsc 
> forcefully because there might be another gpu/smmu driver thread 
> accessing registers in cx domain.
>
> -Akhil.
>
But, we can move the cx collapse polling to gpucc and expose it to gpu 
driver using 'reset' framework. I am not very familiar with clk driver, 
but I did a rough prototype here (untested): 
https://zerobin.net/?d34b5f958be3b9b8#NKGzdPy9fgcuOqXZ/XqjI7b8JWcivqe+oSTf4yWHSOU=

If this approach is acceptable, I will send it out as a separate series.

-Akhil.


More information about the dri-devel mailing list