[PATCH v1] drm/msm/dp: use dp_hpd_plug_handle() and dp_hpd_unplug_handle() directly
Abhinav Kumar
quic_abhinavk at quicinc.com
Fri Mar 29 02:15:55 UTC 2024
On 3/28/2024 5:10 PM, Dmitry Baryshkov wrote:
> On Fri, 29 Mar 2024 at 01:42, Abhinav Kumar <quic_abhinavk at quicinc.com> wrote:
>>
>>
>>
>> On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
>>> On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar <quic_abhinavk at quicinc.com> wrote:
>>>>
>>>>
>>>>
>>>> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
>>>>> Quoting Abhinav Kumar (2024-03-28 13:24:34)
>>>>>> + Johan and Bjorn for FYI
>>>>>>
>>>>>> On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
>>>>>>> For the internal HPD case, hpd_event_thread is created to handle HPD
>>>>>>> interrupts generated by the HPD block of the DP controller. It converts
>>>>>>> HPD interrupts into events and executes them in hpd_event_thread
>>>>>>> context. For the external HPD case, HPD events are delivered by way of
>>>>>>> dp_bridge_hpd_notify() in thread context. Since they are already
>>>>>>> executed in thread context, there is no reason to hand those events
>>>>>>> over to hpd_event_thread. Hence dp_hpd_plug_handle() and
>>>>>>> dp_hpd_unplug_handle() are called directly from dp_bridge_hpd_notify().
>>>>>>>
>>>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh at quicinc.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
>>>>>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>
>>>>>> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
>>>>>
>>>>> Is this a bug fix or an optimization? The commit text doesn't tell me.
>>>>>
>>>>
>>>> I would say both.
>>>>
>>>> Optimization, as it avoids the need to go through hpd_event_thread
>>>> processing.
>>>>
>>>> Bug fix, because going through the hpd event thread processing exposes,
>>>> and often breaks, the already fragile HPD handling state machine; that
>>>> can be avoided in this case.
>>>
>>> Please add a description for the particular issue that was observed
>>> and how it is fixed by the patch.
>>>
>>> Otherwise consider there to be an implicit NAK for all HPD-related
>>> patches unless it is a series that moves link training to the enable
>>> path and drops the HPD state machine completely.
>>>
>>> I really mean it. We should stop beating a dead horse unless there is
>>> a grave bug that must be fixed.
>>>
>>
>> I think the commit message explains the issue well enough.
>>
>> This is not fixing a specific issue we observed, so I cannot describe an
>> exact failure scenario; it came out of a code walkthrough.
>>
>> Like Kuogee wrote, the hpd event thread was there to handle events coming
>> out of the hpd_isr for the internal hpd case. For the hpd_notify coming
>> from pmic_glink or any other external hpd source, there is no need to
>> put it through the hpd event thread, because that will only make things
>> worse by exposing the race conditions of the state machine.
>>
>> Moving link training to the enable path and removing the hpd event thread
>> will be worked on, but delaying obvious fixes in the meantime does not
>> make sense.
>
> From the commit message this feels like an optimisation rather than a
> fix. And given the fragility of the HPD state machine, I'd prefer to
> stay away from optimisations. As far as I understood from the history
> of the last revert, we'd better make sure that HPD handling goes only
> through the HPD event thread.
>
I think you are mixing up the two. We tried to send the events through
DRM's hpd_notify, which ended up badly, and by the way that is still not
resolved: even though I have seen reports that things are fine with the
revert, with all the reverts and fixes applied we consistently end up in a
disconnected state on our x1e80100 DP setup.

I plan to investigate that issue properly next week and try to make some
sense of it all.
In fact, this patch removes one more user of the hpd event thread, which
is the direction we all want to head in.
On whether this is an optimization or a bug fix: I think that by avoiding
the hpd event thread (which should never have been used for hpd_notify
updates, hence a bug), we avoid the possibility of more race conditions.

So this has my R-b and it holds. Up to you.
More information about the dri-devel mailing list