[PATCH] drm/msm/dp: move link_ready out of HPD event thread
Abhinav Kumar
quic_abhinavk at quicinc.com
Mon Mar 18 18:01:25 UTC 2024
On 3/15/2024 8:57 AM, Johan Hovold wrote:
> On Thu, Mar 14, 2024 at 09:30:57AM -0700, Abhinav Kumar wrote:
>> On 3/14/2024 8:38 AM, Johan Hovold wrote:
>>> On Wed, Mar 13, 2024 at 10:24:08AM -0700, Abhinav Kumar wrote:
>
>>> Perhaps I'm missing something in the race that you are trying to
>>> describe (and which I've asked you to describe in more detail so that I
>>> don't have to spend more time trying to come up with a reproducer
>>> myself).
>
>> The race condition is between the time we get disconnect event and set
>> link_ready to false, a commit can come in. Because setting link_ready to
>> false happens in the event thread so it could be slightly delayed.
>
> I get this part, just not why, or rather when, that becomes a problem.
>
> Once the disconnect event is processed, dp_hpd_unplug_handle() will
> update the state to ST_DISCONNECT_PENDING, and queue a notification
> event. link_ready is (before this patch) still set to 1.
>
This is the case I am thinking of:
1) Disconnect event happens which will call dp_hpd_unplug_handle() but
link_ready is not false yet.
2) There is a commit with a modeset, which shall trigger
atomic_disable() followed by an atomic_enable()
atomic_disable() will go through disable clocks and set hpd_state to
ST_DISCONNECTED.
3) atomic_enable() will not go through because we will bail out because
state was ST_DISCONNECTED.
if (state != ST_DISPLAY_OFF && state != ST_MAINLINK_READY) {
mutex_unlock(&dp_display->event_mutex);
return;
}
4) Now, if there is another commit with a modeset, it will go and crash
at atomic_disable()
> Here a commit comes in; what exactly are you suggesting would trigger
> that? And in such a way that it breaks the state machine?
>
Like we have seen, the commit can either come directly from userspace as
one last frame (the original bug I had given the link to) or from the
__drm_fb_helper_restore_fbdev_mode_unlocked() which happened in
sc8280xp's case. This is totally independent of the hpd_thread() with no
mutual exclusion.
This commit() can come before the link_ready was set to false. If it had
come after link_ready was set to false, atomic_check() would have failed
and no issue would have been seen.
My change is making the link_ready false sooner in the disconnect case.
> One way this could cause trouble is if you end up with a call to
> dp_bridge_atomic_post_disable() which updates the state to
> ST_DISCONNECTED. (1)
>
> This would then need to be followed by another call to
> dp_bridge_atomic_enable() which bails out early with the link clock
> disabled. (2) (And if link_ready were to be set to 0 sooner, the
> likelihood of this is reduced.)
>
> This in turn, would trigger a reset when dp_bridge_atomic_disable() is
> later called.
>
Yes, this is exactly what I have written above.
> This is the kind of description of the race I expect to see in the
> commit message, and I'm still not sure what would trigger the call to
> dp_bridge_atomic_post_disable() and dp_bridge_atomic_enable() (i.e. (1)
> and (2) above) and whether this is a real issue or not.
>
I have explained what triggers the disable/enable call below.
> Also note that the above scenario is quite different from the one I've
> hit and described earlier.
>
Why is that so? Eventually it will also translate to the same scenario.
I would like to understand why this is different. I think in your case,
probably we do not know what triggers the modeset, but its a minor
detail like I have written before.
>> It will be hard to reproduce this. Only way I can think of is to delay
>> the EV_NOTIFICATION for sometime and see in dp_bridge_hpd_notify()
>>
>> else if (dp_display->link_ready && status ==
>> connector_status_disconnected)
>> dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
>>
>> as dp_add_event() will add the event, then wakeup the event_q.
>
> Sure that would increase the race window with the current code, but that
> alone isn't enough to trigger the bug AFAICT.
>
>> Before the event thread wakes up and processes this unplug event, the
>> commit can come in. This is the race condition i was thinking of.
>
> Yes, but what triggers the commit? And why would it lead to a mode set
> that disables the bridge?
>
Commit was triggered from the userspace as it did not process the
disconnect event on time and the userspace was triggering a couple of
modesets by by changing the mode on the CRTC from 1080P to NONE to 1080P.
[drm:drm_atomic_helper_check_modeset] [CRTC:60:crtc-1] mode changed
> Johan
More information about the dri-devel
mailing list