[PATCH v3 6/7] drm/msm/dp: add pm_runtime_force_suspend()/resume()

Abhinav Kumar quic_abhinavk at quicinc.com
Thu Sep 28 22:35:49 UTC 2023



On 9/27/2023 2:41 PM, Stephen Boyd wrote:
> Quoting Abhinav Kumar (2023-09-22 18:35:27)
>> On 9/22/2023 2:54 PM, Stephen Boyd wrote:
>>> Quoting Dmitry Baryshkov (2023-09-19 02:50:12)
>>>>
>>>> This should be hpd_notify, who starts link training, not some event.
>>>
>>> I think this driver should train the link during atomic_enable(), not
>>> hpd_notify() (or directly from the irq handler). The drm_bridge_funcs
>>> talk a bit about when the clocks and timing signals are supposed to be
>>> enabled. For example, struct drm_bridge_funcs::atomic_pre_enable() says
>>> the "display pipe (i.e.  clocks and timing signals) feeding this bridge
>>> will not yet be running when this callback is called". And struct
>>> drm_bridge_funcs::atomic_enable() says "this callback must enable the
>>> display link feeding the next bridge in the chain if there is one."
>>>
>>> That looks to me like link training, i.e. the display link, should
>>> happen in the enable path and not hpd_notify. It looks like link
>>> training could fail, but when that happens I believe the driver should
>>> call drm_connector_set_link_status_property() with
>>> DRM_MODE_LINK_STATUS_BAD. The two callers of that which exist in the
>>> tree also call drm_kms_helper_hotplug_event() or
>>> drm_kms_helper_connector_hotplug_event() after updating the link so that
>>> userspace knows to try again.
>>>
>>> It would be nice if there was some drm_bridge_set_link_status_bad() API
>>> that bridge drivers could use to signal that the link status is bad and
>>> call the hotplug helper. Maybe it could also record some diagnostics
>>> about which bridge failed to setup the link and stop the atomic_enable()
>>> chain for that connector.
>>
>> Doing link training when we get hpd instead of atomic_enable() is a
>> design choice we have been following for a while because for the case
>> when link training fails in atomic_enable() and setting the link status
>> property as you mentioned, the compositor needs to be able to handle
>> that and also needs to try with a different resolution or take some
>> other corrective action. We have seen many compositors not able to
>> handle this complexity.
> 
> The chrome compositor handles this case[1]. If the link status is set to
> bad and there are non-zero number of modes on a connected connector it
> resets the status to good to try again.
> 

Thanks for the link. Just resetting the property and trying again with the 
same mode is going to hit the same failure, so that alone is insufficient 
and doesn't sound right.

As documented here:

https://www.kernel.org/doc/html/latest/gpu/drm-kms.html#standard-connector-properties

"When user-space receives the hotplug uevent and detects a "BAD" 
link-status, the sink doesn't receive pixels anymore (e.g. the screen 
becomes completely black). The list of available modes may have changed. 
User-space is expected to pick a new mode if the current one has 
disappeared and perform a new modeset with link-status set to "GOOD" to 
re-enable the connector."
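As a rough model of that documented contract (every name below is an 
illustrative stand-in, not the real DRM API, and "link training" is 
reduced to a single bandwidth comparison):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the DRM link-status property values; the real ones are
 * DRM_MODE_LINK_STATUS_GOOD/BAD from the UAPI headers. */
enum link_status { LINK_STATUS_GOOD, LINK_STATUS_BAD };

struct mode { long clock_khz; };          /* pixel clock */

struct connector {
	enum link_status link_status;
	struct mode current_mode;
	long max_mode_clock_khz;          /* what the trained link can carry */
};

/* Driver side (modelled): modeset succeeds only if the link can carry
 * the mode; on failure the driver flags the link as BAD. */
static bool modeset(struct connector *c, struct mode m)
{
	if (m.clock_khz > c->max_mode_clock_khz) {
		c->link_status = LINK_STATUS_BAD;
		return false;
	}
	c->current_mode = m;
	c->link_status = LINK_STATUS_GOOD;
	return true;
}

/* Compositor side, per the quoted documentation: on a hotplug uevent
 * with link-status BAD, pick a (possibly lower) mode, reset the
 * property to GOOD and retry the modeset. */
static bool handle_hotplug(struct connector *c, struct mode *modes, int n)
{
	if (c->link_status != LINK_STATUS_BAD)
		return true;
	for (int i = 0; i < n; i++) {     /* modes sorted high to low */
		c->link_status = LINK_STATUS_GOOD;
		if (modeset(c, modes[i]))
			return true;
	}
	return false;                     /* nothing fits this link */
}
```

The retry only helps if the compositor actually steps down to a mode the 
degraded link can carry; merely resetting the property and replaying the 
same mode loops on the same failure.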

Picking a different mode is a reasonable way to try again, but even that 
might fail if the new mode requires the same link rate.

That's why, to reiterate what I mentioned, we need to see whether some sort 
of a fallback handshake exists or can be implemented. It will need 
compositor support as well as driver changes, e.g. to prune the failing mode.

We prioritized user experience here to make sure display enable won't 
fail, otherwise the user might keep waiting forever for the screen to 
come up. With the driver ensuring the link is trained before reporting to 
usermode, it is the safer option: the driver trains with the highest 
supported link rate / lane combination and prunes the modes which don't 
fit in that budget in dp_bridge_mode_valid().
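A minimal sketch of the kind of bandwidth check a mode_valid hook can 
apply once the link is trained (simplified to payload-rate arithmetic 
only, ignoring blanking, FEC and DSC; the helper names and the unit 
choices are illustrative, not the driver's actual code):

```c
#include <assert.h>
#include <stdbool.h>

/* Usable payload bandwidth of the trained link: per-lane symbol rate in
 * kHz (e.g. 270000 for HBR) times lane count, with 8 payload bits per
 * 10-bit symbol due to 8b/10b coding. */
static long link_bw_kbps(long rate_khz, int lanes)
{
	return rate_khz * lanes * 8;
}

/* Bandwidth a mode needs: pixel clock times bits per pixel. */
static long mode_bw_kbps(long pixclk_khz, int bpp)
{
	return pixclk_khz * bpp;
}

/* Reject any mode whose required bandwidth exceeds what the trained
 * link carries, so usermode never sees a mode that cannot light up. */
static bool mode_valid(long pixclk_khz, int bpp, long rate_khz, int lanes)
{
	return mode_bw_kbps(pixclk_khz, bpp) <= link_bw_kbps(rate_khz, lanes);
}
```

With this filtering in place, any mode the compositor can pick is one the 
already-trained link is known to carry, which is why the hotplug is only 
sent after training succeeds.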

Until we validate this, I would prefer to keep this change out of this 
series.

>> So the design sends the hotplug to usermode only
>> after link training succeeds.
> 
> I suspect this is why my trogdor device turns off after rebooting when I
> apply a ChromeOS update with the lid closed and DP connected. Userspace
> wants to know that a display is connected, but this driver is still link
> training by the time userspace boots up so we don't see any drm
> connector indicating status is connected. No drm connectors connected
> means the OS should shutdown.
> 

Interesting use case, but I am not sure that is what's happening until we 
debug it. Why should the OS shut down if no connector is in the 
"connected" state? Agreed, the display will be off, but shouldn't the 
compositor stay alive in case it receives a hotplug? The user (in this 
case you) never turned the device off, so why should the OS shut down?

>>
>> I do not think we should change this design unless prototyped with an
>> existing compositor such as chrome or android at this point.
> 
> Is this driver used with android?
> 

There are some internal efforts ongoing with prototyping this but I 
cannot comment more on this right now.

> [1] https://source.chromium.org/chromium/chromium/src/+/main:ui/ozone/platform/drm/gpu/hardware_display_plane_manager_atomic.cc;l=114;drc=67520ac99db89395b10f2ab728b540eef0da8292


More information about the dri-devel mailing list