[PATCH v2 3/3] drm/msm/dp: check main link status before start aux read

khsieh at codeaurora.org khsieh at codeaurora.org
Thu Apr 15 22:02:40 UTC 2021


On 2021-04-15 13:06, Stephen Boyd wrote:
> Quoting khsieh at codeaurora.org (2021-04-15 10:37:29)
>> On 2021-04-14 14:09, Stephen Boyd wrote:
>> > Quoting Kuogee Hsieh (2021-04-13 16:11:44)
>> >> Make sure main link is in connection state before start aux
>> >> read/write operation to avoid unnecessary long delay due to
>> >> main link had been unplugged.
>> >>
>> >> Signed-off-by: Kuogee Hsieh <khsieh at codeaurora.org>
>> >> ---
>> >>  drivers/gpu/drm/msm/dp/dp_aux.c  |  5 +++++
>> >>  drivers/gpu/drm/msm/dp/dp_link.c | 20 +++++++++++++++-----
>> >>  2 files changed, 20 insertions(+), 5 deletions(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c
>> >> b/drivers/gpu/drm/msm/dp/dp_aux.c
>> >> index 7c22bfe..fae3806 100644
>> >> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
>> >> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
>> >> @@ -343,6 +343,11 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux
>> >> *dp_aux,
>> >>
>> >>         mutex_lock(&aux->mutex);
>> >>
>> >> +       if (!dp_catalog_link_is_connected(aux->catalog)) {
>> >> +               ret = -ETIMEDOUT;
>> >> +               goto unlock_exit;
>> >> +       }
>> >
>> > I get a crash here sometimes when plugging and unplugging an apple HDMI
>> > dongle during suspend/resume. I guess device power management
>> > (dp_pm_suspend()) is happening in parallel with aux transfers from the
>> > kthread. Why doesn't the aux communication start reporting NAKs once
>> > the
>> > cable is disconnected?
>> >
>> > [  366.210058] hdmi-audio-codec hdmi-audio-codec.15.auto: calling
>> > platform_pm_suspend+0x0/0x60 @ 7175, parent:
>> > ae90000.displayport-controller
>> > [  366.222825] hdmi-audio-codec hdmi-audio-codec.15.auto:
>> > platform_pm_suspend+0x0/0x60 returned 0 after 1 usecs
>> > [  366.232939] msm-dp-display ae90000.displayport-controller: calling
>> > dp_pm_suspend+0x0/0x80 @ 7175, parent: ae00000.mdss
>> > [  366.244006] msm-dp-display ae90000.displayport-controller:
>> > dp_pm_suspend+0x0/0x80 returned 0 after 79 usecs
>> > [  366.254025] msm_dsi ae94000.dsi: calling
>> > pm_runtime_force_suspend+0x0/0xb4 @ 7175, parent: ae00000.mdss
>> > [  366.263669] msm_dsi ae94000.dsi: pm_runtime_force_suspend+0x0/0xb4
>> > returned 0 after 0 usecs
>> > [  366.272290] panel-simple panel: calling
>> > platform_pm_suspend+0x0/0x60 @ 7175, parent: platform
>> > [  366.272501] ti_sn65dsi86 2-002d: calling
>> > pm_runtime_force_suspend+0x0/0xb4 @ 176, parent: i2c-2
>> > [  366.281055] panel-simple panel: platform_pm_suspend+0x0/0x60
>> > returned 0 after 0 usecs
>> > [  366.281081] leds mmc1::: calling led_suspend+0x0/0x4c @ 7175,
>> > parent: 7c4000.sdhci
>> > [  366.290065] ti_sn65dsi86 2-002d: pm_runtime_force_suspend+0x0/0xb4
>> > returned 0 after 0 usecs
>> > [  366.298046] leds mmc1::: led_suspend+0x0/0x4c returned 0 after 1
>> > usecs
>> > [  366.302994] Internal error: synchronous external abort: 96000010
>> > [#1] PREEMPT SMP
>> > [  366.303006] Modules linked in: vhost_vsock
>> > vmw_vsock_virtio_transport_common vsock vhost rfcomm algif_hash
>> > algif_skcipher af_alg xt_cgroup uinput xt_MASQUERADE venus_enc
>> > hci_uart venus_dec btqca cros_ec_typec typec bluetooth qcom_spmi_adc5
>> > snd_soc_sc7180 qcom_spmi_temp_alarm ecdh_generic qcom_spmi_adc_tm5
>> > qcom_vadc_common snd_soc_qcom_common ecc snd_soc_rt5682_i2c
>> > snd_soc_rt5682 snd_soc_rl6231 venus_core v4l2_mem2mem
>> > snd_soc_lpass_sc7180 snd_soc_lpass_hdmi snd_soc_lpass_cpu
>> > snd_soc_lpass_platform snd_soc_max98357a fuse iio_trig_sysfs
>> > cros_ec_sensors cros_ec_sensors_ring cros_ec_lid_angle
>> > cros_ec_sensors_core industrialio_triggered_buffer kfifo_buf
>> > cros_ec_sensorhub lzo_rle lzo_compress zram ath10k_snoc ath10k_core
>> > ath mac80211 cfg80211 cdc_ether usbnet r8152 mii uvcvideo
>> > videobuf2_vmalloc joydev
>> > [  366.303211] CPU: 0 PID: 224 Comm: dp_hpd_handler Not tainted 5.4.109
>> > #27
>> > [  366.303216] Hardware name: Google Lazor (rev3+) with KB Backlight
>> > (DT)
>> > [  366.303225] pstate: 60c00009 (nZCv daif +PAN +UAO)
>> > [  366.303234] pc : dp_catalog_link_is_connected+0x20/0x34
>> > [  366.303244] lr : dp_aux_transfer+0x44/0x284
>> > [  366.303250] sp : ffffffc011bfbbe0
>> > [  366.303254] x29: ffffffc011bfbbe0 x28: aaaaaaaaaaaaaaaa
>> > [  366.303262] x27: 000000000000000c x26: ffffff896f8212bc
>> > [  366.303269] x25: 0000000000000001 x24: 0000000000000001
>> > [  366.303275] x23: 0000000000000020 x22: ffffff896ff82118
>> > [  366.303282] x21: ffffffc011bfbc50 x20: ffffff896ff82090
>> > [  366.303289] x19: ffffff896ffc3598 x18: 0000000000000000
>> > [  366.303295] x17: 0000000000000000 x16: 0000000000000001
>> > [  366.303303] x15: 0000000000000000 x14: 00000000000002ef
>> > [  366.303311] x13: 0000000000000400 x12: ffffffeb896ea060
>> > [  366.303317] x11: 0000000000000000 x10: 0000000000000000
>> > [  366.303324] x9 : ffffff896f061100 x8 : ffffffc010582204
>> > [  366.303331] x7 : 000000b2b5593519 x6 : 00000000003033ff
>> > [  366.303338] x5 : 0000000000000000 x4 : 0000000000000001
>> > [  366.303345] x3 : ffffff896ff432a1 x2 : 0000000000000000
>> > [  366.303352] x1 : 0000000000000119 x0 : ffffff896ffc3598
>> > [  366.303360] Call trace:
>> > [  366.303367]  dp_catalog_link_is_connected+0x20/0x34
>> > [  366.303374]  dp_aux_transfer+0x44/0x284mutex.
>> > [  366.303387]  drm_dp_dpcd_access+0x8c/0x11c
>> > [  366.303393]  drm_dp_dpcd_read+0x64/0x10c
>> > [  366.303399]  dp_link_process_request+0x94/0xaf8
>> > [  366.303405]  dp_display_usbpd_attention_cb+0x38/0x140
>> > [  366.303411]  hpd_event_thread+0x3f0/0x48c
>> > [  366.303426]  kthread+0x140/0x17c
>> > [  366.303439]  ret_from_fork+0x10/0x18
>> > [  366.303458] Code: d503201f f85f0268 f9400508 91081108 (b9400108)
>> >
>> it needs to make sure core clock enabled before access dp ctrl 
>> registers
>> I am look into it
> 
> Is it really necessary to have this patch at all? I think there are
> bigger problems with suspend/resume of the DP driver in relation to the
> kthread stopping. I hope that the aux channel would start NAKing
> transfers once the cable is disconnected too, so that we don't need to
> do an extra check for each aux transfer.

I am working on duplicate this problem, but it is not happen on me yet 
so far.
 From kernel dump, i can see it crash at dp_irq_hdp_handle() after 
suspended.
dp_irq_hpd_handle and dp_pm_suspend() are serialized by event_mutex.

After suspend, ahb clock is disabled.
Hence next dp_catalog_link_is_connected() crash at acess dp ctrl 
registers.


aux channel does not do NAKing immediately if unplugged. Therefore 
aux_transfer will wait until timeout (HZ/4).
worst, drm_dp_dpcd_access() will retry 32 times before return dpcd 
read/write failed.
This patch try to eliminate the time spinning on waiting for timeout 32 
times.

This patch does not necessary need.
I can drop it.




More information about the dri-devel mailing list