[PATCH] drm/amd/display: take dc_lock in short pulse handler only
Aurabindo Pillai
aurabindo.pillai at amd.com
Wed May 19 21:01:51 UTC 2021
On 2021-05-19 4:59 p.m., Kazlauskas, Nicholas wrote:
> On 2021-05-19 4:55 p.m., Aurabindo Pillai wrote:
>> [Why]
>> Conditions that end up modifying the global dc state must be locked.
>> However, during mst allocate payload sequence, lock is already taken.
>> With StarTech 1.2 DP hub, we get an HPD RX interrupt for a reason other
>> than to indicate down reply availability right after sending payload
>> allocation. The handler again takes dc lock before calling the
>> dc's HPD RX handler. Due to this contention, the DRM thread which waits
>> for MST down reply never gets a chance to finish its waiting
>> successfully and ends up timing out. Once the lock is released, the hpd
>> rx handler fires and goes ahead to read from the MST HUB, but now its
>> too late and the HUB doesnt lightup all displays since DRM lacks error
>> handling when payload allocation fails.
>>
>> [How]
>> Take lock only if there is a change in link status or if automated test
>> pattern bit is set. The latter fixes the null pointer dereference when
>> running certain DP Link Layer Compliance test.
>>
>> Signed-off-by: Aurabindo Pillai <aurabindo.pillai at amd.com>
>
> Discussed this a bit offline and I'd *really* like the proper interface in sooner rather than later.
>
> Conditional locking is almost always a sign of a bug, in this case we know it's OK but someone can
> change the function underneath later without understanding that we're duplicating some of the
> checking logic in the upper layer.
>
> I don't think the code changes enough in this area for this to happen (as it's spec based), but
> please be mindful and consider splitting the checking logic (which is thread safe) out with the link
> loss logic (the functional bit, that isn't thread safe).
>
> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas at amd.com>
Thanks for the review! I'll get back to fixing this properly in a few weeks.
>
> Regards,
> Nicholas Kazlauskas
>
>> ---
>> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 +++++++++++++++++--
>> .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 +-
>> .../gpu/drm/amd/display/dc/inc/dc_link_dp.h | 4 ++++
>> 3 files changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> index e79910cc179c..2c9d099adfc2 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> @@ -28,6 +28,7 @@
>> #include "dm_services_types.h"
>> #include "dc.h"
>> +#include "dc_link_dp.h"
>> #include "dc/inc/core_types.h"
>> #include "dal_asic_id.h"
>> #include "dmub/dmub_srv.h"
>> @@ -2740,6 +2741,7 @@ static void handle_hpd_rx_irq(void *param)
>> enum dc_connection_type new_connection_type = dc_connection_none;
>> struct amdgpu_device *adev = drm_to_adev(dev);
>> union hpd_irq_data hpd_irq_data;
>> + bool lock_flag = 0;
>> memset(&hpd_irq_data, 0, sizeof(hpd_irq_data));
>> @@ -2769,15 +2771,28 @@ static void handle_hpd_rx_irq(void *param)
>> }
>> }
>> - if (!amdgpu_in_reset(adev)) {
>> + /*
>> + * TODO: We need the lock to avoid touching DC state while it's being
>> + * modified during automated compliance testing, or when link loss
>> + * happens. While this should be split into subhandlers and proper
>> + * interfaces to avoid having to conditionally lock like this in the
>> + * outer layer, we need this workaround temporarily to allow MST
>> + * lightup in some scenarios to avoid timeout.
>> + */
>> + if (!amdgpu_in_reset(adev) &&
>> + (hpd_rx_irq_check_link_loss_status(dc_link, &hpd_irq_data) ||
>> + hpd_irq_data.bytes.device_service_irq.bits.AUTOMATED_TEST)) {
>> mutex_lock(&adev->dm.dc_lock);
>> + lock_flag = 1;
>> + }
>> +
>> #ifdef CONFIG_DRM_AMD_DC_HDCP
>> result = dc_link_handle_hpd_rx_irq(dc_link, &hpd_irq_data, NULL);
>> #else
>> result = dc_link_handle_hpd_rx_irq(dc_link, NULL, NULL);
>> #endif
>> + if (!amdgpu_in_reset(adev) && lock_flag)
>> mutex_unlock(&adev->dm.dc_lock);
>> - }
>> out:
>> if (result && !is_mst_root_connector) {
>> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> index 9e08410bfdfd..32fb9cdbd980 100644
>> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> @@ -2070,7 +2070,7 @@ enum dc_status read_hpd_rx_irq_data(
>> return retval;
>> }
>> -static bool hpd_rx_irq_check_link_loss_status(
>> +bool hpd_rx_irq_check_link_loss_status(
>> struct dc_link *link,
>> union hpd_irq_data *hpd_irq_dpcd_data)
>> {
>> diff --git a/drivers/gpu/drm/amd/display/dc/inc/dc_link_dp.h
>> b/drivers/gpu/drm/amd/display/dc/inc/dc_link_dp.h
>> index ffc3f2c63db8..7dd8bca542b9 100644
>> --- a/drivers/gpu/drm/amd/display/dc/inc/dc_link_dp.h
>> +++ b/drivers/gpu/drm/amd/display/dc/inc/dc_link_dp.h
>> @@ -68,6 +68,10 @@ bool perform_link_training_with_retries(
>> enum signal_type signal,
>> bool do_fallback);
>> +bool hpd_rx_irq_check_link_loss_status(
>> + struct dc_link *link,
>> + union hpd_irq_data *hpd_irq_dpcd_data);
>> +
>> bool is_mst_supported(struct dc_link *link);
>> bool detect_dp_sink_caps(struct dc_link *link);
>>
>
--
Regards,
Aurabindo Pillai
More information about the amd-gfx
mailing list