[PATCH] drm/amd/display: Fix pageflip event race condition for DCN.

Kazlauskas, Nicholas nicholas.kazlauskas at amd.com
Mon Mar 2 13:57:05 UTC 2020


On 2020-03-02 1:17 a.m., Mario Kleiner wrote:
> Commit '16f17eda8bad ("drm/amd/display: Send vblank and user
> events at vsartup for DCN")' introduces a new way of pageflip
> completion handling for DCN, and some trouble.
> 
> The current implementation introduces a race condition, which
> can cause pageflip completion events to be sent out one vblank
> too early, thereby confusing userspace and causing flicker:
> 
> prepare_flip_isr():
> 
> 1. Pageflip programming takes the ddev->event_lock.
> 2. Sets acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED
> 3. Releases ddev->event_lock.
> 
> --> Deadline for surface address regs double-buffering passes on
>      target pipe.
> 
> 4. dc_commit_updates_for_stream() MMIO programs the new pageflip
>     into hw, but too late for current vblank.
> 
> => pflip_status == AMDGPU_FLIP_SUBMITTED, but flip won't complete
>     in current vblank due to missing the double-buffering deadline
>     by a tiny bit.
> 
> 5. VSTARTUP trigger point in vblank is reached, VSTARTUP irq fires,
>     dm_dcn_crtc_high_irq() gets called.
> 
> 6. Detects pflip_status == AMDGPU_FLIP_SUBMITTED and assumes the
>     pageflip has been completed/will complete in this vblank and
>     sends out pageflip completion event to userspace and resets
>     pflip_status = AMDGPU_FLIP_NONE.
> 
> => Flip completion event sent out one vblank too early.
> 
> This behaviour has been observed during my testing with measurement
> hardware a couple of time.
> 
> The commit message says that the extra flip event code was added to
> dm_dcn_crtc_high_irq() to prevent missing to send out pageflip events
> in case the pflip irq doesn't fire, because the "DCH HUBP" component
> is clock gated and doesn't fire pflip irqs in that state. Also that
> this clock gating may happen if no planes are active. This suggests
> that the problem addressed by that commit can't happen if planes
> are active.
> 
> The proposed solution is therefore to only execute the extra pflip
> completion code iff the count of active planes is zero and otherwise
> leave pflip completion handling to the pflip irq handler, for a
> more race-free experience.
> 
> Note that i don't know if this fixes the problem the original commit
> tried to address, as i don't know what the test scenario was. It
> does fix the observed too early pageflip events though and points
> out the problem introduced.

This looks like a valid race condition that should be addressed.

Unfortunately this also doesn't fix the problem the original commit was 
trying to address.

HUBP interrupts only trigger when it's not clock gated. But there are 
cases (for example, PSR) where the HUBP can be clock gated but the 
active plane count is greater than zero.

The clock gating switch can typically happens outside of x86 control 
flow so we're not really going to understand in advance whether or not 
we'll be able to receive the pflip IRQ.

While we do have plane status/flip pending bits available to check in 
the VSTARTUP IRQ handler, this obviously doesn't work for resolving the 
core of the issue - that we probably don't know whether or not the HUBP 
will be on before sending out the event.

Maybe we can guess based on what features are enabled.

Regards,
Nicholas Kazlauskas

> 
> Fixes: 16f17eda8bad ("drm/amd/display: Send vblank and user events at vsartup for DCN")
> Signed-off-by: Mario Kleiner <mario.kleiner.de at gmail.com>
> ---
>   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 18 +++++++++++++++---
>   1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 63e8a12a74bc..3502d6d52160 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -522,8 +522,9 @@ static void dm_dcn_crtc_high_irq(void *interrupt_params)
>   
>   	acrtc_state = to_dm_crtc_state(acrtc->base.state);
>   
> -	DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d\n", acrtc->crtc_id,
> -				amdgpu_dm_vrr_active(acrtc_state));
> +	DRM_DEBUG_DRIVER("crtc:%d, vupdate-vrr:%d, planes:%d\n", acrtc->crtc_id,
> +			 amdgpu_dm_vrr_active(acrtc_state),
> +			 acrtc_state->active_planes);
>   
>   	amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
>   	drm_crtc_handle_vblank(&acrtc->base);
> @@ -543,7 +544,18 @@ static void dm_dcn_crtc_high_irq(void *interrupt_params)
>   			&acrtc_state->vrr_params.adjust);
>   	}
>   
> -	if (acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED) {
> +	/*
> +	 * If there aren't any active_planes then DCH HUBP may be clock-gated.
> +	 * In that case, pageflip completion interrupts won't fire and pageflip
> +	 * completion events won't get delivered. Prevent this by sending
> +	 * pending pageflip events from here if a flip is still pending.
> +	 *
> +	 * If any planes are enabled, use dm_pflip_high_irq() instead, to
> +	 * avoid race conditions between flip programming and completion,
> +	 * which could cause too early flip completion events.
> +	 */
> +	if (acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED &&
> +	    acrtc_state->active_planes == 0) {
>   		if (acrtc->event) {
>   			drm_crtc_send_vblank_event(&acrtc->base, acrtc->event);
>   			acrtc->event = NULL;
> 



More information about the dri-devel mailing list