[PATCH v2] drm/msm/dpu: make "vblank timeout" more useful

Abhinav Kumar quic_abhinavk at quicinc.com
Wed Feb 14 20:35:58 UTC 2024



On 2/14/2024 11:20 AM, Dmitry Baryshkov wrote:
> On Wed, 14 Feb 2024 at 20:02, Abhinav Kumar <quic_abhinavk at quicinc.com> wrote:
>>
>>
>>
>> On 2/8/2024 6:50 AM, Dmitry Baryshkov wrote:
>>> We have several reports of vblank timeout messages. However after some
>>> debugging it was found that there might be different causes to that.
>>> To allow us to identify the DPU block that gets stuck, include the
>>> actual CTL_FLUSH value into the timeout message and trigger the devcore
>>> snapshot capture.
>>>
>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov at linaro.org>
>>> ---
>>> Changes in v2:
>>> - Added a call to msm_disp_snapshot_state() to trigger devcore dump
>>>     (Abhinav)
>>> - Link to v1: https://lore.kernel.org/r/20240106-fd-dpu-debug-timeout-v1-1-6d9762884641@linaro.org
>>> ---
>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
>>> index d0f56c5c4cce..a8d6165b3c0a 100644
>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
>>> @@ -489,7 +489,8 @@ static int dpu_encoder_phys_vid_wait_for_commit_done(
>>>                (hw_ctl->ops.get_flush_register(hw_ctl) == 0),
>>>                msecs_to_jiffies(50));
>>>        if (ret <= 0) {
>>> -             DPU_ERROR("vblank timeout\n");
>>> +             DPU_ERROR("vblank timeout: %x\n", hw_ctl->ops.get_flush_register(hw_ctl));
>>> +             msm_disp_snapshot_state(phys_enc->parent->dev);
>>
>>
>> There is no rate limiting in this piece of code unfortunately. So this
>> will flood the number of snapshots.
> 
> Well... Yes and no. The devcoredump will destroy other snapshots if
> there is a pending one. So only the console will be flooded and only
> in case when MSM_DISP_SNAPSHOT_DUMP_IN_CONSOLE is enabled.
> 

Yes, true but at the same time this makes it hard to capture a good dump 
as potentially every vblank you could timeout so this destroy/create 
cycle wont end.

>>
>> Short-term solution is you can go with a vblank_timeout_cnt and reset it
>> in the enable() like other similar error counters.
>>
>> long-term solution is we need to centralize these error locations to one
>> single dpu_encoder_error_handler() with a single counter and the error
>> handler will print out the error code along with the snapshot instead of
>> the snapshot being called from all over the place.
>>
>>
>>
>>>                return -ETIMEDOUT;
>>>        }
>>>
>>>
>>> ---
>>> base-commit: 39676dfe52331dba909c617f213fdb21015c8d10
>>> change-id: 20240106-fd-dpu-debug-timeout-e917f0bc8063
>>>
>>> Best regards,
> 
> 
> 


More information about the dri-devel mailing list