[PATCH v1] drm/ci: enable lockdep detection

Vignesh Raman vignesh.raman at collabora.com
Fri Sep 13 12:19:02 UTC 2024


Hi Rob, Helen,

On 14/08/24 23:11, Rob Clark wrote:
> On Wed, Aug 14, 2024 at 2:42 AM Vignesh Raman
> <vignesh.raman at collabora.com> wrote:
>>
>> Hi Helen,
>>
>> On 14/08/24 01:44, Helen Mae Koike Fornazier wrote:
>>>
>>>
>>>
>>>
>>> ---- On Tue, 13 Aug 2024 02:26:48 -0300 Vignesh Raman  wrote ---
>>>
>>>    > Hi Helen,
>>>    >
>>>    > On 13/08/24 01:47, Helen Mae Koike Fornazier wrote:
>>>    > >
>>>    > > Hi Vignesh,
>>>    > >
>>>    > > Thanks for your patch.
>>>    > >
>>>    > >
>>>    > > ---- On Mon, 12 Aug 2024 08:20:28 -0300 Vignesh Raman  wrote ---
>>>    > >
>>>    > >   > We have enabled PROVE_LOCKING (which enables LOCKDEP) in drm-ci.
>>>    > >   > This will output warnings when kernel locking errors are encountered
>>>    > >   > and will continue executing tests. To detect if lockdep has been
>>>    > >   > triggered, check the debug_locks value in /proc/lockdep_stats after
>>>    > >   > the tests have run. When debug_locks is 0, it indicates that lockdep
>>>    > >   > has detected issues and turned itself off. So check this value and
>>>    > >   > exit with an error if lockdep is detected.
>>>    > >
>>>    > > Should we exit with an error? Or with a warning? (GitLab-CI supports that).
>>>    > > Well, I guess it is serious enough.
>>>    >
>>>    > I think we can exit with an error since we check the status at the end
>>>    > of the tests.
>>>
>>> I mean, we can exit with a specific error and configure this specific error in gitlab-ci to be a warning,
>>> so the job will be yellow and not red.
>>>
>>> But maybe the lockdep issue should be a strong error.
>>
>> Yes agree. We can exit with an error for lockdep issue instead of a warning.
> 
> I think that is too strong, lockdep can warn about things which can
> never happen in practice.  (We've never completely solved some of the
> things that lockdep complains about in runpm vs shrinker reclaim.)
> 
> Surfacing it as a warning is fine.

Will send another patch which will exit with an error if lockdep is 
detected and configure it as a warning in GitLab CI.

Regards,
Vignesh

> 
> BR,
> -R
> 
>>>
>>>    >
>>>    > >
>>>    > > Should we also track on the xfail folder? So we can annotate those errors as well?
>>>    >
>>>    > Do you mean reporting this error in expectation files?
>>>
>>> I wonder if there will be cases were we are getting this error and we should ignore it, so in the code
>>> we should check the xfail files to see if we should exit with an error or ignore it.
>>>
>>> For instance, if we have a case where we are having this error, and it is flaky, we might want to add it
>>> to the flakes file list.
>>>
>>> Maybe this is not the case, I'm just wondering.
>>
>>
>> The tests are passing but log shows lockdep warning
>> (https://gitlab.freedesktop.org/vigneshraman/linux/-/jobs/62177711).
>>
>> Moreover if the lockdep warning is emitted, lockdep will not continue to
>> run and there is no need to check this warning for each tests.
>> So added the check at the end of the tests.
>>
>>>
>>>
>>>    >
>>>    > > Did you have an entire pipeline with this? To see if everything is still green?
>>>    >
>>>    > Yes. https://gitlab.freedesktop.org/vigneshraman/linux/-/jobs/62177711
>>>    >
>>>    > This is a test branch in which I reverted a fix for the lockdep issue.
>>>    > We see 'WARNING: bad unlock balance detected!' in logs and pipeline is
>>>    > still green.
>>>
>>> But with your patch, it would red right?
>>
>> Yes it would fail and the pipeline will be red.
>>
>>> With the current patch, is the pipeline still all green?
>>
>> With this current patch, it will fail.
>> Pipeline link to show lockdep_stats before and after tests,
>> https://gitlab.freedesktop.org/vigneshraman/linux/-/pipelines/1246721
>>
>> Regards,
>> Vignesh
>>
>>>
>>> Regards,
>>> Helen
>>>
>>>    >
>>>    > Regards,
>>>    > Vignesh
>>>    >
>>>    > >
>>>    > > Helen
>>>    > >
>>>    > >   >
>>>    > >   > Signed-off-by: Vignesh Raman vignesh.raman at collabora.com>
>>>    > >   > ---
>>>    > >   >
>>>    > >   > v1:
>>>    > >   >  - Pipeline link to show lockdep_stats before and after tests,
>>>    > >   > https://gitlab.freedesktop.org/vigneshraman/linux/-/pipelines/1246721
>>>    > >   >
>>>    > >   > ---
>>>    > >   >  drivers/gpu/drm/ci/igt_runner.sh | 11 +++++++++++
>>>    > >   >  1 file changed, 11 insertions(+)
>>>    > >   >
>>>    > >   > diff --git a/drivers/gpu/drm/ci/igt_runner.sh b/drivers/gpu/drm/ci/igt_runner.sh
>>>    > >   > index f38836ec837c..d2c043cd8c6a 100755
>>>    > >   > --- a/drivers/gpu/drm/ci/igt_runner.sh
>>>    > >   > +++ b/drivers/gpu/drm/ci/igt_runner.sh
>>>    > >   > @@ -85,6 +85,17 @@ deqp-runner junit \
>>>    > >   >  --limit 50 \
>>>    > >   >  --template "See https://$CI_PROJECT_ROOT_NAMESPACE.pages.freedesktop.org/-/$CI_PROJECT_NAME/-/jobs/$CI_JOB_ID/artifacts/results/{{testcase}}.xml"
>>>    > >   >
>>>    > >   > +# Check if /proc/lockdep_stats exists
>>>    > >   > +if [ -f /proc/lockdep_stats ]; then
>>>    > >   > +    # If debug_locks is 0, it indicates lockdep is detected and it turns itself off.
>>>    > >   > +    debug_locks=$(grep 'debug_locks:' /proc/lockdep_stats | awk '{print $2}')
>>>    > >   > +    if [ "$debug_locks" -eq 0 ]; then
>>>    > >   > +        echo "LOCKDEP issue detected. Please check dmesg logs for more information."
>>>    > >   > +        cat /proc/lockdep_stats
>>>    > >   > +        ret=1
>>>    > >   > +    fi
>>>    > >   > +fi
>>>    > >   > +
>>>    > >   >  # Store the results also in the simpler format used by the runner in ChromeOS CI
>>>    > >   >  #sed -r 's/(dmesg-warn|pass)/success/g' /results/results.txt > /results/results_simple.txt
>>>    > >   >
>>>    > >   > --
>>>    > >   > 2.43.0
>>>    > >   >
>>>    > >   >
>>>    >


More information about the dri-devel mailing list