[Intel-gfx] [PATCH 01/25] dma-fence: basic lockdep annotations
Daniel Vetter
daniel at ffwll.ch
Thu Jul 9 07:52:16 UTC 2020
On Thu, Jul 09, 2020 at 08:32:41AM +0100, Daniel Stone wrote:
> Hi,
>
> On Wed, 8 Jul 2020 at 16:13, Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> > On Wed, Jul 8, 2020 at 4:57 PM Christian König <christian.koenig at amd.com> wrote:
> > > Could we merge this controlled by a separate config option?
> > >
> > > This way we could have the checks upstream without having to fix all the
> > > stuff before we do this?
> >
> > Since it's fully opt-in annotations nothing blows up if we don't merge
> > any annotations. So we could start merging the first 3 patches. After
> > that the fun starts ...
> >
> > My rough idea was that first I'd try to tackle display, thus far
> > there's 2 actual issues in drivers:
> > - amdgpu has some dma_resv_lock in commit_tail, plus a kmalloc. I
> > think those should be fairly easy to fix (I'd try a stab at them even)
> > - vmwgfx has a full on locking inversion with dma_resv_lock in
> > commit_tail, and that one is functional. Not just reading something
> > which we can safely assume to be invariant anyway (like the tmz flag
> > for amdgpu, or whatever it was).
> >
> > I've done a pile more annotations patches for other atomic drivers
> > now, so hopefully that flushes out any remaining offenders here. Since
> > some of the annotations are in helper code worst case we might need a
> > dev->mode_config.broken_atomic_commit flag to disable them. At least
> > for now I have 0 plans to merge any of these while there's known
> > unsolved issues. Maybe if some drivers take forever to get fixed we
> > can then apply some duct-tape for the atomic helper annotation patch.
> > Instead of a flag we can also copypasta the atomic_commit_tail hook,
> > leaving the annotations out and adding a huge warning about that.
>
> How about an opt-in drm_driver DRIVER_DEADLOCK_HAPPY flag? At first
> this could just disable the annotations and nothing else, but as we
> see the annotations gaining real-world testing and maturity, we could
> eventually make it taint the kernel.
You can do that pretty much per-driver, since the annotations are pretty
much per-driver. No annotations in your code, no lockdep splat. Only if
there's some dma_fence_begin/end_signalling() calls is there even the
chance of a problem.
E.g. this round has the i915 patch dropped and *traraaaa* intel-gfx-ci is
happy (or well at least a lot happier, there's some noise in there that's
probably not from my stuff).
So I guess if amd wants this, we could do an DRM_AMDGPU_MOAR_LOCKDEP
Kconfig or similar. I haven't tested, but I think as long as we don't
merge any of the amdgpu specific patches, there's no splat in amdgpu. So
with that I think that's plenty enough opt-in for each driver. The only
problem is a bit shared helper code like atomic helpers and drm scheduler.
There we might need some opt-out (I don't think merging makes sense when
most of the users are still broken).
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the amd-gfx
mailing list