[Intel-gfx] [PATCH 1/2] drm: refernce count event->completion

Jani Nikula jani.nikula at linux.intel.com
Thu Feb 9 14:39:29 UTC 2017


On Wed, 04 Jan 2017, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Wed, Dec 21, 2016 at 12:08:45PM +0100, Maarten Lankhorst wrote:
>> Op 21-12-16 om 11:36 schreef Chris Wilson:
>> > On Wed, Dec 21, 2016 at 11:23:30AM +0100, Daniel Vetter wrote:
>> >> When writing the generic nonblocking commit code I assumed that
>> >> through clever lifetime management I can assure that the completion
>> >> (stored in drm_crtc_commit) only gets freed after it is completed. And
>> >> that worked.
>> >>
>> >> I also wanted to make nonblocking helpers resilient against driver
>> >> bugs, by having timeouts everywhere. And that worked too.
>> >>
>> >> Unfortunately taking boths things together results in oopses :( Well,
>> >> at least sometimes: What seems to happen is that the drm event hangs
>> >> around forever stuck in limbo land. The nonblocking helpers eventually
>> >> time out, move on and release it. Now the bug I tested all this
>> >> against is drivers that just entirely fail to deliver the vblank
>> >> events like they should, and in those cases the event is simply
>> >> leaked. But what seems to happen, at least sometimes, on i915 is that
>> >> the event is set up correctly, but somohow the vblank fails to fire in
>> >> time. Which means the event isn't leaked, it's still there waiting for
>> >> evevntually a vblank to fire. That tends to happen when re-enabling the
>> >> pipe, and then the trap springs and the kernel oopses.
>> >>
>> >> The correct fix here is simply to refcount the crtc commit to make
>> >> sure that the event sticks around even for drivers which only
>> >> sometimes fail to deliver vblanks for some arbitrary reasons. Since
>> >> crtc commits are already refcounted that's easy to do.
>> > Or make the event a part of the atomic state?
>> > -Chris
>> >
>> afaict crtc commit is already taken to wait for completion, so this patch makes sense.
>> 
>> There's just a minor typo in the subject. :)
>> Not sure that release_commit should be a function pointer, regardless..
>> 
>> Reviewed-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>
> It didn't help the bug reporters against oopses (but the reporters are
> supremely confusing, I have no idea what's really being tested, the
> bugzilla is a mess), but I still think the patch is useful for more
> robuestness, I dropped the cc: stable and applied it to drm-misc.

Agreed on the bug [1] being a mess. However, the bug has a reliable
bisect result, the revert was posted by some of the reporters on the
lists and in the bug, and now something that will not help anyone in
v4.9 or v4.10 was pushed. :(

BR,
Jani.


[1] https://bugs.freedesktop.org/show_bug.cgi?id=96781

-- 
Jani Nikula, Intel Open Source Technology Center


More information about the Intel-gfx mailing list