[PATCH 5/5] drm/i915: Solve the GPU reset vs. modeset deadlocks with an rw_semaphore

Daniel Vetter daniel at ffwll.ch
Fri Jun 30 18:23:58 UTC 2017


On Fri, Jun 30, 2017 at 5:44 PM, Ville Syrjälä
<ville.syrjala at linux.intel.com> wrote:
>> And if the GEM folks insist the old behavior can't be restored, then we
>> just need a tailor-made get-out-of-jail card for gen4 gpu reset somewhere
>> in i915_sw_fence. Force-completing all render requests atomic updates
>> depend upon is imo the simplest solution to this, and we've had a driver
>> that worked like that for years.
>
> And it used to break all the time. I think we've had to fix it at least
> three times by now. So I tend to think it's better to fix it in a way
> that won't break so easily.

Why exactly is making the atomic code massively more tricky the easy
fix? That's the part I don't get. Yes, it got broken a bunch because no
one runs CI and everyone forgets that gen3/4 reset the display in gpu
reset, but in the end we do have a dependency loop, and either the
modeset side or the render side needs to bail out and cancel its
async stuff (whether that's a request or a nonblocking flip/atomic
commit doesn't matter). In my opinion, cancelling the request (even if
we're clever and only cancel the requests for the modeset waiters,
which isn't too hard to pull off) seems about the simplest option.
Especially since we need that code anyway, even TDR can't save
everything and resubmit under all circumstances (at least the buggy
batch can't be resubmitted).
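
To make the "only cancel the requests for the modeset waiters" idea
concrete, here is a minimal toy sketch in plain C. It is not i915 code:
every name in it (toy_request, toy_cancel_for_modeset_waiters,
toy_commit_unblocked) is hypothetical and made up for illustration, and
it assumes a request can simply be flagged as having a modeset waiter.
It only shows how breaking the loop on the render side leaves unrelated
requests alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are real i915 symbols. */
enum toy_req_state { TOY_REQ_PENDING, TOY_REQ_DONE, TOY_REQ_CANCELLED };

struct toy_request {
	enum toy_req_state state;
	bool modeset_waiter; /* assumed flag: an atomic commit waits on this */
};

/*
 * On a (gen3/4 style) reset, force-complete only the pending requests
 * that a modeset is waiting on. Returns how many were cancelled.
 */
static size_t toy_cancel_for_modeset_waiters(struct toy_request *reqs,
					     size_t n)
{
	size_t cancelled = 0;

	assert(reqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (reqs[i].state == TOY_REQ_PENDING &&
		    reqs[i].modeset_waiter) {
			reqs[i].state = TOY_REQ_CANCELLED;
			cancelled++;
		}
	}
	return cancelled;
}

/* A waiting commit can make progress once its request is not pending. */
static bool toy_commit_unblocked(const struct toy_request *req)
{
	return req->state != TOY_REQ_PENDING;
}
```

In this model, a pending request without a modeset waiter stays pending
across the reset, so only the work that is actually part of the
dependency loop gets thrown away.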

Cancelling any kind of atomic commit otoh looks like a lot more
complexity. Why do you think this is the easier, or at least less
fragile option? This patch series is full of FIXMEs, and even the more
complete set seems to have a pile of holes. Plus we need to stop using
obj->state, and I don't see any easy way to test for that (since the
gen3/4 gpu reset case is the only corner case that currently needs
that).

So I'm not seeing how this is easier or more robust at all. What am I missing?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch