[Intel-gfx] [PATCH] drm/i915/psr: Chase psr.enabled only under the psr.lock

Chris Wilson chris at chris-wilson.co.uk
Tue Apr 10 11:00:26 UTC 2018


Quoting Rodrigo Vivi (2018-04-09 20:14:32)
> On Sat, Apr 07, 2018 at 10:05:25AM +0100, Chris Wilson wrote:
> > Quoting Rodrigo Vivi (2018-04-06 23:18:16)
> > > On Fri, Apr 06, 2018 at 11:12:27AM -0700, Souza, Jose wrote:
> > > > On Thu, 2018-04-05 at 12:49 +0100, Chris Wilson wrote:
> > > > > +           struct drm_crtc *crtc =
> > > > > +                   dp_to_dig_port(intel_dp)->base.base.crtc;
> > > 
> > > I'm afraid that the issue is this pointer here. So this will only mask
> > > the issue.
> > > 
> > > Should we maybe stash the pipe? :/
> > 
> > It's not that bad. pipe cannot change until after psr_disable is called,
> > right? And psr_disable ensures that this worker is flushed. The current
> > problem is just the coordination of cancelling the worker, where we may
> > set psr.enabled to NULL right before the worker grabs it and
> > dereferences it.
> > 
> > So if we lock until we have the pipe, we know that dereference chain is
> > valid, and we know that psr_disable() cannot complete until we complete
> > the wait. So the pipe remains valid until we return (so long as the pipe
> > exists when we start).
> 
> hmm... it makes sense and I have no better suggestion actually.
> So, as long as it really fixes the regression we introduced:
> 
> Acked-by: Rodrigo Vivi <rodrigo.vivi at intel.com>

It does fix the abstract race, but I have no evidence of this being hit
in practice. Pushed, but up to you if you care about this being
backported.
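
For anyone following along, the idea of only chasing psr.enabled under
the lock can be sketched in userspace C with pthreads. This is only an
illustration: fake_dp, fake_psr, psr_snapshot_pipe() and
psr_mark_disabled() are hypothetical stand-ins, not the i915 structures
or functions, and the sketch does not model psr_disable() flushing the
worker.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the i915 types. */
struct fake_dp {
	int pipe;
};

struct fake_psr {
	pthread_mutex_t lock;
	struct fake_dp *enabled;	/* NULL once PSR has been disabled */
};

/*
 * Worker side: dereference psr->enabled only while holding the lock and
 * copy out the pipe before dropping it. Returns false if PSR was already
 * disabled, in which case *pipe is left untouched.
 */
static bool psr_snapshot_pipe(struct fake_psr *psr, int *pipe)
{
	bool ok = false;

	pthread_mutex_lock(&psr->lock);
	if (psr->enabled) {
		*pipe = psr->enabled->pipe;
		ok = true;
	}
	pthread_mutex_unlock(&psr->lock);

	return ok;
}

/* Disable side: clear the pointer under the same lock. */
static void psr_mark_disabled(struct fake_psr *psr)
{
	pthread_mutex_lock(&psr->lock);
	psr->enabled = NULL;
	pthread_mutex_unlock(&psr->lock);
}
```

The worker then only uses the copied pipe value after unlocking, so it
never dereferences a pointer that the disable path may have cleared in
the meantime.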

Note that this race is different from the GPF that CI reported. Hmm, I
think https://bugs.freedesktop.org/show_bug.cgi?id=105959 is the same one
that was hit on the kasan run earlier.
-Chris

