[Intel-gfx] BUG: sleeping function called from invalid context on 3.10.10-rt7

Wed Sep 18 19:03:41 CEST 2013

On Wed, Sep 18, 2013 at 12:52:07PM -0400, Peter Hurley wrote:
> On 09/17/2013 04:55 PM, Daniel Vetter wrote:
> > On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley <peter at hurleysoftware.com> wrote:
> >> On 09/11/2013 03:31 PM, Peter Hurley wrote:
> >>>
> >>> [+cc dri-devel]
> >>>
> >>> On 09/11/2013 11:38 AM, Steven Rostedt wrote:
> >>>>
> >>>> On Wed, 11 Sep 2013 11:16:43 -0400
> >>>> Peter Hurley <peter at hurleysoftware.com> wrote:
> >>>>
> >>>>>> The funny part is, there's a comment there that shows that this was
> >>>>>> done even for "PREEMPT_RT". Unfortunately, the call to
> >>>>>> "get_scanout_position()" can call functions that use the rt-mutex
> >>>>>> "sleeping spin locks" and it breaks there.
> >>>>>>
> >>>>>> I guess we need to ask the authors of the mainline patch exactly why
> >>>>>> that preempt_disable() is needed?
> >>>>>
> >>>>>
> >>>>> The drm core associates a timestamp with each vertical blank frame #.
> >>>>> Drm drivers can optionally support a 'high resolution' hw timestamp.
> >>>>> The vblank frame #/timestamp tuple is user-space visible.
> >>>>>
> >>>>> The i915 drm driver supports a hw timestamp via this drm helper function
> >>>>> which computes the timestamp from the crtc scan position (based on the
> >>>>> pixel clock).
> >>>>>
> >>>>> For mainline, the preempt_disable/_enable() isn't actually necessary
> >>>>> because every call tree that leads here already has preemption disabled.
> >>>>>
> >>>>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
> >>>>>
> >>>>
> >>>> No, it should not. Note, any other lock that can be held when it is
> >>>> held would also need to be raw.
> >>>
> >>>
> >>> By that, you mean "any other lock" that might be claimed "would also need
> >>> to be raw"?  Hopefully not "any other lock" already held?
> >>>
> >>>> And by taking a quick audit of the code, I see this:
> >>>>
> >>>>      spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> >>>>
> >>>>      /* Reset the chip */
> >>>>
> >>>>      /* GEN6_GDRST is not in the gt power well, no need to check
> >>>>       * for fifo space for the write or forcewake the chip for
> >>>>       * the read
> >>>>       */
> >>>>      __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
> >>>>
> >>>>      /* Spin waiting for the device to ack the reset request */
> >>>>      ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
> >>>> GEN6_GRDOM_FULL) == 0, 500);
> >>>>
> >>>> That spin is unacceptable in RT with preemption and interrupts disabled.
> >>>
> >>>
> >>> Yep. That would be bad.
> >>>
> >>> AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included
> >>> in the force-wake set, so raw reads of the registers would
> >>> probably be acceptable (thus obviating the need for claiming the
> >>> uncore.lock).
> >>>
> >>> Except that _ALL_ register access is disabled with the uncore.lock
> >>> during a gpu reset. Not sure if that's meant to include crtc registers
> >>> or not, or what other synchronization/serialization issues are being
> >>> handled/hidden by forcing all register accesses to wait during a gpu
> >>> reset.
> >>>
> >>> Hopefully an i915 expert can weigh in here?
> >>
> >>
> >>
> >> Daniel,
> >>
> >> Can you shed some light on whether the i915+ crtc registers (specifically
> >> those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter())
> >> read as part of the vblank counter/timestamp handling need to
> >> be prevented during gpu reset?
> >
> > The depency here in the locking is a recent addition:
> >
> > commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a
> > Author: Chris Wilson <chris at chris-wilson.co.uk>
> > Date:   Fri Jul 19 20:36:51 2013 +0100
> >
> >      drm/i915: Serialize almost all register access
> >
> > It's a (slightly) oversized hammer to work around a hardware issue -
> > we could break it down to register blocks, which can be accessed
> > concurrently, but that tends to be more fragile. But the chip really
> > dies if you access (even just reads) the same block concurrently :(
> 
> Ouch. But thanks for clarifying that.
> 
> Ok, so register access needs to be serialized. And a separate but
> related concern is that gen6+ resets also need to hold-off register
> access where forcewake is required.
> 
> 
> While I was reviewing the registers that require forcewake handling,
> I saw this:
> 
> from i915_reg.h:
> #define _DPLL_A	(dev_priv->info->display_mmio_offset + 0x6014)
> #define _DPLL_B	(dev_priv->info->display_mmio_offset + 0x6018)
> 
> from i915_drv.c:
> static const struct intel_device_info intel_valleyview_m_info = {
> 	GEN7_FEATURES,
> 	.is_mobile = 1,
> 	.num_pipes = 2,
> 	.is_valleyview = 1,
> 	.display_mmio_offset = VLV_DISPLAY_BASE,     <<<-------
> 	.has_llc = 0, /* legal, last one wins */
> };
> 
> from intel_uncore.c:
> #define NEEDS_FORCE_WAKE(dev_priv, reg) \
> 	((HAS_FORCE_WAKE((dev_priv)->dev)) && \
> 	 ((reg) < 0x40000) &&            \
> 	 ((reg) != FORCEWAKE))
> 
> Is this is a mistake or do the valleyview PLLs not require the
> same forcewake handling as the other intel gpus?

Display registers shouldn't need forcewake on any platform. I guess our
NEEDS_FORCE_WAKE() check is a bit too coarse and includes a bunch of
stuff doesn't need to be there. So sort of by accident we do the right
thing on VLV, and the "wrong" thing on other platforms.

-- 
Ville Syrjälä
Intel OTC