[Intel-gfx] [PATCH 2/2] drm/i915: restrict unclaimed register checking

chris at chris-wilson.co.uk chris at chris-wilson.co.uk
Wed Sep 2 13:53:14 PDT 2015


On Wed, Sep 02, 2015 at 08:20:52PM +0000, Zanoni, Paulo R wrote:
> Em Qua, 2015-08-26 às 08:44 +0100, Chris Wilson escreveu:
> > On Tue, Aug 25, 2015 at 07:03:42PM -0300, Paulo Zanoni wrote:
> > > The unclaimed register bit is only triggered when someone touches 
> > > the
> > > specified register range.
> > > 
> > > For the normal use case (with i915.mmio_debug=0), this commit will
> > > avoid the extra __raw_i915_read32() call for every register outside
> > > the specified range, at the expense of a few additional "if"
> > > statements.
> > > 
> > > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > > Signed-off-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > 
> > 
> > > @@ -612,7 +614,7 @@ hsw_unclaimed_reg_debug(struct drm_i915_private 
> > > *dev_priv, u32 reg, bool read,
> > >  	const char *op = read ? "reading" : "writing to";
> > >  	const char *when = before ? "before" : "after";
> > >  
> > > -	if (!i915.mmio_debug)
> > > +	if (!i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg))
> > >  		return;
> > 
> > Place the register check on the preceding line so that it is not
> > confused with checking the debug-enabled status. (Also you can argue
> > that reg will be register/cache-hot and so a cheaper test to do first
> > than loading a module parameter.)
> 
> That makes sense, I'll do it.
> 
> > 
> > >  	if (__raw_i915_read32(dev_priv, FPGA_DBG) & 
> > > FPGA_DBG_RM_NOCLAIM) {
> > > @@ -624,11 +626,11 @@ hsw_unclaimed_reg_debug(struct 
> > > drm_i915_private *dev_priv, u32 reg, bool read,
> > >  }
> > >  
> > >  static void
> > > -hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv)
> > > +hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv, u32 
> > > reg)
> > >  {
> > >  	static bool mmio_debug_once = true;
> > >  
> > > -	if (i915.mmio_debug || !mmio_debug_once)
> > > +	if (i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg) || 
> > > !mmio_debug_once)
> > >  		return;
> > 
> > The use is wrong here though because you never reviewed my patch to
> > enable the single-shot debugging from the interrupt handler to catch
> > invalid usage from elsewhere.
> 
> If you're talking about intel_uncore_check_errors(), I think we can
> just kill it now. I'll submit a patch with the arguments, so we can
> continue this topic there.
> 
> Moving back to the main subject:
> 
> In the last time you sent the patch to do the unclaimed checks only on
> forcewake code, I started the conversations with the HW guys about the
> range of registers relevant by FPGA_DBG (and CCd you on these
> conversations). My plan was to write this patch at that time, but
> priorities happened and it got postponed :(. I also hoped that maybe
> you would eventually end up writing it and I would just have to review
> it :)
> 
> My main argument was that by doing unclaimed register checking only at
> forcewake time we would be running a check that is only relevant to
> display code during non-display code. So my idea is that if we restrict
> the unclaimed check to the actual range of registers that can be caught
> we basically remove unclaimed register checking for most (all?) of the
> performance-sensitive code.
> 
> Since we have multiple solutions, I decided to do some experiments.
> First of all, since I was not really seeing hsw_unclaimed_reg_detect()
> on perf, I decided to patch it so it would do the read/write check
> around FPGA_DBG 50 times per call instead of just one. With this, by
> running "perf record glxgears" for a few seconds I could see
> hsw_unclaimed_reg_detect() taking about 4.6% of the total time.

Something is very suspect with your system if you are not observing much
hsw_unclaimed_reg_check during trivial workloads.

> Then I applied just this patch, and the time went down to 0.13%. I also
> applied your patch on top of it all, and it went up to 0.52% (I guess
> the extra checks at forcewake time cost a little bit). Also notice that
> since I add a "reg" argument to hsw_unclaimed_reg_detect(), I changed
> the forcewake call to use a register in the display range.
> I also tested your patch without my patch, and the measured result was
> about 0.56%.
> 
> Now this isn't a super relevant experiment: just glxgears with a
> modified hsw_unclaimed_reg_detect(), but I thought it would be useful
> information, and maybe you could provide me suggestions for better
> workloads.
> 
> So I'd like to know if you're ok with proceeding with just this patch
> (considering I implement your suggestions), or if you think we should
> go with just your patch or both or none.

If you really wanted to, you could combine approaches a check in the
forcewake handler as demonstrated is an order of magnitude less frequent
than the register accesses. The key point is that we *need* the checking
against other users, not just our known register access. Checking our own
access basically amounts to detecting invalid registers, which your
approach more or less erradicates (since it is a flag we add to known good
registers, we may as well just compile it and only turn it on during hw
bringup) and that we have the power well awake.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list