[Intel-xe] [RFC] drm/xe: A minimal assert for some forcewake domain on xe_mmio.

Matt Roper matthew.d.roper at intel.com
Tue May 23 03:28:05 UTC 2023


On Mon, May 22, 2023 at 06:15:27PM -0400, Rodrigo Vivi wrote:
> It is the maximum protection we can do with the current infrastructure.
> 
> Cc: John Harrison <John.C.Harrison at Intel.com>
> Cc: Matthew Auld <matthew.auld at intel.com>
> Cc: Francois Dugast <francois.dugast at intel.com>
> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> Cc: Jani Nikula <jani.nikula at intel.com>
> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> Cc: Matt Roper <matthew.d.roper at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> ---
> 
> 
> RFC
> ===
> 
> Okay, so, this is more an RFC to brainstorm the future of the force_wake in
> Xe than anything else.
> 
> On i915 the force_wake is built-in the mmio functions at uncore component.
> 
> With that approach we had few historical issues iirc:
> 
> 1. Display performance with vblank evasion that requested uncore to provide
> the 'fw' variantes that are actually the ones that avoid fw (contrary to what
> the name suggests).

In i915 there were more differences between fw and non-fw variants
of register functions than just forcewake handling:

 * _fw() functions assumed that the caller was already holding
   forcewake, whereas the non-fw functions would obtain it themselves
 * _fw() functions also assumed that the caller was already holding
   uncore->lock, whereas the non-fw functions would obtain and release
   the lock around each register access
 * _fw() functions do no tracing and no debug assertions
 * _fw() functions do not check for unclaimed MMIO

I don't think the first bullet there (forcewake) really mattered much
from a performance perspective.  For display registers, a quick binary
search of the FW table (just a handful of CPU cycles) would quickly
determine that no forcewake domains were needed for those register
offsets, so we wouldn't be doing anything with forcewake at the hardware
level at all.  For display registers, the more performance-relevant
aspects of using the _fw() functions was doing all your register
accesses together without contention with other MMIO work.  I.e.,
holding the uncore lock over an entire set of registers rather than
grabbing/releasing it for each one, and (for vblank evasion
specifically) doing it all while interrupts were disabled.  For debug
drivers, there was also a bunch of other stuff that the _fw() functions
bypassed (e.g., tripling of the number of register accesses due to
reading FPGA_DBG before/after each register to look for unclaimed
accesses).

> 
> 2. Missed ranges updates. Sometimes we messed up with the ranges, there were
> other times that the spec was updated and we didn't get the notification, and
> there were cases that the BSpec had bugs.
> 
> For these reasons in Xe we have decided to let to the caller the
> responsibility to set the force_wake bits for their domains before doing
> the MMIO.

I don't think #2 was a reason to skip forcewake in Xe.  Not noticing a
bspec update (or the bspec itself having incorrect information) would
lead to even more mistakes if the caller has to explicitly grab the
appropriate domains than if the driver does it implicitly.  The explicit
handling only helps in the subset of cases where we blindly grab all of
the domains (as we do during part of init).


> 
> However I'm seeing many questions and doubts popping up:
> 
> 1. Are we confident that we are not missing any wake-up?
> 2. Are we confident that the domains are set correctly?
> 3. Are we not wasting power if we are waking up ALL the domains instead
>    of some specific one?

I think we're only holding ALL domains during init; for most of the
runtime MMIO accesses, I think the code is currently attempting to only
grab the domain(s) that it thinks the registers it's accessing will
need.  Although that might be working right now, I'm a little bit
worried about how that will scale in the long term when a single
register might be in several different domains depending on what
platform you're running on.  There have definitely been cases in the
past where groups of registers migrated from RENDER to GT or vice versa
between platforms, so the exact domain you needed for an operation
varied by platform.  And things are even more complicated if you're
doing any MMIO against the media, since the hardware seems to change
exactly how it splits up the media power wells somewhat often.

> 4. What about the display disconnection now because i915 and xe have different
>    mmio approaches but reusing i915-display?
> 
> It looks to me that the cons of the current approach are superseding the
> cons of the i915 approach. But I want to hear more thoughts here before
> we decide which route to take.
> 
> Maybe we have that domain as part of the mmio calls themselves? Maybe
> a double approach where the caller is responsible but mmio has the range
> information and double check it?

As noted above, part of the challenge with forcewake is that even if the
caller knows it wants to access register FOO, and even if it know FOO is
a GT register that likely needs forcewake, it's sometimes challenging to
make sure it's grabbing the correct domain(s) for every single platform
the Xe driver will eventually support if the power management handling
changes.  I think that was part of the motivation for encoding the
tables into the driver in i915.  It seems like GT power wells don't
change as much these days as they used to, but it's hard to say whether
that will continue in the future or not.  Who knows...maybe they'll
eventually start creating dedicated domains for stuff like blitters,
GuC, etc.  rather than lumping all of those into the "GT" catchall
domain.

If we do decide to go back to implicit forcewake handling with the table
encoded into the driver, it might be worth doing something sort of like
what Lucas is doing in the OOB workaround series --- drop the
per-platform tables into a human-readable text file that's more similar
to the format used by the bspec (exact ranges, forcewake domain, MCR
replication type, etc.) and then provide a small parser program that
will convert that into actual code (and do things like consolidating
adjacent ranges).

> 
> Any other idea? Thoughts?

Once the big GT vs tile refactor stuff I have in flight gets finalized,
I plan to follow up with another series that creates a more appropriate
MMIO target for register operations rather than using "xe_gt" as the
target (even for things completely unrelated to the small GT subset of
hardware).  My idea is that you'd grab an MMIO target structure for MMIO
operations against a specific hardware unit, and then the info inside
the MMIO structure would be able to figure out if there are additional
checks and/or operations it should perform.  E.g.,

 * mmio = xe_mmio_for_device(xe_device *xe);
    - Used to submit MMIO operations against the root tile of the PCI
      device.  Only used during init (e.g., to read the registers that
      tell us which tiles exist on the platform) and during top-level
      interrupt enable/disable (since all interrupts are routed through
      the root tile).
    - No forcewake needed for register accesses through a handle of this
      type since you'd only ever be accessing sgunit registers for these
      types of things.
    - Register accesses through mmio can warn on debug builds if the
      register appears to be in a GT-related MMIO range.

 * mmio = xe_mmio_for_display(xe_device *xe);
    - Pretty much the same as handles returned by xe_mmio_for_device(),
      but if a handle of this type is used to try to read/write
      registers outside the display range, we could have debug builds
      throw some extra warnings.
    - Unclaimed register detection could be confined to accesses through
      these handles.

 * mmio = xe_mmio_for_tile(xe_tile *tile);
    - Used to access non-GT registers that reside in a specific tile.
      I.e., sgunit/soc registers.
    - As above, no forcewake needed, can make MMIO operations warn
      if used to access a GT range.

 * mmio = xe_mmio_for_gt(xe_gt *gt);
    - Used to access GT registers in a specific GT
    - Does automatic GSI offset translation for media GTs
    - Can either do automatic forcewake like i915 does, or can do debug
      check+warn like you have here.
    - Can make MMIO operations warn if MMIO offset is outside GT range
    - Can also trigger warnings if a GT non-GSI, non-media engine
      register is accessed from an MMIO obtained from a media GT.

At some point we'll probably want to add some debug tracepoints for MMIO
access to assist with debugging; the mmio structure can also help
provide meaningful output for each type of MMIO handle as well.


One other MMIO-related thing:
At the moment Xe does no locking of MMIO access.  I know there were
hardware lockups on old platforms if simultaneous MMIO accesses
happened, but I've never seen that tied to a specific HSD ticket that
would let us know whether modern platforms are still susceptible or not.
If it turns out they are and that we need to bring in the equivalent of
i915's uncore lock (and figure out if any register access is
problematic, or only certain types/ranges).


Matt

> 
> Thanks,
> Rodrigo.
> 
>  drivers/gpu/drm/xe/xe_mmio.h | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_mmio.h b/drivers/gpu/drm/xe/xe_mmio.h
> index 1407f1189b0d..6302ca85e3f4 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.h
> +++ b/drivers/gpu/drm/xe/xe_mmio.h
> @@ -18,8 +18,26 @@ struct xe_device;
>  
>  int xe_mmio_init(struct xe_device *xe);
>  
> +static inline void xe_mmio_assert_forcewake(struct xe_gt *gt,
> +					    struct xe_reg reg)
> +{
> +	struct xe_force_wake *fw = &gt->mmio.fw;
> +
> +	/* No need for forcewake in this range */
> +	if (reg.addr >= 0x40000 && reg.addr < 0x116000)
> +		return;
> +
> +	/*
> +	 * XXX: Weak. It checks for some domain, but impossible to determine
> +	 * if the right one is selected, or if it was set by the same caller
> +	 */
> +	WARN_ON(!fw->awake_domains);
> +}
> +
>  static inline u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
>  {
> +	xe_mmio_assert_forcewake(gt, reg);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> @@ -29,6 +47,8 @@ static inline u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
>  static inline void xe_mmio_write32(struct xe_gt *gt,
>  				   struct xe_reg reg, u32 val)
>  {
> +	xe_mmio_assert_forcewake(gt, reg);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> @@ -37,6 +57,8 @@ static inline void xe_mmio_write32(struct xe_gt *gt,
>  
>  static inline u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
>  {
> +	xe_mmio_assert_forcewake(gt, reg);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> @@ -58,6 +80,8 @@ static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
>  static inline void xe_mmio_write64(struct xe_gt *gt,
>  				   struct xe_reg reg, u64 val)
>  {
> +	xe_mmio_assert_forcewake(gt, reg);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> @@ -66,6 +90,8 @@ static inline void xe_mmio_write64(struct xe_gt *gt,
>  
>  static inline u64 xe_mmio_read64(struct xe_gt *gt, struct xe_reg reg)
>  {
> +	xe_mmio_assert_forcewake(gt, reg);
> +
>  	if (reg.addr < gt->mmio.adj_limit)
>  		reg.addr += gt->mmio.adj_offset;
>  
> -- 
> 2.39.2
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


More information about the Intel-xe mailing list