[Intel-xe] [RFC] drm/xe: A minimal assert for some forcewake domain on xe_mmio.
Rodrigo Vivi
rodrigo.vivi at kernel.org
Wed May 24 17:06:06 UTC 2023
On Mon, May 22, 2023 at 08:28:05PM -0700, Matt Roper wrote:
> On Mon, May 22, 2023 at 06:15:27PM -0400, Rodrigo Vivi wrote:
> > It is the maximum protection we can do with the current infrastructure.
> >
> > Cc: John Harrison <John.C.Harrison at Intel.com>
> > Cc: Matthew Auld <matthew.auld at intel.com>
> > Cc: Francois Dugast <francois.dugast at intel.com>
> > Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> > Cc: Jani Nikula <jani.nikula at intel.com>
> > Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > Cc: Matt Roper <matthew.d.roper at intel.com>
> > Cc: Matthew Brost <matthew.brost at intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > ---
> >
> >
> > RFC
> > ===
> >
> > Okay, so, this is more an RFC to brainstorm the future of the force_wake in
> > Xe than anything else.
> >
> > On i915 the force_wake is built-in the mmio functions at uncore component.
> >
> > With that approach we had few historical issues iirc:
> >
> > 1. Display performance with vblank evasion that requested uncore to provide
> > the 'fw' variantes that are actually the ones that avoid fw (contrary to what
> > the name suggests).
>
> In i915 there were more differences between fw and non-fw variants
> of register functions than just forcewake handling:
>
> * _fw() functions assumed that the caller was already holding
> forcewake, whereas the non-fw functions would obtain it themselves
> * _fw() functions also assumed that the caller was already holding
> uncore->lock, whereas the non-fw functions would obtain and release
> the lock around each register access
> * _fw() functions do no tracing and no debug assertions
> * _fw() functions do not check for unclaimed MMIO
>
> I don't think the first bullet there (forcewake) really mattered much
> from a performance perspective. For display registers, a quick binary
> search of the FW table (just a handful of CPU cycles) would quickly
> determine that no forcewake domains were needed for those register
> offsets, so we wouldn't be doing anything with forcewake at the hardware
> level at all. For display registers, the more performance-relevant
> aspects of using the _fw() functions was doing all your register
> accesses together without contention with other MMIO work. I.e.,
> holding the uncore lock over an entire set of registers rather than
> grabbing/releasing it for each one, and (for vblank evasion
> specifically) doing it all while interrupts were disabled. For debug
> drivers, there was also a bunch of other stuff that the _fw() functions
> bypassed (e.g., tripling of the number of register accesses due to
> reading FPGA_DBG before/after each register to look for unclaimed
> accesses).
>
> >
> > 2. Missed ranges updates. Sometimes we messed up with the ranges, there were
> > other times that the spec was updated and we didn't get the notification, and
> > there were cases that the BSpec had bugs.
> >
> > For these reasons in Xe we have decided to let to the caller the
> > responsibility to set the force_wake bits for their domains before doing
> > the MMIO.
>
> I don't think #2 was a reason to skip forcewake in Xe. Not noticing a
> bspec update (or the bspec itself having incorrect information) would
> lead to even more mistakes if the caller has to explicitly grab the
> appropriate domains than if the driver does it implicitly. The explicit
> handling only helps in the subset of cases where we blindly grab all of
> the domains (as we do during part of init).
>
>
> >
> > However I'm seeing many questions and doubts popping up:
> >
> > 1. Are we confident that we are not missing any wake-up?
> > 2. Are we confident that the domains are set correctly?
> > 3. Are we not wasting power if we are waking up ALL the domains instead
> > of some specific one?
>
> I think we're only holding ALL domains during init; for most of the
> runtime MMIO accesses, I think the code is currently attempting to only
> grab the domain(s) that it thinks the registers it's accessing will
> need. Although that might be working right now, I'm a little bit
> worried about how that will scale in the long term when a single
> register might be in several different domains depending on what
> platform you're running on. There have definitely been cases in the
> past where groups of registers migrated from RENDER to GT or vice versa
> between platforms, so the exact domain you needed for an operation
> varied by platform. And things are even more complicated if you're
> doing any MMIO against the media, since the hardware seems to change
> exactly how it splits up the media power wells somewhat often.
>
> > 4. What about the display disconnection now because i915 and xe have different
> > mmio approaches but reusing i915-display?
> >
> > It looks to me that the cons of the current approach are superseding the
> > cons of the i915 approach. But I want to hear more thoughts here before
> > we decide which route to take.
> >
> > Maybe we have that domain as part of the mmio calls themselves? Maybe
> > a double approach where the caller is responsible but mmio has the range
> > information and double check it?
>
> As noted above, part of the challenge with forcewake is that even if the
> caller knows it wants to access register FOO, and even if it know FOO is
> a GT register that likely needs forcewake, it's sometimes challenging to
> make sure it's grabbing the correct domain(s) for every single platform
> the Xe driver will eventually support if the power management handling
> changes. I think that was part of the motivation for encoding the
> tables into the driver in i915. It seems like GT power wells don't
> change as much these days as they used to, but it's hard to say whether
> that will continue in the future or not. Who knows...maybe they'll
> eventually start creating dedicated domains for stuff like blitters,
> GuC, etc. rather than lumping all of those into the "GT" catchall
> domain.
>
> If we do decide to go back to implicit forcewake handling with the table
> encoded into the driver, it might be worth doing something sort of like
> what Lucas is doing in the OOB workaround series --- drop the
> per-platform tables into a human-readable text file that's more similar
> to the format used by the bspec (exact ranges, forcewake domain, MCR
> replication type, etc.) and then provide a small parser program that
> will convert that into actual code (and do things like consolidating
> adjacent ranges).
good idea!
>
> >
> > Any other idea? Thoughts?
>
> Once the big GT vs tile refactor stuff I have in flight gets finalized,
> I plan to follow up with another series that creates a more appropriate
> MMIO target for register operations rather than using "xe_gt" as the
> target (even for things completely unrelated to the small GT subset of
> hardware). My idea is that you'd grab an MMIO target structure for MMIO
> operations against a specific hardware unit, and then the info inside
> the MMIO structure would be able to figure out if there are additional
> checks and/or operations it should perform. E.g.,
>
> * mmio = xe_mmio_for_device(xe_device *xe);
> - Used to submit MMIO operations against the root tile of the PCI
> device. Only used during init (e.g., to read the registers that
> tell us which tiles exist on the platform) and during top-level
> interrupt enable/disable (since all interrupts are routed through
> the root tile).
> - No forcewake needed for register accesses through a handle of this
> type since you'd only ever be accessing sgunit registers for these
> types of things.
> - Register accesses through mmio can warn on debug builds if the
> register appears to be in a GT-related MMIO range.
>
> * mmio = xe_mmio_for_display(xe_device *xe);
> - Pretty much the same as handles returned by xe_mmio_for_device(),
> but if a handle of this type is used to try to read/write
> registers outside the display range, we could have debug builds
> throw some extra warnings.
> - Unclaimed register detection could be confined to accesses through
> these handles.
>
> * mmio = xe_mmio_for_tile(xe_tile *tile);
> - Used to access non-GT registers that reside in a specific tile.
> I.e., sgunit/soc registers.
> - As above, no forcewake needed, can make MMIO operations warn
> if used to access a GT range.
>
> * mmio = xe_mmio_for_gt(xe_gt *gt);
> - Used to access GT registers in a specific GT
> - Does automatic GSI offset translation for media GTs
> - Can either do automatic forcewake like i915 does, or can do debug
> check+warn like you have here.
> - Can make MMIO operations warn if MMIO offset is outside GT range
> - Can also trigger warnings if a GT non-GSI, non-media engine
> register is accessed from an MMIO obtained from a media GT.
Ack! This sounds like a great plan!
So this RFC discussion was much easier than I expected :)
>
> At some point we'll probably want to add some debug tracepoints for MMIO
> access to assist with debugging; the mmio structure can also help
> provide meaningful output for each type of MMIO handle as well.
>
>
> One other MMIO-related thing:
> At the moment Xe does no locking of MMIO access. I know there were
> hardware lockups on old platforms if simultaneous MMIO accesses
> happened, but I've never seen that tied to a specific HSD ticket that
> would let us know whether modern platforms are still susceptible or not.
> If it turns out they are and that we need to bring in the equivalent of
> i915's uncore lock (and figure out if any register access is
> problematic, or only certain types/ranges).
>
>
> Matt
>
> >
> > Thanks,
> > Rodrigo.
> >
> > drivers/gpu/drm/xe/xe_mmio.h | 26 ++++++++++++++++++++++++++
> > 1 file changed, 26 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_mmio.h b/drivers/gpu/drm/xe/xe_mmio.h
> > index 1407f1189b0d..6302ca85e3f4 100644
> > --- a/drivers/gpu/drm/xe/xe_mmio.h
> > +++ b/drivers/gpu/drm/xe/xe_mmio.h
> > @@ -18,8 +18,26 @@ struct xe_device;
> >
> > int xe_mmio_init(struct xe_device *xe);
> >
> > +static inline void xe_mmio_assert_forcewake(struct xe_gt *gt,
> > + struct xe_reg reg)
> > +{
> > + struct xe_force_wake *fw = >->mmio.fw;
> > +
> > + /* No need for forcewake in this range */
> > + if (reg.addr >= 0x40000 && reg.addr < 0x116000)
> > + return;
> > +
> > + /*
> > + * XXX: Weak. It checks for some domain, but impossible to determine
> > + * if the right one is selected, or if it was set by the same caller
> > + */
> > + WARN_ON(!fw->awake_domains);
> > +}
> > +
> > static inline u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
> > {
> > + xe_mmio_assert_forcewake(gt, reg);
> > +
> > if (reg.addr < gt->mmio.adj_limit)
> > reg.addr += gt->mmio.adj_offset;
> >
> > @@ -29,6 +47,8 @@ static inline u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
> > static inline void xe_mmio_write32(struct xe_gt *gt,
> > struct xe_reg reg, u32 val)
> > {
> > + xe_mmio_assert_forcewake(gt, reg);
> > +
> > if (reg.addr < gt->mmio.adj_limit)
> > reg.addr += gt->mmio.adj_offset;
> >
> > @@ -37,6 +57,8 @@ static inline void xe_mmio_write32(struct xe_gt *gt,
> >
> > static inline u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
> > {
> > + xe_mmio_assert_forcewake(gt, reg);
> > +
> > if (reg.addr < gt->mmio.adj_limit)
> > reg.addr += gt->mmio.adj_offset;
> >
> > @@ -58,6 +80,8 @@ static inline u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr,
> > static inline void xe_mmio_write64(struct xe_gt *gt,
> > struct xe_reg reg, u64 val)
> > {
> > + xe_mmio_assert_forcewake(gt, reg);
> > +
> > if (reg.addr < gt->mmio.adj_limit)
> > reg.addr += gt->mmio.adj_offset;
> >
> > @@ -66,6 +90,8 @@ static inline void xe_mmio_write64(struct xe_gt *gt,
> >
> > static inline u64 xe_mmio_read64(struct xe_gt *gt, struct xe_reg reg)
> > {
> > + xe_mmio_assert_forcewake(gt, reg);
> > +
> > if (reg.addr < gt->mmio.adj_limit)
> > reg.addr += gt->mmio.adj_offset;
> >
> > --
> > 2.39.2
> >
>
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation
More information about the Intel-xe
mailing list