[Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table
Matt Roper
matthew.d.roper at intel.com
Fri Sep 10 15:07:17 UTC 2021
On Fri, Sep 10, 2021 at 04:03:50PM +0100, Tvrtko Ursulin wrote:
>
> On 10/09/2021 15:24, Matt Roper wrote:
> > On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote:
> > >
> > > On 10/09/2021 06:33, Matt Roper wrote:
> > > > Our uncore MMIO functions for reading/writing registers have become very
> > > > complicated over time. There's significant macro magic used to generate
> > > > several nearly-identical functions that only really differ in terms of
> > > > which platform-specific shadow register table they should check on write
> > > > operations. We can significantly simplify our MMIO handlers by storing
> > > > a reference to the current platform's shadow table within the 'struct
> > > > intel_uncore' the same way we already do for forcewake; this allows us
> > > > to consolidate the multiple variants of each 'write' function down to
> > > > just a single 'fwtable' version that gets the shadow table out of the
> > > > uncore struct rather than hardcoding the name of a specific platform's
> > > > table. We can do similar consolidation on the MMIO read side by
> > > > creating a single-entry forcewake table to replace the open-coded range
> > > > check they had been using previously.
> > > >
> > > > The final patch of the series adds a new shadow table for DG2; this
> > > > becomes quite clean and simple now, given the refactoring in the first
> > > > five patches.
> > >
> > > Tidy and it ends up saving kernel binary size.
> > >
> > > However I am undecided yet, because one thing to note is that the trade off
> > > is source code and kernel text consolidation at the expense of more indirect
> > > calls at runtime and larger common read/write functions.
> > >
> > > To expand, current code generates a bunch of per gen functions but in doing
> > > so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH
> > > (from find_fw_domain) so at runtime each platform mmio read/write does not
> > > have to do indirect calls to do lookups.
> > >
> > > It may matter a lot in the grand scheme of things but this trade off is
> > > something to note in the cover letter I think.
> >
> > That's true. However it seems like if the extra indirect calls are good
> > enough for our forcewake lookups (which are called more frequently and
> > have to search through much larger tables) then using the same strategy
> > for shadow registers should be less of a concern. Plus most of
> > timing-critical parts of the code don't call through this at all; they
> > just grab an explicit forcewake and then issue a bunch of *_fw()
> > operations that skip all the per-register forcewake and shadow handling.
>
> With lookups you mean intel_uncore_forcewake_for_reg? Yeah I don't have a
> good idea of how many of those followed by "_fw" accessors we have vs
> "un-optimized" access. But it's a good point.
>
> I was mostly coming from the point of view of old platforms like gen6, where
> with this series reads go from inlined checks (NEEDS_FORCE_WAKE) to always
> calling find_fw_domain. Just because it is a bit unfortunate to burden old
> CPUs (they are not getting any faster) with executing more code. It's not
> nice when old hardware gets slower and slower with software updates. :) But
> whether or not this case would at all be measurable.. probably not. Unless
> some compounding effects, like "death by thousand cuts", would come into
> play.
Chris pointed out in an offline mail that NEEDS_FORCE_WAKE does cut cut
out a lot of display MMIO lookups. So I think it might be worth adding
that back, but also adding an "|| GEN11_BSD_RING_BASE" so that it will
still be accurate for the newer platforms too.
But I think another thing to consider here would be that we might want
to switch our intel_de_{read,write} wrappers to call raw mmio directly,
to completely bypass forcewake and shadow logic.
Matt
>
> Regards,
>
> Tvrtko
>
> > But you're right that this is something I should mention more clearly in
> > the cover letter.
> >
> >
> > Matt
> >
> > >
> > > Regards,
> > >
> > > Tvrtko
> > >
> > > > Matt Roper (6):
> > > > drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
> > > > drm/i915/uncore: Associate shadow table with uncore
> > > > drm/i915/uncore: Replace gen8 write functions with general fwtable
> > > > drm/i915/uncore: Drop gen11/gen12 mmio write handlers
> > > > drm/i915/uncore: Drop gen11 mmio read handlers
> > > > drm/i915/dg2: Add DG2-specific shadow register table
> > > >
> > > > drivers/gpu/drm/i915/intel_uncore.c | 190 ++++++++++--------
> > > > drivers/gpu/drm/i915/intel_uncore.h | 7 +
> > > > drivers/gpu/drm/i915/selftests/intel_uncore.c | 1 +
> > > > 3 files changed, 110 insertions(+), 88 deletions(-)
> > > >
> >
--
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795
More information about the dri-devel
mailing list