[Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Sep 10 15:03:50 UTC 2021


On 10/09/2021 15:24, Matt Roper wrote:
> On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote:
>>
>> On 10/09/2021 06:33, Matt Roper wrote:
>>> Our uncore MMIO functions for reading/writing registers have become very
>>> complicated over time.  There's significant macro magic used to generate
>>> several nearly-identical functions that only really differ in terms of
>>> which platform-specific shadow register table they should check on write
>>> operations.  We can significantly simplify our MMIO handlers by storing
>>> a reference to the current platform's shadow table within the 'struct
>>> intel_uncore' the same way we already do for forcewake; this allows us
>>> to consolidate the multiple variants of each 'write' function down to
>>> just a single 'fwtable' version that gets the shadow table out of the
>>> uncore struct rather than hardcoding the name of a specific platform's
>>> table.  We can do similar consolidation on the MMIO read side by
>>> creating a single-entry forcewake table to replace the open-coded range
>>> check they had been using previously.
>>>
>>> The final patch of the series adds a new shadow table for DG2; this
>>> becomes quite clean and simple now, given the refactoring in the first
>>> five patches.
>>
>> Tidy and it ends up saving kernel binary size.
>>
>> However I am undecided yet, because one thing to note is that the trade off
>> is source code and kernel text consolidation at the expense of more indirect
>> calls at runtime and larger common read/write functions.
>>
>> To expand, current code generates a bunch of per gen functions but in doing
>> so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH
>> (from find_fw_domain) so at runtime each platform mmio read/write does not
>> have to do indirect calls to do lookups.
>>
>> It may matter a lot in the grand scheme of things but this trade off is
>> something to note in the cover letter I think.
> 
> That's true.  However it seems like if the extra indirect calls are good
> enough for our forcewake lookups (which are called more frequently and
> have to search through much larger tables) then using the same strategy
> for shadow registers should be less of a concern.  Plus most of
> timing-critical parts of the code don't call through this at all; they
> just grab an explicit forcewake and then issue a bunch of *_fw()
> operations that skip all the per-register forcewake and shadow handling.

With lookups you mean intel_uncore_forcewake_for_reg? Yeah I don't have 
a good idea of how many of those followed by "_fw" accessors we have vs 
"un-optimized" access. But it's a good point.

I was mostly coming from the point of view of old platforms like gen6, 
where with this series reads go from inlined checks (NEEDS_FORCE_WAKE) 
to always calling find_fw_domain. Just because it is a bit unfortunate 
to burden old CPUs (they are not getting any faster) with executing more 
code. It's not nice when old hardware gets slower and slower with 
software updates. :) But whether or not this case would at all be 
measurable.. probably not. Unless some compounding effects, like "death 
by thousand cuts", would come into play.

Regards,

Tvrtko

> But you're right that this is something I should mention more clearly in
> the cover letter.
> 
> 
> Matt
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>> Matt Roper (6):
>>>     drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
>>>     drm/i915/uncore: Associate shadow table with uncore
>>>     drm/i915/uncore: Replace gen8 write functions with general fwtable
>>>     drm/i915/uncore: Drop gen11/gen12 mmio write handlers
>>>     drm/i915/uncore: Drop gen11 mmio read handlers
>>>     drm/i915/dg2: Add DG2-specific shadow register table
>>>
>>>    drivers/gpu/drm/i915/intel_uncore.c           | 190 ++++++++++--------
>>>    drivers/gpu/drm/i915/intel_uncore.h           |   7 +
>>>    drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
>>>    3 files changed, 110 insertions(+), 88 deletions(-)
>>>
> 


More information about the Intel-gfx mailing list