[Intel-xe] [PATCH 11/17] drm/xe: Introduce xe_reg_t

Wed Apr 19 23:38:34 UTC 2023

On Wed, Apr 19, 2023 at 04:14:45PM -0700, Matt Roper wrote:
>On Wed, Apr 19, 2023 at 02:09:20PM -0700, Lucas De Marchi wrote:
>> On Wed, Apr 19, 2023 at 01:24:31PM -0700, Matt Roper wrote:
>> > On Wed, Apr 19, 2023 at 01:19:49PM -0700, Lucas De Marchi wrote:
>> > > On Wed, Apr 19, 2023 at 03:30:17PM -0400, Rodrigo Vivi wrote:
>> > > > On Wed, Apr 19, 2023 at 11:49:56AM -0700, Lucas De Marchi wrote:
>> > > > > On Wed, Apr 19, 2023 at 10:33:45AM -0700, Matt Roper wrote:
>> > > > > > On Wed, Apr 19, 2023 at 12:44:34AM -0700, Lucas De Marchi wrote:
>> > > > > > > Stop using i915 types for register our own xe_reg_t. Differently from
>> > > > > > > i915, this will keep under this will keep under the register definition
>> > > > > > > the knowledge for the different types of registers. For now, the "flags"
>> > > > > > > are mcr and masked, although only the former is being used.
>> > > > > > >
>> > > > > > > Most of the driver is agnostic to the register differences. Convert the
>> > > > > > > few places that care about that, namely xe_gt_mcr.c, to take the generic
>> > > > > > > type and warn if the wrong register is used.
>> > > > > >
>> > > > > > The disadvantage of this approach is that we don't get the nice
>> > > > > > type-checking that we have in i915 to catch register misuse at build
>> > > > > > time.  Instead we wind up with a bunch of run-time checks that only tell
>> > > > > > you that you used the wrong register semantics after the fact.  Wouldn't
>> > > > > > it be better to keep the strict types and let the compiler find mistakes
>> > > > > > for us at build time?
>> > > > > >
>> > > > > > The only real problem with using strict types is that there are a few
>> > > > > > places where the driver is just trying to refer to registers/offsets
>> > > > > > without actually accessing them (e.g., whitelists, OA register groups,
>> > > > >
>> > > > > that is not true.
>> > > > >
>> > > > > > etc.) and the strict typing makes it harder to provide a heterogeneous
>> > > > > > list of those registers.  But I still feel the benefits of compile-time
>> > > > > > checking outweighs the extra complexity we wind up with in a handful of
>> > > > > > places.
>> > > > >
>> > > > > I disagree here because the strict type checking is only checking for
>> > > > > "regular vs mcr". Add "masked" in the mix (that has been a huge source
>> > > > > of mistakes in i915) and you already have 4 types(?) to deal with.
>> > > > >
>> > > > > There are also places in the code where we want that "this is an
>> > > > > mcr/regular/masked register" info to be used later. See how the reg-sr is
>> > > > > handling that.
>> > > > >
>> > > > > In the end I believe having this info encoded in the regs/*.h is
>> > > > > sufficient to avoid the mistakes we had in the past.
>> > > > >
>> > > > > Any change in the code triggering these warnings has a very easy an
>> > > > > actionable fix.
>> > > > >
>> > > > > If for some reason we want to only differentiate mcr/normal as strict
>> > > > > type checks,  then I think we need to provide a better struct for the
>> > > > > places that don't care about that. Something like below maybe
>> > > > >
>> > > > > 	typedef {
>> > > > > 		xe_reg_t reg;
>> > > > > 	} xe_reg_mcr_t;
>> > > > >
>> > > > > but reg.mcr still being there. Or going the other way around:
>> > > > >
>> > > > > 	typedef union {
>> > > > > 		xe_reg_normal_t normal;
>> > > > > 		xe_reg_mcr_t	mcr;
>> > > > > 	} xe_reg_t;
>> > > > >
>> > > > > but xe_reg_normal_t and xe_reg_mcr_t actually having the same defintion.
>> > > > >
>> > > > > Lucas De Marchi
>> > > >
>> > > > I honestly liked the approach you took on this patch here. Getting rid of
>> > > > the i915_mcr is already a win by itself.
>> > > >
>> > > > One case that came to my mind though is some time sensitive mmio operations
>> > > > like display inside vblank areas. any runtime check could potentially
>> > > > increase latency, no?! If that's irrelevant time, than let's move with this
>> > > > but if it is really a problem, then explore your latest suggestion here
>> > > > with the compiler doing the checks for us seems a good alternative.
>> > >
>> > > the additional check is inside xe_gt_mcr.c... I think the additional
>> > > check is negligible compared to the other operations there, with each
>> > > access to reg X requiring other 2 mmio accesses to take place.
>> >
>> > The super critical checks (which don't exist yet in this series; they'd
>> > presumably be added in a follow-up series) are the ones to make sure you
>> > don't try to call xe_mmio_{read,write}32() on an MCR register.  Doing
>> > that can lead to all kinds of subtle bugs (which become even more
>> > problematic on the more recent platforms), and unfortunately that kind
>> > of mistake is what people are going to write in the code 99% of the time
>> > out of habit.
>>
>> I'm not sure when that would happen... when the next GPU changes a register
>> from normal to MCR, we rightfully updated the header to update its type
>> and forgot to change the calls using that register?
>
>It happened really frequently on i915 until we finally added the strict
>typing.  It's not a matter of adding new platforms, it's a matter of
>just writing new code in the driver (generally for platforms we already
>have and with register definitions that already exist).  When someone
>needs to add a new register read/write operation for something they're
>working on, they'll generally reach for xe_mmio_*() out of habit (or
>intel_uncore_*() on i915) since that's what gets used for 90% of
>register accesses.  If the register is actually an MCR register, and the
>compiler doesn't catch this for you, then you just wind up with
>incorrect semantics at runtime.
>
>Of course nothing is foolproof.  If the read/write is against a
>completely new register, and the patch is adding both an incorrect
>register definition _and_ an incorrect usage at the same time, then the
>compiler won't help there.  Although the fact that a completely new
>register is being defined makes it slightly more likely that the
>reviewers will scrutinize the register details more carefully and catch
>the incorrect definition.  It's a shame that the MCR nature of registers
>isn't documented more clearly in the bspec...
>
>Cases where the same register changed type between platforms is also a
>problematic case that will never be foolproof since we wind up with
>multiple definitions for the same register offset; if you pick the wrong
>definition _and_ the wrong access semantics, then the compiler won't be
>able to catch that either.  But again, the multiple definitions with
>different types will still make it more likely that the developer and/or
>reviewer take a closer look and realize the mistake.
>
>>
>> Currently xe_mmio_* receives u32 and we just do FOO.reg, with reg being the
>
>Right, this is obviously completely wrong and was something I raised as
>a concern months ago when the first Xe megapatch got posted.  I started
>typing up a fix for that a few weeks ago but got sidetracked and didn't
>finish it.  xe_mmio_* should obviously be accepting i915_reg_t (or a new
>Xe-specific equivalent) rather than a u32.  That will protect against
>both the 'misordered parameters' problem Jani mentioned, as well as the
>'used wrong function on an MCR register' problem that was problematic on
>i915.
>
>> name both in i915_reg_t and i915_mcr_reg_t. So, right now the different
>> types are not helping anything.  I'm planning to add xe_reg_t to
>> xe_mmio_* and have a companion xe_mmio_raw_* for the cases we use a u32
>> (one of the reasons being because xe_gt_mcr.c itself also needs to call
>> xe_mmio_*)
>>
>> How does the type protect us against "daily task of enabling the next
>> generation of GPUs?":
>>
>> 	- gpu version N+1 changed register FOO from normal to MCR.
>> 	  we didn't notice the register changed. Type doesn't help
>> 	  because we are still using the old one.  If we did notice,
>> 	  the type would only help if adopted the approach "change the
>> 	  type", build and see what breaks to fix it, which is IMO
>> 	  worse than a git-grep approach.
>>
>> That would mean the hw changed a register from being MCR to being a
>> normal register which is more rare than the oppposite: a normal register
>> becoming MCR.
>
>The strict type safety protects us in both directions.  xe_mmio_* only
>takes unicast registers (after the obvious fixes to replace the u32
>parameters), xe_gt_mcr_* only takes MCR registers.  If you pass the
>wrong register type to either type of function, the compiler catches it
>immediately.  As you noted, it's still possible to combine two bugs
>(incorrect register definition + wrong usage semantics) to sneak by the
>compiler; that will surely still happen, but we've raised the bar and
>made it harder.
>
>Having registers change between MCR and non-MCR between platforms is
>still a problem, but not the most common one based on past history.  The
>much more common source of problems is people just writing new code that
>uses registers we already have in the driver.  Assuming the definition
>is correct, the new code has to be correct too to get past the compiler.
>If all registers become the same type, as you're moving toward in this
>series, you could also probably get similar guardrails by turning
>xe_mmio_*() into macros that do a BUILD_BUG_ON() before calling the a
>real __xe_mmio_*() function, although then you run into issues anywhere
>that non-constant register variables get passed up/down the callstack.
>
>>
>> Even if we do the strict type, it doesn't protect us from such mistakes
>> when we have a normal register becoming MCR. We will have this:
>>
>> 	#define FOO	XE_REG(10)
>>
>> and will continue to happily call xe_mmio_* even if in version X of the
>> GPU the register is now mcr.
>
>When enabling a new platform and implementing the MCR stuff for the
>first time, we do need to be careful to not only add the new ranges to
>the driver, but also to go analyze any registers that live in a range
>that changed.  If there's code that will still be executed that works on
>those registers, then we need to figure out how that code needs to
>change.  We can't automatically assume that writes should become
>broadcast_writes and that reads should become read_any's (although
>that's the most common outcome); in some situations we actually need to
>re-write the code to loop through the various hardware instances and
>handle them all independently.
>
>>
>> Hence my point: the header with the register definition must be the
>> source of truth on the different register behaviors*.  The additional
>> type checks only help in some cases, and make it harder to handle
>> others.
>
>With xe it seems like we have a lot fewer areas where multiple types
>become harder.  IIRC, the biggest pain points on i915 were GVT and
>cmd_parser since both of those needed to provide giant lists of
>registers with MCR and non-MCR mixed together.  Both of those are no
>longer relevant to Xe, so I think the main areas that will still have a
>bit of pain are:
> * OA/perf register lists
> * userspace whitelist (although I'm not sure if we ever whitelist MCR
>   registers...userspace would have difficult steering to specific
>   instances)
>
>>
>> On the other hand, I think the different types would be handy when/if we
>> have the following:
>>
>> 	#define xe_mmio_read32(reg) _Generic((reg),			\
>> 		xe_reg_t: xe_mmio_raw_read32((reg).reg),		\
>> 		xe_reg_mcr_t: xe_gt_mcr_read32(reg),
>
>Unfortunately there's a trap here.  There are multiple ways to read (and
>write) a multicast register and it's very important that you explicitly
>pick the proper semantics to use.  Sometimes you want just want to read
>any random instance's value (e.g., when verifying that workarounds
>stuck), but in other cases you need to read and process every single
>instance independently (various error reporting, INSTDONE, etc.
>registers).  Likewise, sometimes it's okay to broadcast a single value
>into every instance of the register, but other times we potentially need
>to write different values into each instance, or even do per-instance
>RMW's to avoid clobbering other bitfields.

I still think the extra type gives more a false sense of safety here.
Anyway, it seems an area we've had issues in the past and it will be
hard to reach a consensus on changing it.  If we were talking about this
a few days ago, I'd be very sad because that'd meant my plans to fixup
the hacky #undef in the rtp part was dead in the water. However with the
new macros I think I can workaround that and work with both types.

So... going back to what I said above:

  	typedef {
  		xe_reg_t reg;
  	} xe_reg_mcr_t;

Would you be ok with that? The information I need would still be
available in the register definition.

Inside xe_gt_mcr.c it then does an up-cast.
We keep the up-cast macro private to xe_gt_mcr.c so nobody tries to use
it elsewhere. Example:

u32 xe_gt_mcr_unicast_read_any(struct xe_gt *gt, xe_reg_mcr_t mcr_reg)
{
	const xe_reg_t reg = to_xe_reg(mcr_reg);
	...
}

... probably to be done only in the few functions accessing reg.*, not
the ones passing it around. And then we probably need to rename reg to
"addr" or "off".

Thoughts?

thanks
Lucas De Marchi