[Intel-xe] [PATCH 11/17] drm/xe: Introduce xe_reg_t

Lucas De Marchi lucas.demarchi at intel.com
Wed Apr 19 21:09:20 UTC 2023


On Wed, Apr 19, 2023 at 01:24:31PM -0700, Matt Roper wrote:
>On Wed, Apr 19, 2023 at 01:19:49PM -0700, Lucas De Marchi wrote:
>> On Wed, Apr 19, 2023 at 03:30:17PM -0400, Rodrigo Vivi wrote:
>> > On Wed, Apr 19, 2023 at 11:49:56AM -0700, Lucas De Marchi wrote:
>> > > On Wed, Apr 19, 2023 at 10:33:45AM -0700, Matt Roper wrote:
>> > > > On Wed, Apr 19, 2023 at 12:44:34AM -0700, Lucas De Marchi wrote:
>> > > > > Stop using i915 types for register our own xe_reg_t. Differently from
>> > > > > i915, this will keep under this will keep under the register definition
>> > > > > the knowledge for the different types of registers. For now, the "flags"
>> > > > > are mcr and masked, although only the former is being used.
>> > > > >
>> > > > > Most of the driver is agnostic to the register differences. Convert the
>> > > > > few places that care about that, namely xe_gt_mcr.c, to take the generic
>> > > > > type and warn if the wrong register is used.
>> > > >
>> > > > The disadvantage of this approach is that we don't get the nice
>> > > > type-checking that we have in i915 to catch register misuse at build
>> > > > time.  Instead we wind up with a bunch of run-time checks that only tell
>> > > > you that you used the wrong register semantics after the fact.  Wouldn't
>> > > > it be better to keep the strict types and let the compiler find mistakes
>> > > > for us at build time?
>> > > >
>> > > > The only real problem with using strict types is that there are a few
>> > > > places where the driver is just trying to refer to registers/offsets
>> > > > without actually accessing them (e.g., whitelists, OA register groups,
>> > >
>> > > that is not true.
>> > >
>> > > > etc.) and the strict typing makes it harder to provide a heterogeneous
>> > > > list of those registers.  But I still feel the benefits of compile-time
>> > > > checking outweighs the extra complexity we wind up with in a handful of
>> > > > places.
>> > >
>> > > I disagree here because the strict type checking is only checking for
>> > > "regular vs mcr". Add "masked" in the mix (that has been a huge source
>> > > of mistakes in i915) and you already have 4 types(?) to deal with.
>> > >
>> > > There are also places in the code where we want that "this is an
>> > > mcr/regular/masked register" info to be used later. See how the reg-sr is
>> > > handling that.
>> > >
>> > > In the end I believe having this info encoded in the regs/*.h is
>> > > sufficient to avoid the mistakes we had in the past.
>> > >
>> > > Any change in the code triggering these warnings has a very easy an
>> > > actionable fix.
>> > >
>> > > If for some reason we want to only differentiate mcr/normal as strict
>> > > type checks,  then I think we need to provide a better struct for the
>> > > places that don't care about that. Something like below maybe
>> > >
>> > > 	typedef {
>> > > 		xe_reg_t reg;
>> > > 	} xe_reg_mcr_t;
>> > >
>> > > but reg.mcr still being there. Or going the other way around:
>> > >
>> > > 	typedef union {
>> > > 		xe_reg_normal_t normal;
>> > > 		xe_reg_mcr_t	mcr;
>> > > 	} xe_reg_t;
>> > >
>> > > but xe_reg_normal_t and xe_reg_mcr_t actually having the same defintion.
>> > >
>> > > Lucas De Marchi
>> >
>> > I honestly liked the approach you took on this patch here. Getting rid of
>> > the i915_mcr is already a win by itself.
>> >
>> > One case that came to my mind though is some time sensitive mmio operations
>> > like display inside vblank areas. any runtime check could potentially
>> > increase latency, no?! If that's irrelevant time, than let's move with this
>> > but if it is really a problem, then explore your latest suggestion here
>> > with the compiler doing the checks for us seems a good alternative.
>>
>> the additional check is inside xe_gt_mcr.c... I think the additional
>> check is negligible compared to the other operations there, with each
>> access to reg X requiring other 2 mmio accesses to take place.
>
>The super critical checks (which don't exist yet in this series; they'd
>presumably be added in a follow-up series) are the ones to make sure you
>don't try to call xe_mmio_{read,write}32() on an MCR register.  Doing
>that can lead to all kinds of subtle bugs (which become even more
>problematic on the more recent platforms), and unfortunately that kind
>of mistake is what people are going to write in the code 99% of the time
>out of habit.

I'm not sure when that would happen... when the next GPU changes a register
from normal to MCR, we rightfully updated the header to update its type
and forgot to change the calls using that register?

Currently xe_mmio_* receives u32 and we just do FOO.reg, with reg being the
name both in i915_reg_t and i915_mcr_reg_t. So, right now the different
types are not helping anything.  I'm planning to add xe_reg_t to
xe_mmio_* and have a companion xe_mmio_raw_* for the cases we use a u32
(one of the reasons being because xe_gt_mcr.c itself also needs to call
xe_mmio_*)

How does the type protect us against "daily task of enabling the next
generation of GPUs?":

	- gpu version N+1 changed register FOO from normal to MCR.
	  we didn't notice the register changed. Type doesn't help
	  because we are still using the old one.  If we did notice,
	  the type would only help if adopted the approach "change the
	  type", build and see what breaks to fix it, which is IMO
	  worse than a git-grep approach.

That would mean the hw changed a register from being MCR to being a
normal register which is more rare than the oppposite: a normal register
becoming MCR.

Even if we do the strict type, it doesn't protect us from such mistakes
when we have a normal register becoming MCR. We will have this:

	#define FOO	XE_REG(10)

and will continue to happily call xe_mmio_* even if in version X of the
GPU the register is now mcr.

Hence my point: the header with the register definition must be the
source of truth on the different register behaviors*.  The additional
type checks only help in some cases, and make it harder to handle
others.

On the other hand, I think the different types would be handy when/if we
have the following:

	#define xe_mmio_read32(reg) _Generic((reg),			\
		xe_reg_t: xe_mmio_raw_read32((reg).reg),		\
		xe_reg_mcr_t: xe_gt_mcr_read32(reg),
	)

because that would eliminate an entire class of bugs, encoding in each
call place what is the right function to be called based on the
declaration in the header, with no additional overhead... 

Lucas De Marchi

* https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/204 and ideally
   be generated from the spec with proper filters in place so we don't
   blow out the size of our headers.  My goal is to clean our headers and
   have them in a state that would be 90% identical to something
   generated by a script parsing the spec.

>
>
>Matt
>
>>
>> Besides, there is no such user in display/
>>
>> 	$ git grep xe_gt_mcr -- drivers/gpu/drm/xe/display/
>>
>>
>> Lucas De Marchi
>
>-- 
>Matt Roper
>Graphics Software Engineer
>Linux GPU Platform Enablement
>Intel Corporation


More information about the Intel-xe mailing list