[Intel-gfx] [PATCH v2] drm/i915: Make IRQ reset and postinstall multi-gt aware

Matt Roper matthew.d.roper at intel.com
Thu Apr 13 16:38:02 UTC 2023


On Thu, Apr 13, 2023 at 06:19:16PM +0200, Andi Shyti wrote:
> On Thu, Apr 13, 2023 at 09:03:29AM -0700, Ceraolo Spurio, Daniele wrote:
> > 
> > 
> > On 4/13/2023 8:52 AM, Matt Roper wrote:
> > > On Thu, Apr 13, 2023 at 03:56:21PM +0200, Andi Shyti wrote:
> > > > Hi Tvrtko,
> > > > 
> > > > (I forgot to CC Daniele)
> > > > 
> > > > On Thu, Apr 13, 2023 at 11:41:28AM +0100, Tvrtko Ursulin wrote:
> > > > > On 13/04/2023 10:20, Andi Shyti wrote:
> > > > > > From: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > > > > > 
> > > > > > In multitile systems IRQ need to be reset and enabled per GT.
> > > > > > 
> > > > > > Although in MTL the GUnit misc interrupts register set are
> > > > > > available only in GT-0, we need to loop through all the GT's
> > > > > > in order to initialize the media engine which lies on a different
> > > > > > GT.
> > > > > > 
> > > > > > Signed-off-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > > > > Signed-off-by: Andi Shyti <andi.shyti at linux.intel.com>
> > > > > > ---
> > > > > > Hi,
> > > > > > 
> > > > > > proposing again this patch, apparently GuC needs this patch to
> > > > > > initialize the media GT.
> > > > > What is the resolution for Matt's concern that this is wrong for MTL?
> > > > There are two explanations, one easy and one less easy.
> > > > 
> > > > The easy one: without this patch i915 doesn't boot on MTL!(*)
> > > > 
> > > > The second explanation is that in MTL the media engine has it's
> > > > own set of misc irq's registers and those are on a different GT
> > > > (Daniele pointed this out).
> > > Assuming you're talking about MTL_GUC_MGUC_INTR_MASK, that's not true;
> > > it's just a single sgunit register (0x1900e8) that has different
> > > bitfields for the primary GuC and the media GuC.  So I still think we
> > > should avoid looping over GTs; it's actually much simpler to handle
> > > things in a single pass since we can just determine the single register
> > > value once (all fields) and write it directly, instead of doing two
> > > separate RMW updates to the same register to try to avoid clobbering
> > > the other GuC's settings.
> 
> if we handle exceptions in a single pass wouldn't we have many
> exceptions to handle in the long run?

I don't think so, it basically boils down to something along the lines
of

        if (MEDIA_VER(i915) >= 13)
                val = HIGH_BITS | LOW_BITS;
        else
                val = HIGH_BITS;

        ...

        intel_uncore_write(val);

which isn't really any more complicated than today's logic:

        called for each gt {
                ...

                if (gt is MEDIA)
                        bits = LOW_BITS;
                else
                        bits = HIGH_BITS;

                ...

                intel_uncore_rmw(bits);
        }


Matt

> 
> > > For pre-MTL platforms, it's the same register, except that the bitfield
> > > now devoted to the media GuC was previously used for something else
> > > (scatter/gather).
> > 
> > It's not just the GuC, the VCS/VECS engine programming is also tied to the
> > media GT (via the HAS_ENGINE checks). It looks like we unconditionally
> > program VCS 0 and 2, so it'll still work for MTL, but if we get a device
> > with more VCS engines it'll break. Maybe we can add a MTL version of the
> > function that just programs everything unconditionally? Going forward it
> > should be ok to program things for engines that don't exist, but I'm not
> > sure we can do that for older platforms that came before the extra engines
> > were ever defined in HW.
> 
> This is more or less what Tvrtko has suggested, as well. Looks to
> me like replicating some code... anyway, I will try and see how
> it looks like.
> 
> Andi
> 
> PS Thanks Matt, Daniele and Tvrtko for the feedback.
> 
> > Daniele
> > 
> > > 
> > > 
> > > Matt
> > > 
> > > > I sent this patch not to bypass any review, but to restart the
> > > > discussion as this patch was just dropped.
> > > > 
> > > > Thanks,
> > > > Andi
> > > > 
> > > > 
> > > > (*)
> > > > [drm] *ERROR* GT1: GUC: CT: No response for request 0x550a (fence 7)
> > > > [drm] *ERROR* GT1: GUC: CT: Sending action 0x550a failed (-ETIMEDOUT) status=0X0
> > > > [drm] *ERROR* GT1: GUC: Failed to enable usage stats: -ETIMEDOUT
> > > > [drm] *ERROR* GT1: GuC initialization failed -ETIMEDOUT
> > > > [drm] *ERROR* GT1: Enabling uc failed (-5)
> > > > [drm] *ERROR* GT1: Failed to initialize GPU, declaring it wedged!

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


More information about the Intel-gfx mailing list