[Intel-gfx] [PATCH v2] drm/i915: Make IRQ reset and postinstall multi-gt aware
Andi Shyti
andi.shyti at linux.intel.com
Thu Apr 13 16:19:16 UTC 2023
On Thu, Apr 13, 2023 at 09:03:29AM -0700, Ceraolo Spurio, Daniele wrote:
>
>
> On 4/13/2023 8:52 AM, Matt Roper wrote:
> > On Thu, Apr 13, 2023 at 03:56:21PM +0200, Andi Shyti wrote:
> > > Hi Tvrtko,
> > >
> > > (I forgot to CC Daniele)
> > >
> > > On Thu, Apr 13, 2023 at 11:41:28AM +0100, Tvrtko Ursulin wrote:
> > > > On 13/04/2023 10:20, Andi Shyti wrote:
> > > > > From: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > > > >
> > > > > In multitile systems IRQ need to be reset and enabled per GT.
> > > > >
> > > > > Although in MTL the GUnit misc interrupts register set are
> > > > > available only in GT-0, we need to loop through all the GT's
> > > > > in order to initialize the media engine which lies on a different
> > > > > GT.
> > > > >
> > > > > Signed-off-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > > > Signed-off-by: Andi Shyti <andi.shyti at linux.intel.com>
> > > > > ---
> > > > > Hi,
> > > > >
> > > > > proposing again this patch, apparently GuC needs this patch to
> > > > > initialize the media GT.
> > > > What is the resolution for Matt's concern that this is wrong for MTL?
> > > There are two explanations, one easy and one less easy.
> > >
> > > The easy one: without this patch i915 doesn't boot on MTL!(*)
> > >
> > > The second explanation is that in MTL the media engine has it's
> > > own set of misc irq's registers and those are on a different GT
> > > (Daniele pointed this out).
> > Assuming you're talking about MTL_GUC_MGUC_INTR_MASK, that's not true;
> > it's just a single sgunit register (0x1900e8) that has different
> > bitfields for the primary GuC and the media GuC. So I still think we
> > should avoid looping over GTs; it's actually much simpler to handle
> > things in a single pass since we can just determine the single register
> > value once (all fields) and write it directly, instead of doing two
> > separate RMW updates to the same register to try to avoid clobbering
> > the other GuC's settings.
if we handle exceptions in a single pass wouldn't we have many
exceptions to handle in the long run?
> > For pre-MTL platforms, it's the same register, except that the bitfield
> > now devoted to the media GuC was previously used for something else
> > (scatter/gather).
>
> It's not just the GuC, the VCS/VECS engine programming is also tied to the
> media GT (via the HAS_ENGINE checks). It looks like we unconditionally
> program VCS 0 and 2, so it'll still work for MTL, but if we get a device
> with more VCS engines it'll break. Maybe we can add a MTL version of the
> function that just programs everything unconditionally? Going forward it
> should be ok to program things for engines that don't exist, but I'm not
> sure we can do that for older platforms that came before the extra engines
> were ever defined in HW.
This is more or less what Tvrtko has suggested, as well. Looks to
me like replicating some code... anyway, I will try and see how
it looks like.
Andi
PS Thanks Matt, Daniele and Tvrtko for the feedback.
> Daniele
>
> >
> >
> > Matt
> >
> > > I sent this patch not to bypass any review, but to restart the
> > > discussion as this patch was just dropped.
> > >
> > > Thanks,
> > > Andi
> > >
> > >
> > > (*)
> > > [drm] *ERROR* GT1: GUC: CT: No response for request 0x550a (fence 7)
> > > [drm] *ERROR* GT1: GUC: CT: Sending action 0x550a failed (-ETIMEDOUT) status=0X0
> > > [drm] *ERROR* GT1: GUC: Failed to enable usage stats: -ETIMEDOUT
> > > [drm] *ERROR* GT1: GuC initialization failed -ETIMEDOUT
> > > [drm] *ERROR* GT1: Enabling uc failed (-5)
> > > [drm] *ERROR* GT1: Failed to initialize GPU, declaring it wedged!
More information about the dri-devel
mailing list