[PATCH V2 00/10] Reset improvements for GC10+

Alex Deucher alexdeucher at gmail.com
Fri May 23 14:39:04 UTC 2025


On Fri, May 23, 2025 at 10:12 AM Alex Deucher <alexdeucher at gmail.com> wrote:
>
> On Fri, May 23, 2025 at 10:03 AM Christian König
> <christian.koenig at amd.com> wrote:
> >
> > On 5/23/25 15:58, Alex Deucher wrote:
> > > I think that's probably the best option.  I was thinking we could
> > > mirror the ring frames for each gang and after a reset, we submit the
> > > unprocessed frames again.  That way we can still do a ring test to
> > > make sure the ring is functional after the reset and then submit the
> > > unprocessed work.
> >
> > Keep in mind that we can't allocate any memory during submission or in a reset.
>
> Yeah, I was thinking we'd just have a static mirror allocated upfront.
>
> >
> > I think we should just tell the newly mapped kernel ring to start from the known good RPTR and process up to whatever the current WPTR is. Only after that should an IB test be inserted.
>
> I considered that, but we don't know if the reset worked or not
> without some sort of test.  I guess we could put an IB test at the
> end, but it may take a while if there is a lot of content to process.
> I guess that's not really fundamentally different from how vmid reset
> is supposed to work anyway.  We should be able to set the requested
> wptr/rptr in the MQD when we map the ring after the reset.

I think I've got something workable.  What's the best way to keep
track of the last known good RPTR?

Alex

>
> >
> > We could also modify the conditional code used for MCBP to skip processing for a specific VMID by applying a mask instead of always checking for 0 and 1.
>
> How would that work?  I haven't paged that into my head in a while.
>
> Alex
>
> >
> > Christian.
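
On the question of tracking the last known good RPTR: one possible approach, sketched here as a userspace C toy and not as amdgpu code, is to record the WPTR each time a fence is emitted and to treat the recorded value for the last signaled fence as the known good RPTR. Everything between that point and the current WPTR is then the unprocessed work, which can be copied out of a mirror allocated upfront, so nothing is allocated during submission or reset. All names below (toy_ring, toy_ring_emit_fence, and so on) are hypothetical, and the sizes are arbitrary assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_DW    16u              /* assumed ring size in dwords, power of two */
#define RING_MASK  (RING_DW - 1u)
#define MAX_FENCES 8u               /* assumed cap on outstanding fences */

/* Hypothetical toy ring; zero-initialize before use. */
struct toy_ring {
    uint32_t buf[RING_DW];          /* stands in for the hardware ring */
    uint32_t mirror[RING_DW];       /* static mirror, allocated upfront */
    uint64_t wptr;                  /* monotonically increasing, in dwords */
    uint64_t fence_end[MAX_FENCES]; /* WPTR recorded right after each fence */
    uint32_t last_emitted_seq;
    uint32_t last_signaled_seq;
};

/* WPTR recorded for the last signaled fence, or 0 if none has signaled. */
uint64_t toy_ring_known_good_rptr(const struct toy_ring *r)
{
    if (r->last_signaled_seq == 0)
        return 0;
    return r->fence_end[r->last_signaled_seq % MAX_FENCES];
}

/* Write one dword to both the ring and its mirror. */
void toy_ring_write(struct toy_ring *r, uint32_t dw)
{
    /* Never overwrite work that has not been confirmed as processed. */
    assert(r->wptr - toy_ring_known_good_rptr(r) < RING_DW);
    r->buf[r->wptr & RING_MASK] = dw;
    r->mirror[r->wptr & RING_MASK] = dw;
    r->wptr++;
}

/* Close a submission: remember where it ended and return its sequence. */
uint32_t toy_ring_emit_fence(struct toy_ring *r)
{
    uint32_t seq = ++r->last_emitted_seq;

    r->fence_end[seq % MAX_FENCES] = r->wptr;
    return seq;
}

/* Called when the fence with sequence number seq has signaled. */
void toy_ring_fence_signaled(struct toy_ring *r, uint32_t seq)
{
    r->last_signaled_seq = seq;
}

/*
 * Copy the dwords between the known good RPTR and the current WPTR from
 * the mirror into a caller-provided buffer. No allocation happens here.
 */
size_t toy_ring_collect_unprocessed(const struct toy_ring *r,
                                    uint32_t *out, size_t max)
{
    uint64_t p = toy_ring_known_good_rptr(r);
    size_t n = 0;

    while (p != r->wptr && n < max) {
        out[n++] = r->mirror[p & RING_MASK];
        p++;
    }
    return n;
}
```

After a reset, the collected dwords could be resubmitted following a ring test, or the remapped ring could simply be pointed at the known good RPTR and the current WPTR; the sketch does not model either step, nor the MQD programming.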


More information about the amd-gfx mailing list