Looking for clarifications around gfx/kcq/kiq

Yann Dirson ydirson at free.fr
Fri Dec 10 20:36:32 UTC 2021


> > amdgpu_ring_alloc() itself is unconditionally setting count_dw, which
> > looked suspicious to me -- so I added the check shown below, and it
> > does look like ring_alloc() gets called again too soon.  Am I right in
> > thinking this could be the cause of amdgpu_ring_test_helper() failing
> > in timeout ?
> >
> 
> Not likely.  The PSP failing to load firmware is most likely the
> problem.  You need to have a functional PSP for any of the other
> engines to be usable.  If we can't load the firmware for the
> microcontrollers, the driver can't interact with them.

Even if it has no effect on my primary issue, I still have doubts about
this: if we call amdgpu_ring_alloc() twice without ensuring that the
previously allocated space has been padded with nops (i.e. 0xFFFFFFFF,
right?), what happens when the GFX IP (or should we rather say "GC"?)
parses that space?

My reading of gfx_enable_kcq() is that it falls into this case.  Isn't
it missing a call to ring_commit() before ring_test()?
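
To make the question concrete, here is a toy userspace model of the
alloc/write/commit sequence.  It is only a sketch under stated
assumptions, not the driver's code: every toy_* name is hypothetical,
0xFFFFFFFF is merely assumed to be the nop value, and padding the
unused reservation at commit time is this model's own choice (the real
amdgpu_ring_commit() may pad by a different rule).  The model counts
how often an alloc lands on a reservation that is still open, which is
what the WARN_ON_ONCE(ring->count_dw) check quoted below looks for.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_RING_DW   64u                 /* ring size in dwords (power of two) */
#define TOY_RING_MASK (TOY_RING_DW - 1u)
#define TOY_NOP       0xFFFFFFFFu         /* assumed nop value, see lead-in */

/* Hypothetical, simplified stand-in for a command ring. */
struct toy_ring {
	uint32_t buf[TOY_RING_DW];
	uint32_t wptr;      /* CPU-side write position */
	uint32_t hw_wptr;   /* position published to the consumer */
	uint32_t count_dw;  /* dwords reserved but not yet written */
	unsigned overlaps;  /* allocs made while a reservation was open */
};

void toy_ring_init(struct toy_ring *r)
{
	*r = (struct toy_ring){ 0 };
}

/* Reserve ndw dwords; like the code under discussion, this overwrites
 * count_dw unconditionally, but it records when that discards an
 * earlier, still-open reservation. */
int toy_ring_alloc(struct toy_ring *r, uint32_t ndw)
{
	if (ndw > TOY_RING_DW)
		return -1;
	if (r->count_dw != 0)
		r->overlaps++;
	r->count_dw = ndw;
	return 0;
}

/* Write one dword into the reserved space. */
int toy_ring_write(struct toy_ring *r, uint32_t v)
{
	if (r->count_dw == 0)
		return -1;
	r->buf[r->wptr & TOY_RING_MASK] = v;
	r->wptr++;
	r->count_dw--;
	return 0;
}

/* Pad what is left of the reservation with nops, then publish wptr. */
void toy_ring_commit(struct toy_ring *r)
{
	while (r->count_dw != 0)
		toy_ring_write(r, TOY_NOP);
	r->hw_wptr = r->wptr;
}

/* alloc, partial write, second alloc without commit, then commit. */
unsigned toy_demo(void)
{
	struct toy_ring r;
	uint32_t i;

	toy_ring_init(&r);
	toy_ring_alloc(&r, 8);
	for (i = 0; i < 3; i++)
		toy_ring_write(&r, 0x1000u + i);
	toy_ring_alloc(&r, 4);          /* overlaps the first reservation */
	for (i = 0; i < 2; i++)
		toy_ring_write(&r, 0x2000u + i);
	assert(r.hw_wptr == 0);         /* nothing published before commit */
	toy_ring_commit(&r);
	printf("overlaps=%u hw_wptr=%u\n", r.overlaps, (unsigned)r.hw_wptr);
	return r.overlaps;
}
```

Calling toy_demo() from a main() reports one overlapping alloc.  In
this model the second alloc does not lose the dwords already written:
only the leftover reservation count is overwritten, and the single
commit at the end publishes both batches at once.  Whether the real
rings behave the same way is exactly what I am asking.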

> 
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > @@ -70,6 +70,9 @@ int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw)
> >         if (WARN_ON_ONCE(ndw > ring->max_dw))
> >                 return -ENOMEM;
> >
> > +       /* check we're not allocating too fast */
> > +       WARN_ON_ONCE(ring->count_dw);
> > +
> >         ring->count_dw = ndw;
> >         ring->wptr_old = ring->wptr;
> >
> >
> > About gfx_v9_0_sw_fini():
> > - the 2 calls to bo_free are called here without condition, whereas
> >   they are allocated from rlc_init, not directly from sw_init.  Is
> >   this asymmetry wanted ?
> >
> >
> > Maybe such info should join the documentation at some point?
> 
> Yeah, would be useful.
> 
> Alex
> 
> >
> > [0]
> > https://lists.freedesktop.org/archives/amd-gfx/2021-November/071855.html
> 

