Looking for clarifications around gfx/kcq/kiq
Yann Dirson
ydirson at free.fr
Fri Dec 10 20:36:32 UTC 2021
> > amdgpu_ring_alloc() itself is unconditionally setting count_dw, which
> > looked suspicious to me -- so I added the check shown below, and it
> > does look like ring_alloc() gets called again too soon. Am I right in
> > thinking this could be the cause of amdgpu_ring_test_helper() failing
> > in timeout ?
> >
>
> Not likely. The PSP failing to load firmware is most likely the
> problem. You need to have a functional PSP for any of the other
> engines to be usable. If we can't load the firmware for the
> microcontrollers, the driver can't interact with them.

Even if it has no effect on my primary issue, I still have doubts about
this: if we call amdgpu_ring_alloc() twice without ensuring the allocated
space has been padded with nops (i.e. 0xFFFFFFFF, right?), what happens
when the GFX IP (or should we rather say "GC"?) parses that space?

My reading of gfx_enable_kcq() is that this is what happens there. Isn't
it missing a call to ring_commit() before ring_test()?
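
To make the question concrete, here is a toy user-space model of the
scenario. It is not driver code: struct toy_ring, the toy_* helpers and
TOY_POISON are all hypothetical names, and the behaviour of write and
commit is my assumption (a write advances wptr by one dword, a commit only
publishes wptr and resets count_dw). Only the unconditional assignment in
the alloc step is taken from the hunk quoted below.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_DW 16u          /* toy ring size in dwords (power of two) */
#define TOY_POISON  0xDEADBEEFu  /* marks dwords that were never written */

/* Hypothetical stand-in for struct amdgpu_ring; only the fields discussed. */
struct toy_ring {
	uint32_t buf[TOY_RING_DW];
	uint32_t wptr;      /* CPU-side write pointer, in dwords */
	uint32_t wptr_old;  /* wptr at the start of the latest allocation */
	uint32_t hw_wptr;   /* last wptr published to the imaginary engine */
	int count_dw;       /* dwords left in the current reservation */
	unsigned realloc;   /* allocations made while one was still open */
};

/* Mirrors the quoted hunk: count_dw is overwritten unconditionally. */
void toy_ring_alloc(struct toy_ring *r, unsigned ndw)
{
	if (r->count_dw)
		r->realloc++;   /* what the added WARN_ON_ONCE() would flag */
	r->count_dw = (int)ndw;
	r->wptr_old = r->wptr;
}

/* Assumption: a write stores one dword and advances wptr. */
void toy_ring_write(struct toy_ring *r, uint32_t v)
{
	assert(r->count_dw > 0);
	r->buf[r->wptr & (TOY_RING_DW - 1)] = v;
	r->wptr++;
	r->count_dw--;
}

/* Assumption: commit only publishes wptr and drops the unused reservation. */
void toy_ring_commit(struct toy_ring *r)
{
	r->hw_wptr = r->wptr;
	r->count_dw = 0;
}

/* Two allocations, one commit: counts published dwords never written. */
unsigned toy_double_alloc_demo(void)
{
	struct toy_ring r = {0};
	unsigned i, unwritten = 0;

	for (i = 0; i < TOY_RING_DW; i++)
		r.buf[i] = TOY_POISON;

	toy_ring_alloc(&r, 8);   /* reserve 8 dwords, use only 3 */
	toy_ring_write(&r, 0x1);
	toy_ring_write(&r, 0x2);
	toy_ring_write(&r, 0x3);

	toy_ring_alloc(&r, 4);   /* second alloc, no commit in between */
	toy_ring_write(&r, 0x4);
	toy_ring_write(&r, 0x5);
	toy_ring_commit(&r);

	for (i = 0; i < r.hw_wptr; i++)
		if (r.buf[i] == TOY_POISON)
			unwritten++;
	return unwritten;
}
```

In this model toy_double_alloc_demo() returns 0: the second alloc only
resets the dword budget, the writes stay contiguous, and nothing below the
published wptr is left unwritten. Whether the real ring behaves the same
depends on whether those assumptions hold for amdgpu_ring_write() and
amdgpu_ring_commit(), which is what I would like to have confirmed.
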
>
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > @@ -70,6 +70,9 @@ int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw)
> >  	if (WARN_ON_ONCE(ndw > ring->max_dw))
> >  		return -ENOMEM;
> >
> > +	/* check we're not allocating too fast */
> > +	WARN_ON_ONCE(ring->count_dw);
> > +
> >  	ring->count_dw = ndw;
> >  	ring->wptr_old = ring->wptr;
> >
> >
> > About gfx_v9_0_sw_fini():
> > - the 2 calls to bo_free are called here without condition, whereas
> >   they are allocated from rlc_init, not directly from sw_init. Is this
> >   asymmetry wanted ?
> >
> >
> > Maybe such info should join the documentation at some point?
>
> Yeah, would be useful.
>
> Alex
>
> >
> > [0] https://lists.freedesktop.org/archives/amd-gfx/2021-November/071855.html
>