[PATCH] drm/amdgpu: fix a kcq hang issue for SRIOV

Christian König christian.koenig at amd.com
Wed Mar 28 07:36:51 UTC 2018


On 28.03.2018 at 06:36, Liu, Monk wrote:
> The SDMA is not directly connected to the GFXHUB, so even if the SDMA provided a single command for this, the write/wait would still be executed as two operations.
>
> I don't understand this point; could you give more details?
>
> For SDMA, from ucode v148 on, it ignores the PREEMPT command while it is doing SRBM_WRITE and POLL_MEM_REG on registers, so as long as SDMA is doing a VM invalidate, the world switch will not
> interrupt it.

Ah! Good to know, I was assuming that the GFX block might actually 
switch anyway in this case.

Christian.

>
> /Monk
>
> -----Original Message-----
> From: Koenig, Christian
> Sent: March 28, 2018 0:30
> To: Alex Deucher <alexdeucher at gmail.com>
> Cc: Deng, Emily <Emily.Deng at amd.com>; Liu, Monk <Monk.Liu at amd.com>; amd-gfx list <amd-gfx at lists.freedesktop.org>
> Subject: Re: [PATCH] drm/amdgpu: fix a kcq hang issue for SRIOV
>
> On 27.03.2018 at 17:52, Alex Deucher wrote:
>> [SNIP]
>>>> 2. add the new callback implementation to gfx9 and gfx8 (I think
>>>> gfx8 will need this as well since we support sr-iov there too)
>>> gfx8 doesn't have the hardware bug which seems to make this
>>> necessary, nor does it have the same VMHUB design as gfx9.
>> Oh, right, in this case it's the req/ack engines which were new for
>> soc15.  We may want the same fix for sdma4 though.
> And exactly that is one of the reasons why this workaround doesn't work correctly.
>
> The SDMA is not directly connected to the GFXHUB, so even if the SDMA provided a single command for this, the write/wait would still be executed as two operations.
>
> In other words, we can run into the problem again, and the same applies to CPU-based updates.
>
> The only real workaround would be to write the request, read the register back and, if the write didn't succeed, write it again.
>
> But seriously, remember that this issue is not limited to the VMHUB registers. Do you want to write and read back every register to make sure the write succeeded?
>
> Regards,
> Christian.
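
For illustration only, the write / read back / rewrite workaround described in the quoted mail could be sketched roughly as below. This is a standalone simulation, not amdgpu code: `struct fake_reg`, `reg_write`, `reg_read` and `reg_write_verified` are hypothetical names, the "hardware" is a plain struct that silently drops a configurable number of writes, and the sketch assumes the register reads back exactly the value that was written, which may not hold for real request/ack registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a hardware register that may lose writes. */
struct fake_reg {
    uint32_t value;       /* current register contents */
    unsigned drop_writes; /* number of upcoming writes that get lost */
};

/* Simulated register write: silently dropped while drop_writes > 0. */
static void reg_write(struct fake_reg *reg, uint32_t v)
{
    if (reg->drop_writes > 0) {
        reg->drop_writes--;
        return;
    }
    reg->value = v;
}

/* Simulated register read. */
static uint32_t reg_read(const struct fake_reg *reg)
{
    return reg->value;
}

/*
 * Write v, read the register back, and write again if the value did not
 * stick. Gives up after max_tries attempts. Returns true once the
 * read-back matches v, false otherwise.
 */
static bool reg_write_verified(struct fake_reg *reg, uint32_t v,
                               unsigned max_tries)
{
    assert(max_tries > 0);

    for (unsigned i = 0; i < max_tries; i++) {
        reg_write(reg, v);
        if (reg_read(reg) == v)
            return true;
    }
    return false;
}
```

Every verified write costs at least one extra read, which is the overhead the mail objects to if this had to be done for every register rather than only the VMHUB ones.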


