Re: Re: Regression with kernel 4.18 - AMD RX 550 fails IB ring test on power-up

Christian König christian.koenig at amd.com
Thu Jul 12 07:13:49 UTC 2018


Hi Luis,

Well, what "drm/amdgpu: defer test IBs on the rings at boot (V3)" does is
delay the IB test a bit and run it asynchronously to the rest of the bootup.

So what most likely happens is that some hardware feature (like power or
clock gating) which doesn't work correctly on your system kicks in and
causes the IB test to fail.

It's rather likely that this problem is also responsible for the crashes
you experience later on. So I think we should concentrate on fixing that.

Regards,
Christian.

On 11.07.2018 at 23:27, Luís Mendes wrote:
> Hi Jim,
>
> I followed your suggestion and was able to bisect the kernel patches.
> The offending patch is: drm/amdgpu: defer test IBs on the rings at 
> boot (V3)
> commit:
>
> 	2c773de2ecb8c327f2448bd1eecad224e9227087 
> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.18-rc4&id=2c773de2ecb8c327f2448bd1eecad224e9227087> 
>
>
>
> After reverting this patch the IB test succeeded with kernel v4.18-rc4 
> on both systems and the amdgpu driver was correctly loaded both on 
> SAPPHIRE RX550 4GB and on SAPPHIRE RX460 2GB.
>
> The GPU hang remains, however.
> I will try to configure a remote IPMI connection to see what is
> happening with the kernel boot, or set up a serial console for the kernel.
>
> Thanks & Regards,
> Luís
>
> On Wed, Jul 11, 2018 at 10:56 AM, jimqu <jimqu at amd.com> wrote:
>
>     Hi Luis,
>
>
>     Let us trace the issue one by one.
>
>
>     IB test failure:
>
>     This should be a regression in 4.18; you can bisect the kernel
>     patches.
>
>     GPU hang:
>
>     Fix the IB test failure first.
>
>
>     Thanks
>
>     JimQu
>
>
>
>     On 2018-07-11 17:34, Luís Mendes wrote:
>>     Hi Jim,
>>
>>     Thanks for your interest in this issue. Actually there are
>>     multiple issues, not only the failing IB ring test, as I am
>>     having quite some trouble getting the SAPPHIRE RX 550 4GB on a
>>     Tyan S7025 and the SAPPHIRE RX 460 2GB on a TYAN S7002 to work,
>>     both systems using the same Ubuntu 18.04 with a vanilla kernel.
>>
>>     *1. Could you also test an earlier kernel, v4.17 or v4.16?*
>>     I've tested kernels v4.17.5 and v4.16.6 with same system and both
>>     are able to pass the IB ring test and system boots into X using
>>     NVIDIA as the display connected card.
>>     dmesg log attached for kernel 4.17.5, file
>>     TYAN_S7025_kernelv4.17.5_amdgpu_IB_ring_test_OK.txt.
>>
>>     *2. Could you test the issue with only amdgpu?*
>>     - I've tested on a TYAN S7002 system with a single SAPPHIRE RX
>>     460 2GB, on-board VGA enabled and used as primary display.
>>     Kernel v4.18-rc4 fails the IB ring test, system is able to enter
>>     X through the on-board VGA.
>>     dmesg log attached for kernel 4.18-rc4, file
>>     TYAN_S7002_kernel_v4.18-rc4_IB_ring_test_fail.txt.
>>
>>     - Same TYAN S7002 system, but now with on-board VGA disabled and
>>     using RX 460 as display connected card.
>>     Kernels v4.17.5 and v4.16.6 are able to pass the IB ring test,
>>     but GPU hangs before entering X. Don't have logs for these yet.
>>
>>     Regards,
>>     Luís Mendes
>>     Aparapi contributor and MSc Researcher
>>
>>
>>
>>
>>
>>     On Wed, Jul 11, 2018 at 3:49 AM, Qu, Jim <Jim.Qu at amd.com> wrote:
>>
>>         Hi Luis,
>>
>>         1. Could you also test an earlier kernel, v4.17 or v4.16?
>>         2. Could you test the issue with only amdgpu?
>>
>>         Thanks
>>         JimQu
>>
>>         ________________________________________
>>         From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of
>>         Luís Mendes <luis.p.mendes at gmail.com>
>>         Sent: July 11, 2018 6:04:00
>>         To: Michel Dänzer; Koenig, Christian; amd-gfx list
>>         Subject: Re: Regression with kernel 4.18 - AMD RX 550 fails IB
>>         ring test on power-up
>>
>>         Hi,
>>
>>         Issue remains in kernel 4.18-rc4 using SAPPHIRE RX 550 4GB.
>>
>>         Logs follow attached.
>>
>>         Regards,
>>         Luis
>>
>>         On Tue, Jun 26, 2018 at 10:08 AM, Luís Mendes
>>         <luis.p.mendes at gmail.com> wrote:
>>         Hi,
>>
>>         I've tried kernel 4.18-rc2 on a system with an NVIDIA GTX 1050
>>         Ti and an AMD RX 550 4GB and the RX 550 card is failing the
>>         IB ring test.
>>
>>         [    5.033217] [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR*
>>         amdgpu: ib test failed (scratch(0xC040)=0xFFFFFFFF)
>>         [    5.033264] [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR*
>>         amdgpu: failed testing IB on ring 6 (-22).
>>
>>         Please see the attached log.
>>
>>         Regards,
>>         Luís
>>
>>
>
>
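
For anyone comparing logs across kernels in this thread, the two dmesg lines quoted in the original report can be picked out of a saved log with a small script. This is only a minimal sketch: the function name find_ib_failures and the SAMPLE list are made up for illustration, and the patterns are modelled solely on the two lines quoted above, so other kernel versions may word these messages differently.

```python
import re

# Patterns modelled on the two dmesg lines quoted in the original report.
# Assumption: each message sits on a single line in the real log (the
# archive wrapped them); other kernels may phrase them differently.
PREFIX = r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:(?P<func>\w+) \[amdgpu\]\] \*ERROR\* "
SCRATCH_RE = re.compile(
    PREFIX
    + r"amdgpu: ib test failed "
    + r"\(scratch\((?P<reg>0x[0-9A-Fa-f]+)\)=(?P<val>0x[0-9A-Fa-f]+)\)"
)
RING_RE = re.compile(
    PREFIX
    + r"amdgpu: failed testing IB on ring (?P<ring>\d+) \((?P<err>-?\d+)\)"
)


def find_ib_failures(lines):
    """Return one dict per IB test failure line found in the given lines."""
    failures = []
    for line in lines:
        m = SCRATCH_RE.search(line)
        if m:
            failures.append({
                "kind": "scratch",
                "time": float(m["ts"]),
                "reg": m["reg"],
                "value": int(m["val"], 16),
            })
            continue
        m = RING_RE.search(line)
        if m:
            failures.append({
                "kind": "ring",
                "time": float(m["ts"]),
                "ring": int(m["ring"]),
                "errno": int(m["err"]),
            })
    return failures


# The two lines from the report, rejoined onto single lines.
SAMPLE = [
    "[    5.033217] [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* "
    "amdgpu: ib test failed (scratch(0xC040)=0xFFFFFFFF)",
    "[    5.033264] [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* "
    "amdgpu: failed testing IB on ring 6 (-22).",
]

for failure in find_ib_failures(SAMPLE):
    print(failure)
```

The -22 in the second line corresponds to -EINVAL on Linux. A check like this could also serve as the good/bad predicate while bisecting, but how the log gets captured on each boot is left open here.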
