Re: Reply: Regression with kernel 4.18 - AMD RX 550 fails IB ring test on power-up

Luís Mendes luis.p.mendes at gmail.com
Thu Jul 12 09:59:15 UTC 2018


Hi Christian,

Sure, how can I help to fix that?

Regards,
Luís

On Thu, Jul 12, 2018 at 8:13 AM, Christian König <christian.koenig at amd.com>
wrote:

> Hi Luis,
>
> well, what "drm/amdgpu: defer test IBs on the rings at boot (V3)" does is
> delay the IB test a bit and run it asynchronously to the rest of the bootup.
>
> So what most likely happens is that some hardware feature (like power or
> clock gating) which doesn't work correctly on your system kicks in and
> causes the IB test to fail.
>
> It's rather likely that this problem is also responsible for the crashes
> you experience later on. So I think we should concentrate on fixing that.
>
> Regards,
> Christian.
>
>
> Am 11.07.2018 um 23:27 schrieb Luís Mendes:
>
> Hi Jim,
>
> I followed your suggestion and was able to bisect the kernel patches.
> The offending patch is: drm/amdgpu: defer test IBs on the rings at boot
> (V3)
> commit:
>
> 2c773de2ecb8c327f2448bd1eecad224e9227087
> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.18-rc4&id=2c773de2ecb8c327f2448bd1eecad224e9227087>
>
> After reverting this patch the IB test succeeded with kernel v4.18-rc4 on
> both systems and the amdgpu driver was correctly loaded both on SAPPHIRE
> RX550 4GB and on SAPPHIRE RX460 2GB.
>
> The GPU hang remains, however.
> I will try to configure a remote IPMI connection to see what is happening
> during the kernel boot, or set up a serial console for the kernel.
>
> Thanks & Regards,
> Luís
>
> On Wed, Jul 11, 2018 at 10:56 AM, jimqu <jimqu at amd.com> wrote:
>
>> Hi Luis,
>>
>>
>> Let us trace the issues one by one.
>>
>>
>> IB test failure:
>>
>> This should be a regression in 4.18; you can bisect the kernel
>> patches to find it.
>>
>> GPU hang:
>>
>> Let's fix the IB test failure first.
>>
>>
>> Thanks
>>
>> JimQu
>>
>>
>>
>> On 2018-07-11 17:34, Luís Mendes wrote:
>>
>> Hi Jim,
>>
>> Thanks for your interest in this issue. Actually there are multiple
>> issues here: the IB ring test failure is not the only one, as I am having
>> quite some trouble getting the SAPPHIRE RX 550 4GB on a Tyan S7025 and the
>> SAPPHIRE RX 460 2GB on a TYAN S7002 to work, both systems using the same
>> Ubuntu 18.04 with a vanilla kernel.
>>
>> *1. Could you also test an earlier kernel? v4.17 or v4.16.*
>> I've tested kernels v4.17.5 and v4.16.6 on the same system and both are
>> able to pass the IB ring test, and the system boots into X using the NVIDIA
>> card as the display-connected card.
>> dmesg log attached for kernel 4.17.5, file
>> TYAN_S7025_kernelv4.17.5_amdgpu_IB_ring_test_OK.txt.
>>
>> *2. Could you test the issue with only amdgpu?*
>> - I've tested on a TYAN S7002 system with a single SAPPHIRE RX 460 2GB,
>> on-board VGA enabled and used as primary display.
>> Kernel v4.18-rc4 fails the IB ring test, system is able to enter X
>> through the on-board VGA.
>> dmesg log attached for kernel 4.18-rc4, file
>> TYAN_S7002_kernel_v4.18-rc4_IB_ring_test_fail.txt.
>>
>> - Same TYAN S7002 system, but now with the on-board VGA disabled and using
>> the RX 460 as the display-connected card.
>> Kernels v4.17.5 and v4.16.6 are able to pass the IB ring test, but the GPU
>> hangs before entering X. I don't have logs for these yet.
>>
>> Regards,
>> Luís Mendes
>> Aparapi contributor and MSc Researcher
>>
>>
>>
>>
>>
>> On Wed, Jul 11, 2018 at 3:49 AM, Qu, Jim <Jim.Qu at amd.com> wrote:
>>
>>> Hi Luis,
>>>
>>> 1. Could you also test an earlier kernel? v4.17 or v4.16.
>>> 2. Could you test the issue with only amdgpu?
>>>
>>> Thanks
>>> JimQu
>>>
>>> ________________________________________
>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of
>>> Luís Mendes <luis.p.mendes at gmail.com>
>>> Sent: July 11, 2018 6:04:00
>>> To: Michel Dänzer; Koenig, Christian; amd-gfx list
>>> Subject: Re: Regression with kernel 4.18 - AMD RX 550 fails IB ring test on
>>> power-up
>>>
>>> Hi,
>>>
>>> The issue remains in kernel 4.18-rc4 using the SAPPHIRE RX 550 4GB.
>>>
>>> Logs are attached.
>>>
>>> Regards,
>>> Luis
>>>
>>> On Tue, Jun 26, 2018 at 10:08 AM, Luís Mendes <luis.p.mendes at gmail.com>
>>> wrote:
>>> Hi,
>>>
>>> I've tried kernel 4.18-rc2 on a system with a NVIDIA GTX 1050 Ti and an
>>> AMD RX 550 4GB and the RX 550 card is failing the IB ring test.
>>>
>>> [    5.033217] [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: ib
>>> test failed (scratch(0xC040)=0xFFFFFFFF)
>>> [    5.033264] [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu:
>>> failed testing IB on ring 6 (-22).
>>>
>>> Please see the attached log.
>>>
>>> Regards,
>>> Luís
>>>
>>>
>>
>>
>
>