amdgpu hangs on boot or shutdown on AMD Raven Ridge CPU (Engineer Sample)

Chris Chiu chiu at endlessm.com
Thu Feb 1 13:13:50 UTC 2018


On Thu, Feb 1, 2018 at 12:08 AM, Harry Wentland <harry.wentland at amd.com> wrote:
> On 2018-01-31 09:31 AM, Chris Chiu wrote:
>> Hi,
>>     We are working with new laptops that have the AMD Raven Ridge
>> chipset with this `/proc/cpuinfo`
>> https://gist.github.com/mschiu77/b06dba574e89b9a30cf4c450eaec49bc
>>
>>     With the latest kernel 4.15, there're lots of different
>> panics/oops during boot so no chance to get into X. It also happens
>> during shutdown. Then I tried to build kernel from
>> git://people.freedesktop.org/~agd5f/linux on branch
>> amd-staging-drm-next with head on commit "drm: Fix trailing semicolon"
>> and update the linux-firmware. Things seem to get better, only 1 oops
>> observed. Here's the oops
>> https://gist.github.com/mschiu77/1a68f27272b24775b2040acdb474cdd3.
>
> Hi Chris,
>
> what are the steps to reproduce this oops?
>
> Does it reproduce all the time or is it intermittent?
>
> Can you send a dmesg with amdgpu.dc_log=1, in addition to drm.debug=0xe?
>
> Thanks,
> Harry
>

I did nothing special to reproduce the oops. I just boot, and sometimes
the screen stays blank while the system still responds to magic SysRq.
I then rebooted and collected the journal log.

It's intermittent: I ran into it 2 times in 13 reboots.
The logs are here:
https://gist.github.com/mschiu77/9307d1ca0acd046cc6817f8cad63d79c
https://gist.github.com/mschiu77/fa81110f93428721f017cb9fbfd06fbe
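
For reference when reading these logs: drm.debug is a bitmask, so 0xe
selects only some message categories. Below is a minimal sketch that
decodes a mask, assuming the category bits documented for the drm.debug
module parameter in 4.15-era kernels. The table and the helper name
decode_drm_debug are illustrative, not taken from the kernel sources.

```python
# Category bits for drm.debug, assumed from the 4.15-era parameter
# documentation; later kernels may define additional bits.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the categories enabled by a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


# 0xe is the value used in this thread.
enabled = decode_drm_debug(0xE)
print(enabled)
```

Under that assumption, 0xe covers the DRIVER, KMS and PRIME categories
and leaves CORE, ATOMIC and VBL messages disabled.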

One more log here. It entered X fine, but after a few minutes the
display went black with only a mouse cursor left, and the cursor could
not move. So I did a SysRq reboot again.
The last errors are
""
[  636.312759] endless kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:41:crtc-0] flip_done timed out
[  646.552344] endless kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:crtc-0] flip_done timed out
""
Full log here: https://gist.github.com/mschiu77/c8696e5fefb17bb1c53598214fb4e382
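
To pull such errors out of a long journal log, a small filter can help.
This is a minimal sketch, assuming every log entry is on a single
unwrapped line in the same "[timestamp] host kernel: [drm:function
[module]] *ERROR* message" shape as the two entries quoted above. The
regular expression and the helper name find_drm_errors are illustrative
and not part of any existing tool.

```python
import re

# Matches one unwrapped journal line in the shape quoted above.
# The "[module]" part after the function name is treated as optional.
DRM_ERROR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\S+\s+kernel:\s+"
    r"\[drm:(?P<func>\w+)(?:\s+\[(?P<module>\w+)\])?\]\s+"
    r"\*ERROR\*\s+(?P<msg>.*)"
)


def find_drm_errors(lines):
    """Yield (timestamp, function, message) for each DRM *ERROR* line."""
    for line in lines:
        match = DRM_ERROR.search(line)
        if match:
            yield (
                float(match.group("ts")),
                match.group("func"),
                match.group("msg").strip(),
            )


# The two entries from this mail, joined back into single lines.
sample = [
    "[  636.312759] endless kernel: "
    "[drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] "
    "*ERROR* [CRTC:41:crtc-0] flip_done timed out",
    "[  646.552344] endless kernel: "
    "[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] "
    "*ERROR* [CRTC:41:crtc-0] flip_done timed out",
]

errors = list(find_drm_errors(sample))
for ts, func, msg in errors:
    print(f"{ts:12.6f}  {func}: {msg}")
```

To run it over a saved log, pass an open file object instead of the
sample list, for example find_drm_errors(open("journal.txt")), where
journal.txt is a placeholder file name.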

I could log in to X only 4 times; the rest ended in a blank screen or a
hang that did not respond to magic SysRq. I took a picture of the only
panic, although I think it's not related to amdgpu.
It's here:
https://pasteboard.co/H5CUvxk.jpg

Hope they can be helpful.

Chris

>> However, I still get stuck on the following messages during boot very
>> often
>> ""
>> [    4.998241] endless kernel: [drm] amdgpu kernel modesetting enabled.
>> [    4.998288] endless kernel: checking generic (e0000000 7f0000) vs
>> hw (e0000000 10000000)
>> [    4.998289] endless kernel: fb: switching to amdgpudrmfb from EFI VGA
>> ""
>> I turned on drm.debug=0xe while booting, but no more information at this point.
>> Anything I can do at this point?
>>
>>     One more piece of information that may be helpful: sometimes the
>> system boots OK but with a blank screen. I can't switch to a virtual
>> console, but it does respond to the magic SysRq key. The dmesg with
>> drm.debug=0xe is here:
>> https://gist.github.com/mschiu77/291e47b1f07dc52be9461c55c820464c
>>
>>     I'm pretty sure it's due to the amdgpu driver. Because when I boot
>> with my own kernel which disables the amdgpu driver, all these
>> symptoms went away. Please suggest anything I can do for this. Thanks
>>
>> Chris
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>

