Regression with kernel 4.20 on armhf

Luís Mendes luis.p.mendes at gmail.com
Fri Dec 28 14:05:57 UTC 2018


Hi Alex,

First of all, happy holidays and a happy new year!

- Okay, so it looks like the driver is sometimes able to enter
graphical mode with the Polaris card, but most of the time it fails
before that with:
[   49.762704] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2, emitted seq=3

- This also happens sporadically, though less often, on the 4.17,
4.18 and 4.19 kernels, so it is not actually a regression but an
existing issue, which the patch
"drm/amdgpu/gfx_v8_0: Reorder the gfx, kiq and kcq ring tests
sequence" may solve. I tried to backport it to 4.20, but saw no
improvement. I still need to try the git version, or rc1.

- This hang happens after the console is displayed on the screen, but
before switching to graphical mode with X.

- However, once X has started, the driver is stable and can be used
for long periods.
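
For anyone comparing boots across kernel versions, here is a minimal sketch of how the timeout line quoted above could be pulled out of a saved dmesg log. It is only an illustration: the function name find_ring_timeouts and the "pending" field are hypothetical, and the pattern is assumed from the single log line in this mail, not from the amdgpu source. Whitespace, including a line break added by a mail client, is tolerated between fields.

```python
import re

# Pattern assumed from the one timeout line shown in this thread;
# other kernels may format the message differently.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:amdgpu_job_timedout \[amdgpu\]\]\s+"
    r"\*ERROR\*\s+ring\s+(?P<ring>\S+)\s+timeout,\s+"
    r"signaled\s+seq=(?P<signaled>\d+),\s+"
    r"emitted\s+seq=(?P<emitted>\d+)"
)


def find_ring_timeouts(log_text):
    """Return one dict per ring timeout message found in log_text."""
    results = []
    for match in TIMEOUT_RE.finditer(log_text):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        results.append({
            "timestamp": float(match.group("ts")),
            "ring": match.group("ring"),
            "signaled": signaled,
            "emitted": emitted,
            # Hypothetical helper field: sequence numbers emitted
            # but not yet signaled when the timeout fired.
            "pending": emitted - signaled,
        })
    return results


if __name__ == "__main__":
    import sys

    # Usage: dmesg > boot.log; python3 this_script.py boot.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for hit in find_ring_timeouts(fh.read()):
            print(hit)
```

Running this over logs from several boots would show how early the timeout fires on each kernel and whether it is always the gfx ring.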

Regards,
Luís Mendes

On Tue, Dec 18, 2018 at 11:16 PM Luís Mendes <luis.p.mendes at gmail.com> wrote:
>
> Hi Alex,
>
> I am already using drm_arch_can_wc_memory() set to false.
> I will try to bisect...
>
> Regards,
> Luís
>
> On Tue, Dec 18, 2018 at 7:03 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> >
> > On Tue, Dec 18, 2018 at 8:58 AM Luís Mendes <luis.p.mendes at gmail.com> wrote:
> > >
> > > Hi Christian,
> > >
> > > > I've been using a Sapphire RX 550 and a Sapphire RX 460 on a custom
> > > > armhf board that runs well with Linux 4.19.9 at least, but starting
> > > > with Linux kernel 4.20 I'm getting a GPU hang right after the console
> > > > is displayed, but before entering graphical mode, when starting an X
> > > > session.
> > > > I'm only reporting this now because a PCI commit for mvebu that also
> > > > went into linux-4.20 caused a kernel oops during the pci_map_rom
> > > > call in the amdgpu initialization code. I've reverted that
> > > > patch, but now amdgpu is hanging.
> >
> > It would be useful if you could bisect.  This is the first I've heard
> > of amdgpu working on an ARM board without write combining (WC)
> > disabled.  You might check to see if disabling WC helps.  Return false
> > in drm_arch_can_wc_memory().
> >
> > Alex
> >
> > >
> > >
> > > [   24.801861] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
> > > timeout, signaled seq=2, emitted seq=3
> > >
> > > 02:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> > > [AMD/ATI] Baffin [Polaris11] (rev ff) (prog-if 00 [VGA controller])
> > >     Subsystem: Sapphire Technology Limited Baffin [Radeon RX 560]
> > >     Flags: bus master, fast devsel, latency 0, IRQ 51
> > >     Memory at d0000000 (64-bit, prefetchable) [size=256M]
> > >     Memory at e0000000 (64-bit, prefetchable) [size=2M]
> > >     I/O ports at 10000 [size=256]
> > >     Memory at e0200000 (32-bit, non-prefetchable) [size=256K]
> > >     Expansion ROM at e0240000 [disabled] [size=128K]
> > >     Capabilities: <access denied>
> > >     Kernel driver in use: amdgpu
> > >     Kernel modules: amdgpu
> > >
> > > > The dmesg output is attached.
> > >
> > > Regards,
> > > Luís
> > > _______________________________________________
> > > amd-gfx mailing list
> > > amd-gfx at lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx