Regression: RX 470 fails to boot with amdgpu.dpm=1 on kernel 6.7+
Ozgur Kara
ozgur at goosey.org
Thu May 22 12:26:12 UTC 2025
Durmuş <dozaltay at gmail.com> wrote on Thu, 22 May 2025 at 15:15:
>
> I'm using dual monitors. I disconnected the HDMI to test with a single
> screen, but the result was the same. I also swapped the HDMI ports,
> but the issue still persisted.
> I'm not using DisplayPort — in fact, it's a bit weird: I convert VGA
> to HDMI and connect it to the graphics card. I'm not an expert of
> course, but since there were no issues on the LTS kernel and the
> problems started with kernels after 6.7, it made me think it might be
> a kernel issue.
> If needed, I’ll set dpm=0 when I install (i don't know when) Linux
> again and test it.
> If I remember correctly, when I added amdgpu.dc=0 to GRUB, nothing
> changed — the system still froze after GRUB.
>
Hello,
I suspect this is related to a recent patch rather than a general kernel
bug, so I am adding Aurabindo, because you may be affected by commit
cfb2d41831ee.
First of all, is there any chance you can revert this commit and test the
kernel?
$ git revert cfb2d41831ee
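Before reverting, it may be worth confirming that the commit is actually
in the tree you build from. A minimal sketch, assuming a git checkout of
the kernel source; has_commit and KERNEL_SRC are hypothetical names used
only for this example:

```shell
# Report whether a commit is contained in the checked-out tree.
# Usage: has_commit <repo-dir> <commit>
has_commit() {
    if git -C "$1" merge-base --is-ancestor "$2" HEAD 2>/dev/null; then
        echo "present"
    else
        echo "absent"
    fi
}

# KERNEL_SRC is an assumed variable pointing at your kernel checkout;
# it defaults to the current directory.
KERNEL_SRC="${KERNEL_SRC:-.}"
if [ "$(has_commit "$KERNEL_SRC" cfb2d41831ee)" = "present" ]; then
    echo "cfb2d41831ee is in this tree, so the revert applies"
else
    echo "cfb2d41831ee not found in $KERNEL_SRC, nothing to revert"
fi
```

If it reports the commit as absent, the freeze on that kernel probably has
a different cause and the revert will not help.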
After that commit, DMCUB ring calls became much more frequent and some
power states became unstable. I'm not an expert, but problems like this
usually involve the DMCUB firmware, power gating (GFXOFF), or ring buffer
access after a reset.
It may be that, with this commit, the vmin/vmax update call is now made
much more frequently, which could cause DMCUB to lose synchronization,
some power states to become unstable, or the firmware to crash.
We might need to look at the contents of
/sys/module/amdgpu/parameters/force_vmin_vmax_update, since the rate of
the vmin/vmax calls may be what triggers the error.
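To read the module parameters on a system that does boot (for example
with amdgpu.dpm=0), something like the following could work. This is only
a sketch: show_param and PARAM_DIR are hypothetical names, and I have not
confirmed that force_vmin_vmax_update is exposed on every kernel, so the
function reports when a parameter file is missing.

```shell
# Print one amdgpu module parameter, or a note if it is not exposed.
# PARAM_DIR is an assumed override for testing; the default is the sysfs
# directory where loaded-module parameters normally appear.
show_param() {
    dir="${PARAM_DIR:-/sys/module/amdgpu/parameters}"
    name="$1"
    if [ -r "$dir/$name" ]; then
        printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
    else
        printf '%s: not exposed by this kernel\n' "$name"
    fi
}

show_param dpm
show_param dc
show_param force_vmin_vmax_update
```

Including this output in the bug report, together with the kernel
version, would make it easier to compare a working and a broken boot.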
So I have added Aurabindo Pillai. I should have added you after three
different bug reports.
Regards
Ozgur
> On Thu, May 22, 2025 at 3:05 PM Ozgur Kara <ozgur at goosey.org> wrote:
> >
> > Durmuş <dozaltay at gmail.com> wrote on Thu, 22 May 2025 at 14:58:
> > >
> > > Hey, thanks for the reply, but I don't use Linux anymore, so I can't
> > > provide any logs or test it further. Also, FYI, this bug has been
> > > around since kernel v6.7. If I install Linux again soon, I'll try to
> > > test it. Could you please advise what I should do about amdgpu.dpm?
> > > Should it stay at 0 or be set to 1? When I try booting with 1, the PC
> > > freezes right after the grub screen. I've used Linux for 2-3 months
> > > but still don’t really know how to debug these kinds of errors
> > > properly. Thanks!
> > >
> >
> > Hello,
> >
> > No problem. Maybe we should discuss this separately, because the
> > kernel lists are busy with development patches and are not well
> > suited to this kind of conversation.
> > This could be a problem with the kernel, the GPU, or the many
> > firmware files and drivers that AMD ships.
> >
> > If it is hardware-based, especially GPU-related, the kernel does not
> > fully intervene at this point.
> > The system can boot with amdgpu.dpm=0, but that is not a proper fix,
> > and you did a very good job reporting it.
> > Adding amdgpu.dc=0 may disable the display core, but that would
> > prevent you from getting 144 Hz.
> >
> > We should make sure that the correct firmware is present under
> > /lib/firmware/amdgpu.
> > Did you use DisplayPort, and did you get 144 Hz output?
> >
> > $ journalctl -b -1 will show the log from the previous boot.
> > $ glxinfo | grep OpenGL may also reveal the problem or an error.
> >
> > Kernel developers and AMD developers should look into this issue,
> > but I think it is most likely a firmware problem on the AMD side, not
> > on the kernel side.
> >
> > Regards
> >
> > Ozgur
> >
> > > On Thu, May 22, 2025 at 2:52 PM Ozgur Kara <ozgur at goosey.org> wrote:
> > > >
> > > > Durmuş <dozaltay at gmail.com> wrote on Thu, 22 May 2025 at 14:27:
> > > > >
> > > > > Hello,
> > > > >
> > > >
> > > > Hello,
> > > >
> > > > Did you get a kernel message in dmesg, for example an error like this one?
> > > >
> > > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1106268
> > > >
> > > > The dmesg command will give you this output. journalctl output or
> > > > Mesa (glxinfo) output would also be sufficient, because we need to
> > > > know which upstream component is affected.
> > > > Thanks for the report.
> > > >
> > > > Note: because there are two similar reports, I added the relevant
> > > > upstream maintainers.
> > > >
> > > > Regards
> > > >
> > > > Ozgur
> > > >
> > > > > I'm experiencing a critical issue on my system with an AMD RX 470 GPU.
> > > > > When booting with recent kernel versions (6.7.x or newer), the system
> > > > > fails to boot properly unless I explicitly disable Dynamic Power
> > > > > Management (DPM) via the `amdgpu.dpm=0` kernel parameter.
> > > > >
> > > > > When DPM is enabled (`amdgpu.dpm=1` or omitted, since it's the
> > > > > default), the system either freezes during early boot or fails to
> > > > > initialize the display. However, using the LTS kernel (6.6.x),
> > > > > everything works as expected with DPM enabled.
> > > > >
> > > > > This seems to be a regression introduced in kernel 6.7 or later, and
> > > > > it specifically affects older GCN4 (Polaris) GPUs like the RX 470.
> > > > > Disabling DPM allows the system to boot, but significantly reduces GPU
> > > > > performance.
> > > > >
> > > > > Things I’ve tried:
> > > > > - Confirmed that the latest `linux-firmware` is installed.
> > > > > - Verified correct firmware files exist under `/lib/firmware/amdgpu/`.
> > > > > - Tested multiple kernels (mainline and LTS).
> > > > > - Using Mesa with ACO (Radeon open driver stack).
> > > > > - System boots fine with LTS kernel (6.6.x) + DPM enabled.
> > > > >
> > > > > System info:
> > > > > - GPU: AMD RX 470 (GCN 4 / Polaris)
> > > > > - Distro: Arch Linux
> > > > > - Kernel (working): linux-lts 6.6.x
> > > > > - Kernel (broken): 6.7.x and newer (currently tested on 6.14.6)
> > > > >
> > > > > Thanks in advance,
> > > > > Durmus Ozaltay
> > > > >
> > > > >
> > > > >
> > >
> > >
>
>