[PATCH 0/3] drm/radeon kexec fixes
markus at trippelsdorf.de
Mon Sep 9 02:21:40 PDT 2013
On 2013.09.08 at 17:32 -0700, Eric W. Biederman wrote:
> Markus Trippelsdorf <markus at trippelsdorf.de> writes:
> > Here are a couple of patches that get kexec working with radeon devices.
> > I've tested this on my RS780.
> > Comments or flames are welcome.
> > Thanks.
> A couple of high level comments.
> This looks promising for the usual case.
> Removing the printk at the end of the kexec path seems a little dubious,
> what of other CPUs, interrupt handlers, etc. Basically establishing a
> new rule on when printk is allowed seems a little dubious at this point,
> even if it is a useful debugging trick.
OK. I will drop this patch. It doesn't seem to be necessary, because I
can no longer reproduce the printk-related hang.
> Having a clean shutdown of the radeon definitely seems worth doing,
> because the cases where we care about video are when a person is in
> front of the system.
Yes. But please note that even with radeon_pci_shutdown implemented, I
still get ring test failures on roughly every eighth kexec boot:
[drm:r600_dma_ring_test] *ERROR* radeon: ring 3 test failed (0xCAFEDEAD)
radeon 0000:01:05.0: disabling GPU acceleration
That's definitely better than the current state of affairs, with ring
test failures on every second boot. But I haven't figured out the reason
for these failures yet. Curiously, once a ring test has failed, it fails
again after every subsequent kexec, no matter how often I repeat it.
Only a full reboot brings the machine back to normal.
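For anyone not familiar with the ring test, the value in the error
message is informative. As far as I understand the test, the CPU seeds a
scratch location with 0xCAFEDEAD, queues a command asking the engine to
overwrite it with 0xDEADBEEF, and then polls. Seeing 0xCAFEDEAD in the
log therefore suggests that the engine never executed the write at all.
Below is a minimal userspace sketch of that pattern. It is not the
driver code: struct fake_engine and its helpers are hypothetical
stand-ins for the hardware, and the poll count is arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SENTINEL_BEFORE 0xCAFEDEADu /* seeded by the CPU before the test */
#define SENTINEL_AFTER  0xDEADBEEFu /* value the engine is asked to write */

/* Hypothetical stand-in for a DMA engine; not a radeon structure. */
struct fake_engine {
	bool responsive;   /* false models an engine left in a bad state */
	bool pending;      /* a write has been queued but not executed */
	uint32_t *target;  /* where the queued write should land */
	uint32_t value;    /* what the queued write should store */
};

/* Queue a single "write value to target" command. */
static void fake_engine_submit(struct fake_engine *e, uint32_t *target,
			       uint32_t value)
{
	e->target = target;
	e->value = value;
	e->pending = true;
}

/* Let the engine make progress; an unresponsive engine does nothing. */
static void fake_engine_tick(struct fake_engine *e)
{
	if (e->responsive && e->pending) {
		*e->target = e->value;
		e->pending = false;
	}
}

/*
 * Seed the scratch word, queue the write, then poll a bounded number of
 * times. Returns 0 on success and -1 on failure; *seen receives the last
 * value read, which is what an error message would report.
 */
static int ring_test(struct fake_engine *e, unsigned int max_polls,
		     uint32_t *seen)
{
	uint32_t scratch = SENTINEL_BEFORE;
	unsigned int i;

	fake_engine_submit(e, &scratch, SENTINEL_AFTER);
	for (i = 0; i < max_polls; i++) {
		fake_engine_tick(e);
		if (scratch == SENTINEL_AFTER)
			break;
	}

	/* Drop the queued command so no pointer to scratch outlives us. */
	e->pending = false;
	e->target = NULL;

	*seen = scratch;
	return scratch == SENTINEL_AFTER ? 0 : -1;
}
```

With responsive set to true, ring_test() returns 0 and reports
0xDEADBEEF. With responsive set to false it returns -1 and reports
0xCAFEDEAD, which matches the value in the log above.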
> I don't know if you want to remove the sanity checks. They seem cheap
> and safe regardless. Are they expensive or ineffective? Moreover if
> they work a reasonable amount of the time that means that the kexec on
> panic case (where we don't shut anything down) can actually use the
> video, and that in general the driver will be more robust. I don't
> expect anyone much cares as kexec on panic is mostly used to just write
> a core file to the network, or the local disk. But if it is easy to
> keep that case working most of the time, why not.
IIRC Alex said the sanity checks are expensive and that boot time could
be improved by dropping them. Maybe he can chime in?