[PATCH v4 2/4] drm/panic: Add a drm panic handler

Maxime Ripard mripard at kernel.org
Tue Oct 10 09:33:54 UTC 2023


On Tue, Oct 10, 2023 at 11:04:33AM +0200, Thomas Zimmermann wrote:
> Am 09.10.23 um 18:07 schrieb Maxime Ripard:
> > On Mon, Oct 09, 2023 at 04:05:19PM +0200, Jocelyn Falempe wrote:
> > > > > - I find it risky to completely reconfigure the hardware in a panic handler.
> > > > 
> > > > I would expect to only change the format and base address of the
> > > > framebuffer. I guess it can fail, but it doesn't seem that different to
> > > > the async plane update we already have and works well.
> > > > 
> > > In this case it can work, but by using generic drm api, it's hard to know
> > > what the driver will do.
> > 
> > We should document extensively what we expect drivers to do in those
> > hooks, and possibly call cant_sleep() in the framework function to have
> > some reporting at least.
> > 
> > > > > Also how many drivers would need this ?
> > > > > 
> > > > > Currently I was mostly considering x86 platform, so:
> > > > > 
> > > > > simpledrm/ast/mgag200 which works well with the get_scanout_buffer().
> > > > > 
> > > > > i915/amdgpu/nouveau, which are quite complex, and will need to do their own
> > > > > thing anyway.
> > > > 
> > > > I guess we're not entirely aligned there then. I would expect that
> > > > mechanism to work with any atomic KMS driver. You are right that i915,
> > > > amdgpu and nouveau are special enough that some extra internal plumbing
> > > > is going to be required, but I'd expect it to be easy to support with
> > > > any other driver for a memory-mapped device.
> > > > 
> > > > I guess what I'm trying to say is, even though it's totally fine that
> > > > you only support those drivers at first, supporting in vc4 for example
> > > > shouldn't require to rewrite the whole thing.
> > > 
> > > Would that work for you to put that in a drm_panic_helper.c,
> > > so that drivers can opt-in ?
> > > 
> > > So the driver can call a drm_panic_helper_prepare_commit() at
> > > initialization, and then in the get_scanout_buffer() function
> > 
> > If we have a full blown commit with a new framebuffer, why do we need
> > get_scanout_buffer? It should be either the framebuffer itself, or in
> > the plane state if you have a conversion.
> 
> We also have discussions about kexec/kdump support. Here we'd need to
> retrieve the scanout address, forward it to the kexec kernel and put
> simpledrm onto that framebuffer until the regular driver takes over.

Generically speaking, there's strictly no guarantee that the current
scanout buffer is accessible by the CPU. It's not even a driver issue,
it's a workload issue, so it will affect 100% of the times some users,
and some 0% of the time.

> An interface like get_scanout_buffer will be helpful for this use
> case. So it makes sense to use it for the panic handler as well.

It won't be helpful because the vast majority of the ARM drivers (and
thus the vast majority of the KMS drivers) won't be able to implement it
properly and reliably.

> > > run the atomic_update() on it, and return this commit's framebuffer ?
> > > 
> > > That way each driver have a better control on what the panic handler will
> > > do.
> > > It can even call directly its internal functions, to avoid the locks of the
> > > drm generic functions, and make sure it will only change the format and base
> > > address.
> > > That's a bit more work for each driver, but should be more reliable I think.
> > 
> > I don't think that better control there is a good idea, it's a path that
> > won't get tested much so we'd be better off not allowing drivers to
> > deviate too much from the "ideal" design.
> > 
> > What I had in mind is something like:
> > 
> >    - Add a panic hook in drm_mode_config_funcs, with a
> >      drm_atomic_helper_panic helper;
> > 
> >    - Provide an atomic_panic hook or something in drm_plane_helper_funcs;
> > 
> >    - If they are set, we register the drm_panic handler;
> > 
> >    - The handler will call drm_mode_config_funcs.panic, which will take
> >      its prepared state, fill the framebuffer it allocated with the
> >      penguin and backtrace, call drm_plane_helper_funcs.atomic_panic().
> > 
> >    - The driver now updates the format and fb address.
> > 
> >    - Halt and catch fire
> > 
> > Does that make sense?
> 
> Please see my other replies. I find this fragile, and unnecessary for cases
> where there already is a working scanout buffer in place.

And please see my other replies. Depending on a working scanout buffer
in place being accessible by the CPU is incredibly limiting. Ignoring it
when I'm pointing that out won't get us to a solution acceptable for
everyone.

> It's something a driver could implement internally to provide a
> scanout buffer if none has been set up already.

Some drivers need extra resources when setting up a plane. VC4 for
example need for every plane to allocate multiple internal SRAM buffers.
Just allocating an extra framebuffer won't cut it.

This is tied to the state so far, so I guess we would need to allocate a
new state. Oh, and if we have several drivers that need to allocate that
extra framebuffer and state, we could make it part of the core or turn
it into a helper?

Maxime
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20231010/d4755aee/attachment.sig>


More information about the dri-devel mailing list