[PATCH v4 2/4] drm/panic: Add a drm panic handler

Tue Oct 10 13:05:20 UTC 2023

Hi

Am 10.10.23 um 11:33 schrieb Maxime Ripard:
[...]
>> We also have discussions about kexec/kdump support. Here we'd need to
>> retrieve the scanout address, forward it to the kexec kernel and put
>> simpledrm onto that framebuffer until the regular driver takes over.
> 
> Generically speaking, there's strictly no guarantee that the current
> scanout buffer is accessible by the CPU. It's not even a driver issue,
> it's a workload issue, so it will affect 100% of the times some users,
> and some 0% of the time.

And I'd be OK with that. It's all best effort anyway.

> 
>> An interface like get_scanout_buffer will be helpful for this use
>> case. So it makes sense to use it for the panic handler as well.
> 
> It won't be helpful because the vast majority of the ARM drivers (and
> thus the vast majority of the KMS drivers) won't be able to implement it
> properly and reliably.

The barrier for firmware-based drivers is low: a pre-configured display 
and mmap'able framebuffer memory. And it's assumed that a callback for 
kexec would attempt to configure hardware accordingly. If a system 
really cannot provide this, so be it.

> 
>>>> run the atomic_update() on it, and return this commit's framebuffer ?
>>>>
>>>> That way each driver have a better control on what the panic handler will
>>>> do.
>>>> It can even call directly its internal functions, to avoid the locks of the
>>>> drm generic functions, and make sure it will only change the format and base
>>>> address.
>>>> That's a bit more work for each driver, but should be more reliable I think.
>>>
>>> I don't think that better control there is a good idea, it's a path that
>>> won't get tested much so we'd be better off not allowing drivers to
>>> deviate too much from the "ideal" design.
>>>
>>> What I had in mind is something like:
>>>
>>>     - Add a panic hook in drm_mode_config_funcs, with a
>>>       drm_atomic_helper_panic helper;
>>>
>>>     - Provide an atomic_panic hook or something in drm_plane_helper_funcs;
>>>
>>>     - If they are set, we register the drm_panic handler;
>>>
>>>     - The handler will call drm_mode_config_funcs.panic, which will take
>>>       its prepared state, fill the framebuffer it allocated with the
>>>       penguin and backtrace, call drm_plane_helper_funcs.atomic_panic().
>>>
>>>     - The driver now updates the format and fb address.
>>>
>>>     - Halt and catch fire
>>>
>>> Does that make sense?
>>
>> Please see my other replies. I find this fragile, and unnecessary for cases
>> where there already is a working scanout buffer in place.
> 
> And please see my other replies. Depending on a working scanout buffer
> in place being accessible by the CPU is incredibly limiting. Ignoring it
> when I'm pointing that out won't get us to a solution acceptable for
> everyone.
> 
>> It's something a driver could implement internally to provide a
>> scanout buffer if none has been set up already.
> 
> Some drivers need extra resources when setting up a plane. VC4 for
> example need for every plane to allocate multiple internal SRAM buffers.
> Just allocating an extra framebuffer won't cut it.
> 
> This is tied to the state so far, so I guess we would need to allocate a
> new state. Oh, and if we have several drivers that need to allocate that
> extra framebuffer and state, we could make it part of the core or turn
> it into a helper?

I mentioned that even the simple drivers hold locks during the atomic 
commit. Some of the drivers use vmap/vunmap, which might make problems 
as well. It's used in the context of damage handling, which also makes 
no sense here. So the regular atomic helpers most likely won't work.

It sounds to me as if you're essentially asking for something like a 
flush or sync operation.

Stepping back from get_scanout_buffer for a moment and adopting your 
proposal from above, I can see something like this working:

   - the driver provides a panic callback in struct drm_driver

   - DRM registers a panic handler to invokes the callback

   - the panic callback has a scanout buffer from somewhere (currently 
active, prepared, from firmware, plain memory, etc)

   - we provide a helper that fills the scanout buffer with information

   - the driver then updates the hardware from/with the scanout buffer; 
details depend on the hardware

The final step is like a flush or commit operation. To give some 
examples: The simple drivers of this patchset probably don't have to do 
anything. Drivers with command/packet queues could attempt to send the 
necessary commands to the device.

Best regards
Thomas

> 
> Maxime

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature.asc
Type: application/pgp-signature
Size: 840 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20231010/430f5de2/attachment.sig>