[PATCH v4 2/4] drm/panic: Add a drm panic handler

Tue Oct 10 13:32:37 UTC 2023

On 10/10/2023 15:05, Thomas Zimmermann wrote:
> Hi
> 
> Am 10.10.23 um 11:33 schrieb Maxime Ripard:
> [...]
>>> We also have discussions about kexec/kdump support. Here we'd need to
>>> retrieve the scanout address, forward it to the kexec kernel and put
>>> simpledrm onto that framebuffer until the regular driver takes over.
>>
>> Generically speaking, there's strictly no guarantee that the current
>> scanout buffer is accessible by the CPU. It's not even a driver issue,
>> it's a workload issue, so it will affect 100% of the times some users,
>> and some 0% of the time.
> 
> And I'd be OK with that. It's all best effort anyway.
> 
>>
>>> An interface like get_scanout_buffer will be helpful for this use
>>> case. So it makes sense to use it for the panic handler as well.
>>
>> It won't be helpful because the vast majority of the ARM drivers (and
>> thus the vast majority of the KMS drivers) won't be able to implement it
>> properly and reliably.
> 
> The barrier for firmware-based drivers is low: a pre-configured display 
> and mmap'able framebuffer memory. And it's assumed that a callback for 
> kexec would attempt to configure hardware accordingly. If a system 
> really cannot provide this, so be it.
> 
>>
>>>>> run the atomic_update() on it, and return this commit's framebuffer ?
>>>>>
>>>>> That way each driver have a better control on what the panic 
>>>>> handler will
>>>>> do.
>>>>> It can even call directly its internal functions, to avoid the 
>>>>> locks of the
>>>>> drm generic functions, and make sure it will only change the format 
>>>>> and base
>>>>> address.
>>>>> That's a bit more work for each driver, but should be more reliable 
>>>>> I think.
>>>>
>>>> I don't think that better control there is a good idea, it's a path 
>>>> that
>>>> won't get tested much so we'd be better off not allowing drivers to
>>>> deviate too much from the "ideal" design.
>>>>
>>>> What I had in mind is something like:
>>>>
>>>>     - Add a panic hook in drm_mode_config_funcs, with a
>>>>       drm_atomic_helper_panic helper;
>>>>
>>>>     - Provide an atomic_panic hook or something in 
>>>> drm_plane_helper_funcs;
>>>>
>>>>     - If they are set, we register the drm_panic handler;
>>>>
>>>>     - The handler will call drm_mode_config_funcs.panic, which will 
>>>> take
>>>>       its prepared state, fill the framebuffer it allocated with the
>>>>       penguin and backtrace, call 
>>>> drm_plane_helper_funcs.atomic_panic().
>>>>
>>>>     - The driver now updates the format and fb address.
>>>>
>>>>     - Halt and catch fire
>>>>
>>>> Does that make sense?
>>>
>>> Please see my other replies. I find this fragile, and unnecessary for 
>>> cases
>>> where there already is a working scanout buffer in place.
>>
>> And please see my other replies. Depending on a working scanout buffer
>> in place being accessible by the CPU is incredibly limiting. Ignoring it
>> when I'm pointing that out won't get us to a solution acceptable for
>> everyone.
>>
>>> It's something a driver could implement internally to provide a
>>> scanout buffer if none has been set up already.
>>
>> Some drivers need extra resources when setting up a plane. VC4 for
>> example need for every plane to allocate multiple internal SRAM buffers.
>> Just allocating an extra framebuffer won't cut it.
>>
>> This is tied to the state so far, so I guess we would need to allocate a
>> new state. Oh, and if we have several drivers that need to allocate that
>> extra framebuffer and state, we could make it part of the core or turn
>> it into a helper?
> 
> I mentioned that even the simple drivers hold locks during the atomic 
> commit. Some of the drivers use vmap/vunmap, which might make problems 
> as well. It's used in the context of damage handling, which also makes 
> no sense here. So the regular atomic helpers most likely won't work.
> 
> It sounds to me as if you're essentially asking for something like a 
> flush or sync operation.
> 
> Stepping back from get_scanout_buffer for a moment and adopting your 
> proposal from above, I can see something like this working:
> 
>    - the driver provides a panic callback in struct drm_driver
> 
>    - DRM registers a panic handler to invokes the callback
> 
>    - the panic callback has a scanout buffer from somewhere (currently 
> active, prepared, from firmware, plain memory, etc)
> 
>    - we provide a helper that fills the scanout buffer with information
> 
>    - the driver then updates the hardware from/with the scanout buffer; 
> details depend on the hardware
> 
> The final step is like a flush or commit operation. To give some 
> examples: The simple drivers of this patchset probably don't have to do 
> anything. Drivers with command/packet queues could attempt to send the 
> necessary commands to the device.

Yes that was the second question in the cover letter.
Some drivers probably want a flush_scanout_buffer() to send the 
framebuffer to the hardware.
That callback can be in struct drm_driver as well.

-- 

Jocelyn

> 
> Best regards
> Thomas
> 
>>
>> Maxime
>