[Intel-gfx] [PATCH v2 1/3] drm: Add support for panic message output

Wed Mar 13 17:33:52 UTC 2019

On 2019-03-13 5:16 p.m., Kazlauskas, Nicholas wrote:
> On 3/13/19 11:54 AM, Christian König wrote:
>> Am 13.03.19 um 16:38 schrieb Michel Dänzer:
>>> On 2019-03-13 2:37 p.m., Christian König wrote:
>>>> Am 13.03.19 um 14:31 schrieb Ville Syrjälä:
>>>>> On Wed, Mar 13, 2019 at 10:35:08AM +0100, Michel Dänzer wrote:
>>>>>> On 2019-03-12 6:15 p.m., Noralf Trønnes wrote:
>>>>>>> Den 12.03.2019 17.17, skrev Ville Syrjälä:
>>>>>>>> On Tue, Mar 12, 2019 at 11:47:04AM +0100, Michel Dänzer wrote:
>>>>>>>>> On 2019-03-11 6:42 p.m., Noralf Trønnes wrote:
>>>>>>>>>> This adds support for outputting kernel messages on panic().
>>>>>>>>>> A kernel message dumper is used to dump the log. The dumper 
>>>>>>>>>> iterates
>>>>>>>>>> over each DRM device and it's crtc's to find suitable 
>>>>>>>>>> framebuffers.
>>>>>>>>>>
>>>>>>>>>> All the other dumpers are run before this one except mtdoops.
>>>>>>>>>> Only atomic drivers are supported.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Noralf Trønnes <noralf at tronnes.org>
>>>>>>>>>> ---
>>>>>>>>>>    [...]
>>>>>>>>>>
>>>>>>>>>> diff --git a/include/drm/drm_framebuffer.h
>>>>>>>>>> b/include/drm/drm_framebuffer.h
>>>>>>>>>> index f0b34c977ec5..f3274798ecfe 100644
>>>>>>>>>> --- a/include/drm/drm_framebuffer.h
>>>>>>>>>> +++ b/include/drm/drm_framebuffer.h
>>>>>>>>>> @@ -94,6 +94,44 @@ struct drm_framebuffer_funcs {
>>>>>>>>>>                 struct drm_file *file_priv, unsigned flags,
>>>>>>>>>>                 unsigned color, struct drm_clip_rect *clips,
>>>>>>>>>>                 unsigned num_clips);
>>>>>>>>>> +
>>>>>>>>>> +    /**
>>>>>>>>>> +     * @panic_vmap:
>>>>>>>>>> +     *
>>>>>>>>>> +     * Optional callback for panic handling.
>>>>>>>>>> +     *
>>>>>>>>>> +     * For vmapping the selected framebuffer in a panic context.
>>>>>>>>>> Must
>>>>>>>>>> +     * be super careful about locking (only trylocking allowed).
>>>>>>>>>> +     *
>>>>>>>>>> +     * RETURNS:
>>>>>>>>>> +     *
>>>>>>>>>> +     * NULL if it didn't work out, otherwise an opaque cookie
>>>>>>>>>> which is
>>>>>>>>>> +     * passed to @panic_draw_xy. It can be anything: vmap area,
>>>>>>>>>> structure
>>>>>>>>>> +     * with more details, just a few flags, ...
>>>>>>>>>> +     */
>>>>>>>>>> +    void *(*panic_vmap)(struct drm_framebuffer *fb);
>>>>>>>>> FWIW, the panic_vmap hook cannot work in general with the
>>>>>>>>> amdgpu/radeon
>>>>>>>>> drivers:
>>>>>>>>>
>>>>>>>>> Framebuffers are normally tiled, writing to them with the CPU
>>>>>>>>> results in
>>>>>>>>> garbled output.
>>>>>>>>>
>>>>>>> In which case the driver needs to support the ->panic_draw_xy 
>>>>>>> callback,
>>>>>>> or maybe it's possible to make a generic helper for tiled buffers.
>>>>>> I'm afraid that won't help, at least not without porting big chunks of
>>>>>> https://gitlab.freedesktop.org/mesa/mesa/tree/master/src/amd/addrlib
>>>>>> into the kernel, none of which will be used for anything else.
>>>>>>
>>>>>>
>>>>>>>>> There would need to be a mechanism for switching scanout to a 
>>>>>>>>> linear,
>>>>>>>>> CPU accessible framebuffer.
>>>>>>>> I suppose panic_vmap() could just provide a linear temp buffer
>>>>>>>> to the panic handler, and panic_unmap() could copy the contents
>>>>>>>> over to the real fb.
>>>>>> Copy how? Using a GPU engine?
>>>>> CPU maybe? Though I suppose that won't work if the buffer isn't CPU
>>>>> accesible :/
>>>> Well we do have a debug path for accessing invisible memory with the 
>>>> CPU.
>>>>
>>>> E.g. three registers: DATA and auto increment OFFSET_LO/HI. So you can
>>>> just read/write DATA over and over again if you want to access some 
>>>> memory.
>>> Right. I assume that'll be very slow, but I guess it could do when the
>>> memory isn't directly CPU accessible.
>>
>> Just made a quick test and reading 33423360 bytes (4096x2040x4) using 
>> that interfaces takes about 13 seconds.
>>
>> IIRC we don't use the auto increment optimization yet, so that can 
>> probably be improved by a factor of 3 or more.

I'd assume only writes are needed, no reads.

>>>> But turning of tilling etc is still extremely tricky when the system is
>>>> already unstable.
>>> Maybe we could add a little hook to the display code, which just
>>> disables tiling for scanout and maybe disables non-primary planes, but
>>> doesn't touch anything else. Harry / Nicholas, does that seem feasible?
>>>
>>>
>>> I'm coming around from "this is never going to work" to "it might
>>> actually work" with our hardware...
>>
>> Yeah, agree. It's a bit tricky, but doable.
> 
> A "disable_tiling" hook or something along those lines could work for 
> display. It's a little bit non trivial when you want to start dealing 
> with locking and any active DRM commits, but we have a global lock 
> around all our hardware programming anyway that makes that easier to 
> deal with.

This code only runs when there's a kernel panic, so it doesn't have to
care about such things. :) In fact, it should rather avoid them and only
do the absolute minimum register writes needed to do the job.

> I think we can just re-commit and update the existing hardware state 
> with only the tiling info for every plane reset to off.

There's also a concern that non-primary planes might prevent the output
from being visible.

> For most buffers I don't think we'd have to really consider changing
> anything else here as long as you respect the current FB size and
> pitch.

Hmm, e.g. compression might also need to be disabled? Or maybe that's
implicitly off with no tiling?

-- 
Earthling Michel Dänzer               |              https://www.amd.com
Libre software enthusiast             |             Mesa and X developer