[PATCH] drm: Funnel drm logs to tracepoints

Wed Oct 16 15:55:35 UTC 2019

On Wed, Oct 16, 2019 at 03:23:45PM +0200, Thomas Zimmermann wrote:
> Hi
> 
> On 16.10.19 at 15:05, Pekka Paalanen wrote:
> > On Wed, 16 Oct 2019 00:35:39 +0200
> > Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> > 
> >> Yeah I don't think tuning the spam level will ever work. What we need
> >> is some external input (most likely from the user clicking the "my
> >> external screen doesn't work" button, or maybe the compositor
> >> realizing something that should work didn't, or some other thing that
> >> indicates trouble), and then retroactively capture all
> >> debug/informational messages leading up to doom.
> >>
> >> But without that external "houston we have a problem" input all the
> >> debug spam is really just spam and unwanted. Btw, even if we don't spam
> >> dmesg, if we enable too much we might simply have trouble with all the
> >> printk formatting work we do for nothing. So maybe we need something
> >> like trace_printk which iirc delays the formatting until the stuff
> >> actually gets read from the log buffer. Plus trace_printk might make
> >> it clear enough that it's not stable uapi ... so maybe we do want
> >> trace_printk in the end?
> >>
> >> Just not really looking forward to reimplementing half the tracing
> >> infrastructure just for this ...
> > 
> > Hi,
> > 
> > a thought about the UAPI:
> > 
> > Debugfs is not good because it's not supposed to be touched or even
> > present in production, right?
> 
> I'm running Tumbleweed, where debugfs is mounted by default for root. I
> could live with having the user mount debugfs to get the file's content.
> 
> > specifically be available in production. So a new file in some fs
> > somewhere it should be, and userspace in production can read it at will
> > to attach to a bug report.
> > 
> > Those semantics, "only use this content for attaching into a bug
> > report" should be made very clear in the UAPI.
> 
> Has this ever worked? As soon as a userspace program starts depending on
> the content of this file, it becomes kabi. From the incidents I know,
> Linus has always been quite strict about this. Even for broken interfaces.
> 

I think at this point I've convinced myself to spend the time to add proper
stable event traces to the atomic core + helpers (dp/self refresh/hdcp/etc).
It's going to be a pain (I really hate how much boilerplate is involved in
adding just one event), but I think there's enough interest that it'll be
worth it.

If it turns out to be useful, we can dig into some of the drivers as well.

Sean

> > I believe it has to be a ring buffer that is being continuously written
> > also during normal operations, so that we don't have to ask end users
> > to reproduce the issue again just to get some logs. Maybe the issue
> > happens once in a fortnight. The information must be extractable after
> > the fact, without before-hand preparations.
> 
> Agreed.
> 
> Best regards
> Thomas
> 
> > 
> > 
> > Thanks,
> > pq
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Managing Director: Felix Imendörffer
> 

-- 
Sean Paul, Software Engineer, Google / Chromium OS