[PATCH] drm/atomic: Convert a warning to dbg atomic printk

Michel Dänzer michel.daenzer at mailbox.org
Wed Apr 6 08:30:18 UTC 2022


On 2022-04-06 00:47, Zack Rusin wrote:
> On Tue, 2022-04-05 at 19:09 +0200, Michel Dänzer wrote:
>> On 2022-04-04 20:21, Zack Rusin wrote:
>>> From: Zack Rusin <zackr at vmware.com>
>>>
>>> By default each flip times out after 0.1 sec
>>
>> 10 * HZ is 10 seconds, not 0.1.
> 
> Yea, sorry, this entire commit message is not correct. I've sent out a
> very old diff with a log the best I could remember. I recall our
> conversation now and iirc we said that maybe an interface through drm
> atomic code to enable/disable this error is the way to go but after
> looking at this again I'm not sure. More below.
> 
>>> and a warning about the time out is added to the kernel log. The
>>> warning is
>>> harmless because there's another flip coming right after but it can
>>> quickly fill up
>>> the log, e.g. missing 2 flips every second over a 24 hour span will
>>> add about 172 thousand lines to the log.
>>
>> As we discussed before, while this might be true for the vmwgfx driver,
>> for other drivers this message indicates that either the GPU hung, or
>> something else went wrong spectacularly. As such, I think we do want to
>> see these messages by default for other drivers at least.
> 
> I'm not going to argue for or against that but I am curious what's the
> point of the message. The message is basically saying "something could
> possibly have gone very wrong". OK, what's next? Especially if there's
> no visible problems and it's not reproducible. Even if it would be
> reproducible there's nothing actionable from the message itself. If the
> system has no output connected and no users are currently logged in and
> we missed a flip, does it matter?

I don't think waiting for a 10 second timeout is the appropriate behaviour in that case. While a KMS CRTC is enabled, the driver needs to make it work, in the worst case via a timer which ticks at the CRTC refresh rate.
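
To make the "timer which ticks at the CRTC refresh rate" idea concrete, here is a rough userspace sketch of the period calculation such a fallback timer would need. It is not taken from any driver: the function name refresh_period_ns and the choice of millihertz as the input unit are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: given a refresh rate in
 * millihertz (e.g. 60000 for 60 Hz), return the interval in nanoseconds
 * at which a fallback timer would have to fire to tick once per frame.
 * The result is truncated towards zero.
 */
static uint64_t refresh_period_ns(uint64_t refresh_mhz)
{
	assert(refresh_mhz > 0);

	/* 1 s = 1e9 ns, and 1 Hz = 1e3 mHz, so period = 1e12 / mHz. */
	return UINT64_C(1000000000000) / refresh_mhz;
}
```

For a 60 Hz mode this gives a period of roughly 16.7 ms, i.e. several hundred frame times would pass before a 10 second timeout fires.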


>> I suspect this just papers over the real issue even with vmwgfx though.
> 
> This goes back to the above. I think you, me and Dave looked at the
> logs from those bugs and none of us seem to know what to do about it.
> Lacking some other error messages there seems to be nothing this error
> adds with vmwgfx or without (we have been adding more logging to vmwgfx
> so hopefully with newer kernels we could get some actionable errors but
> that's orthogonal to this).

The error means that either:

* A flip actually didn't complete in 10 seconds.
* There's some kind of time tracking issue which results in the timer firing after less than 10 seconds (of the system actually running).

Either way, it's an issue which should be fixed rather than just swept under the rug.
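
For reference, the arithmetic behind "10 * HZ is 10 seconds" can be sketched in plain userspace C. This is only an illustration of the unit conversion, not the kernel code; the names ticks_to_seconds and flip_timeout_ticks are made up for this example, and hz stands in for the kernel's HZ (ticks per second).

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only: convert a tick count to
 * whole seconds, given the tick rate hz (ticks per second).
 */
static unsigned long ticks_to_seconds(unsigned long ticks, unsigned long hz)
{
	assert(hz > 0);
	return ticks / hz;
}

/* A timeout expressed as 10 * hz ticks, mirroring "10 * HZ". */
static unsigned long flip_timeout_ticks(unsigned long hz)
{
	return 10 * hz;
}
```

Whatever tick rate is plugged in (100, 250, 1000, ...), ticks_to_seconds(flip_timeout_ticks(hz), hz) comes out as 10, which is why the timeout is 10 seconds rather than 0.1.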


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer

