[Intel-gfx] [PATCH v7 09/11] drm: uevent for connector status change

Daniel Vetter daniel at ffwll.ch
Mon May 20 16:11:07 UTC 2019


On Fri, May 17, 2019 at 01:08:24PM +0300, Pekka Paalanen wrote:
> On Thu, 16 May 2019 14:24:55 +0200
> Daniel Vetter <daniel at ffwll.ch> wrote:
> 
> > On Thu, May 16, 2019 at 11:22:11AM +0300, Pekka Paalanen wrote:
> > > On Wed, 15 May 2019 10:24:49 +0200
> > > Daniel Vetter <daniel at ffwll.ch> wrote:
> > >   
> > > > On Wed, May 15, 2019 at 10:37:31AM +0300, Pekka Paalanen wrote:  
> > > > > On Tue, 14 May 2019 16:34:01 +0200
> > > > > Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> > > > >     
> > > > > > On Tue, May 14, 2019 at 3:36 PM Pekka Paalanen <ppaalanen at gmail.com> wrote:    
> > > > > > >
> > > > > > > On Tue, 14 May 2019 13:02:09 +0200
> > > > > > > Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> > > > > > >      
> > > > > > > > On Tue, May 14, 2019 at 10:18 AM Ser, Simon <simon.ser at intel.com> wrote:      
> > > > > > > > >
> > > > > > > > > On Tue, 2019-05-14 at 11:02 +0300, Pekka Paalanen wrote:      
> > > > > 
> > > > > ...
> > > > >     
> > > > > > > > > > Hi Daniel,
> > > > > > > > > >
> > > > > > > > > > just to clarify the first case, specific to one very particular
> > > > > > > > > > property:
> > > > > > > > > >
> > > > > > > > > > With HDCP, there is a property that may change dynamically at runtime
> > > > > > > > > > (the undesired/desired/enabled tristate). Userspace must be notified
> > > > > > > > > > when it changes, I do not want userspace have to poll that property
> > > > > > > > > > with a timer.
> > > > > > > > > >
> > > > > > > > > > When that property alone changes, and userspace is prepared to handle
> > > > > > > > > > that property changing alone, it must not trigger a reprobe of the
> > > > > > > > > > connector. There is no reason to reprobe at that point AFAIU.
> > > > > > > > > >
> > > > > > > > > > How do you ensure that userspace can avoid triggering a reprobe with the
> > > > > > > > > > epoch approach or with any alternate uevent design?
> > > > > > > > > >
> > > > > > > > > > We need an event to userspace that indicates that re-reading the
> > > > > > > > > > properties is enough and reprobe of the connector is not necessary.
> > > > > > > > > > This is complementary to indicating to userspace that only some
> > > > > > > > > > connectors need to be reprobed instead of everything.      
> > > > > > > > >
> > > > > > > > > Can't you use the PROPERTY hint? If PROPERTY is the HDCP one, skip the
> > > > > > > > > reprobing. Would that work?      
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > yes, that would work, if it was acceptable to DRM upstream. The replies
> > > > > > > to Paul seemed to be going south so fast that I thought we wouldn't get
> > > > > > > any new uevent fields in favour of "epoch counters".
> > > > > > >      
> > > > > > > > Yes that's the idea, depending upon which property you get you know
> > > > > > > > it's a sink change (needs full reprobe) or something else like hdcp
> > > > > > > > state machinery update.      
> > > > > > >
> > > > > > > Right.
> > > > > > >      
> > > > > > > > Wrt avoiding the full reprobe for sink changes: I think we should
> > > > > > > > indeed decouple that from the per-connector event for sink changes.
> > > > > > > > That along is a good win already, since you know for which connector
> > > > > > > > you need to call drmGetConnector (which forces the reprobe). It would
> > > > > > > > be nice to only call drmGetConnectorCurrent (avoids the reprobe), but
> > > > > > > > historically speaking every time we tried to rely on this we ended up
> > > > > > > > regretting things.      
> > > > > > >
> > > > > > > What changed? This sounds very much what Paul suggested. Looking at it
> > > > > > > from userspace side:      
> > > > > > 
> > > > > > This sounds solid, some refinements below:
> > > > > >     
> > > > > > > HOTPLUG=1 CONNECTOR=xx PROPERTY=yy
> > > > > > >
> > > > > > > - If yy is "Content Protection", no need to drmModeGetConnector(), just
> > > > > > >   re-get the connector properties.
> > > > > > >
> > > > > > > - Kernel probably shouldn't bother sending this for properties where
> > > > > > >   re-probe could be necessary, and send the below instead.      
> > > > > > 
> > > > > > 
> > > > > > I think we should assert that the kernel can get the new property
> > > > > > values using drmModeGetConnectorCurrent for this case, i.e. the kernel
> > > > > > does not expect a full reprobe. I.e. upgrade your idea from "should"
> > > > > > to "must"    
> > > > > 
> > > > > Hi Daniel,
> > > > > 
> > > > > ok, that's good.
> > > > >     
> > > > > > Furthermore different property can indicate different kind of updates,
> > > > > > e.g. hdcp vs general sink change vs. whatever else might come in the
> > > > > > future.    
> > > > > 
> > > > > What do you mean by different kinds of updates?    
> > > > 
> > > > Atm we're discussing two:
> > > > 
> > > > - "Content Protection"
> > > > - "sink changed, but you don't need to reprobe" this would be quite a bit
> > > >   a catch all from the output detection. Paul thinks differently, but I'm
> > > >   not sold on splitting this up more, at least not right now. This would
> > > >   include connector status (and related things returned by drmGetConnector
> > > >   which currently aren't a property), EDID, the mst path id, that kind of
> > > >   stuff.
> > > > 
> > > > Ime once we have 2, there's more bound to come :-)  
> > > 
> > > Hi Daniel,
> > > 
> > > I don't understand what the "sink changed" thing could be, but sure,
> > > there can be more.  
> > 
> > So if you have a repeater (hdmi or dp) and you change the thing you plug
> > into that, then on the computer you don't get a full hotplug, because the
> > repeater was always connected. Instead you get a short pulse hotplug,
> > indicating that something with the sink has changed. This could be a
> > slightly adjusted EDID (e.g. different eld in the audio section), or
> > something else. That's what I mean with "sink changed". Similar thing can
> > happen if you unplug and then plug in something else real quick (usually
> > over suspend/resume), where connector->status stays the same, but the
> > actual thing plugged in is different. I think for hdmi this is just the
> > EDID, but we do parse a bunch of things out of the EDID that have further
> > effects (color space, max clock). With DP there's also dp aux stuff, e.g.
> > if you switch from a 2 lane to a 4 lane cable then with same screen more
> > modes can work.
> > 
> > Clearer?
> > 
> > I guess for the documentation for this new uapi we need to make an
> > exhaustive list of all the things that might have changed for a "sink
> > changed" event, whatever we actually agree on what that should look like.
> > Or the PROPERTY=0 fallback you mention below as a fallback idea.
> 
> Hi Daniel,
> 
> to me all that sounds like userspace would better do a probe and start
> from scratch with that one connector. Therefore it would fall into the
> 'HOTPLUG=1 CONNECTOR=xx' case, no PROPERTY.
> 
> I suppose I'm missing something?

Doing a full probe is hella expensive. Atm you always have to do this, but
we're talking about the brave new future where the kernel sucks less, and
the kernel would have done the expensive probing for you already.

> But also I don't mind, I have always expected there might be more
> properties whose change does not require a probe.
> 
> So, the kernel does sometimes do the probe on its own as well, right?
> Is that completely invisible to userspace, or could it stall some
> userspace operations that are not a probe of the same connector?

Major stall because the locking design isn't pretty. If the kernel is
probing right now even your GetConnectorCurrent or GetProperties (on a
connector) will stall for whatever long it takes to read the EDID. Or
whatever else the kernel is doing in the probe paths right now. We could
probably improve this, and make sure that at least GetConnectorCurrent and
GetProperties stop sucking. But needs some serious locking-fu to make that
work.

Aside: Just realized this is another important reason why we need to batch
up property updates. If we don't, then userspace will simply get held up
until the kernel is done anyway.

> I really think the design of the uAPI must start with how userspace is
> expected to react to the events. For that there are three cases:
> re-deiscover and probe everyting, re-probe one connector, re-read
> properties of one connector without probe. Userspace can then discover
> what exactly changed, just like it already does.

Yup, I think on this we all agree.

> > > > > Btw. I started thinking, maybe we should completely leave out the "If
> > > > > yy is "Content Protection"" and require the kernel to guarantee, that
> > > > > if PROPERTY is set, then drmModeGetConnector() (probing) must not be
> > > > > necessary based on this event alone.
> > > > > 
> > > > > Writing it down again:
> > > > > 
> > > > > HOTPLUG=1 CONNECTOR=xx PROPERTY=yy
> > > > > 
> > > > > - yy denotes which connector xx property changed.    
> > > > 
> > > > Maybe yy denotes which group of properties changed, and part of the uapi
> > > > is picking the canonical one. E.g. content protection might also gain more
> > > > properties in the future (there's patches, but the userspace won't be open
> > > > sourced). And for that case I don't think we should then send an even for
> > > > every single individual property, but just for the lead property.
> > > > 
> > > > Maybe we should change it to UPDATE_TYPE=<some-unique-string>, but it felt
> > > > better to use the property id we already have for this.  
> > > 
> > > Indeed, it is not really necessary to identify the exact property.
> > > 
> > > We could even just use PROPERTY=0 to indicate "something may have
> > > changed, you should re-get the properties, but no need to probe I
> > > promise".
> > > 
> > > Or like you said, whatever. I don't really care as long as the
> > > semantics are the same.
> > >   
> > > > > - Userspace does not need to do drmModeGetConnector(), it only needs to
> > > > >   drmModeObjectGetProperties() on the connector to receive the new
> > > > >   updated property values.    
> > > > 
> > > > drmModeGetConnector(Current) also supplies all the properties already.
> > > > This is special with connectors, since the predate the "properties on
> > > > everything" design. I'd just mention this function here, and ignore
> > > > drmModeObjectGetProperties.  
> > > 
> > > To avoid confusion, it would be best to mention all three functions
> > > then. It is very easy to forget about legacy uAPI like properties
> > > through GetConnector.
> > >   
> > > > > - Kernel must not send this event for changes that may require probing
> > > > >   for correct results, exceptional conditions (buggy hardware, etc.)
> > > > >   included. Instead, the kernel must send one of the below events.
> > > > > 
> > > > > Is there actually anything interesting that
> > > > > drmModeGetConnectorCurrent() could guaranteed correctly return that is
> > > > > not a property already? I'd probably leave this consideration out
> > > > > completely, and just say do one of the needs-probing events if anything
> > > > > there changed.    
> > > > 
> > > > That's why I'm suggesting the PROPERTY=<epoch_prop_id> would indicate all
> > > > sink related stuff, including the not-properperty-fied stuff is updated,
> > > > and will be reported correctly by GetConnectorCurrent.  
> > > 
> > > Just because GetConnectorCurrent returns the same properties as
> > > drmModeObjectGetProperties? Ok then. Personally I'm quite firmly in the
> > > land where drmModeObjectGetProperties exists, so don't really care
> > > about the legacy stuff.  
> > 
> > So from a quick skimming GetConnectorCurrent == GetProperties, except you
> > don't get the non-propertified stuff like mode list, ->status, and a few
> > other things we parse out from various sources. So for connectors you need
> > to use GetConnector/Current anyway I think, if you rely on GetProperties
> > only, you're missing stuff.
> 
> That was part of my question: the stuff I would be missing, does it
> matter?
> 
> Most of the stuff I would be missing are things that require a probe to
> be up-to-date, which means I will not be missing them, because I will
> be calling drmModeGetConnector instead (and not GetConnectorCurrent),
> because the kernel driver sent an uevent to tell me to do exactly that.

Yes currently that's true, because the kernel sucks. But it would be nice
if we could do the expensive probing all upfront, for most connectors, and
(at least eventually) block userspace's uevent handler a bit less.

> Is there anything in the set of connector parameters which a) do not
> require a probe to be up-to-date, and b) are not actual properties too?

Not yet. I'd like to make the kernel suck less though (and at least for
dp/hdmi we should be able to do that with few changes).

> If there are, we have two options: complicate userspace to use a third
> function to get some bits (drmModeGetConnectorCurrent), or make the
> kernel driver just send the uevent that makes userspace probe anyway.

See above, I want to get the expensive probe stuff out of the ueven as
much as possible. Reading an EDID usually means you miss a frame (~20ms
for your average hdmi edid, can be a lot more). Except if you bother with
thread your compositor, and then why does the kernel suck that much :-/

> > Agreed that we need to spell this all out.
> > 
> > > > > > > HOTPLUG=1 CONNECTOR=xx
> > > > > > >
> > > > > > > - Needs to drmModeGetConnector() on the one connector, no need to probe
> > > > > > >   others. Implies that one needs to re-get the connector properties as
> > > > > > >   well.      
> > > > > > 
> > > > > > Sounds good.
> > > > > >     
> > > > > > > HOTPLUG=1
> > > > > > >
> > > > > > > - Need to do drmM odeGetResouces() to discover new/disappeared
> > > > > > >   connectors, and need to drmModeGetConnector to re-probe every
> > > > > > >   connector. (As always.)      
> 
> ...
> 
> > > > > > > --------
> > > > > > >
> > > > > > > When discussing this in IRC, I had the concern about how uevents are
> > > > > > > delivered in userspace. Is there a possibility that they might be
> > > > > > > overwritten, contain stale attributes, or get squashed together?
> > > > > > >
> > > > > > > Particularly if a display server is current on the VT and active and
> > > > > > > monitoring udev, but stuck doing something and cannot service uevents
> > > > > > > very fast, and the kernel sends more than one event before the process
> > > > > > > gets back to dispatching. The terminology in libudev API confused me as
> > > > > > > an event is a device. Squashing together would make sense if the
> > > > > > > uevent were just updating a device attribute list. Previously when we
> > > > > > > had just a single kind of uevent, that would not have made a
> > > > > > > difference, but if we gain different kinds of uevents like here, it
> > > > > > > starts to matter.
> > > > > > >
> > > > > > > However, Paul came to the conclusion that we will be ok as long as the
> > > > > > > events come via netlink.      
> > > > > > 
> > > > > > Yeah netlink shouldn't drop events on the floor I think. It might
> > > > > > still happen, but then I think you should get an indication of that
> > > > > > error, and you just treat it as a general hotplug event like on older
> > > > > > kernels.    
> > > > > 
> > > > > Alright, although reading Paul it sounds like there is another
> > > > > (fallback?) method as well that wouldn't work. Should userspace worry
> > > > > about that?
> > > > > 
> > > > > Hmm, get an indication of an error... I don't know how that would be
> > > > > presented in libudev API and I can't point to any code in Weston that
> > > > > would deal with it. Does anyone have a clue about that?
> > > > > 
> > > > > Userspace cannot really start taking advantage of any new fine-grained
> > > > > hotplug events until it can rely on the event delivery. Granted, this
> > > > > seems purely a userspace issue, but I bet it could be formulated as a
> > > > > kernel regression: things stop working after upgrading the kernel while
> > > > > having always used new userspace which was ready for detailed hotplug
> > > > > events but didn't ensure the delivery in userspace.    
> > > > 
> > > > You have this already (if it's really an issue with netlink reliability,
> > > > tbh no idea), you can already miss a global uevent. It's easier to catch
> > > > up if you do miss it, since you're forcing a reprobe on everything. That's
> > > > why I think the EPOCH thing would be good, userspace could be defensive
> > > > and always call GetConnectorCurrent on all connectors if it gets any
> > > > hotplug uevent, and if it gets an EPOCH change, force a reprobe. But I'm
> > > > not sure that's really required (aside from VT switching).  
> > > 
> > > No, my concern is not an issue with netlink reliability. It is a
> > > potential issue when userspace chooses to not use netlink, and uses
> > > something else instead. I'm not sure what that else is but Paul says
> > > there is code in libudev and that is completely outside the control of
> > > KMS apps like display servers.  
> > 
> > afaik this other path only exists because it's the older one, for uapi
> > backwards compatibility with older userspace. Shouldn't be used for
> > anything.
> 
> "Shouldn't be used" and someone screaming "kernel regression"... are you
> sure that path won't matter?
> 
> Like some home-brewn distribution that happens to configure their
> libudev and kernel to use the old method, uses already new userspace,
> and then upgrades the kernel that starts sending fine-grained hotplug
> events, resulting the display server randomly handling hotplug wrong.
> 
> Reading Airlie's recent rant about kernel regression handling make this
> a scary scenario where you would have no other choice than to rip all
> the fine-grained uevents out again.
> 
> Is there any difference in the kernel code between the old method and
> the netlink method? Would it be possible to send fine-grained hotplug
> events only through netlink, and fall back to the old 'HOTPLUG=1' for
> the old method?

There's a lot of grey in kernel regressions, and for fringe setups used by
few people I wouldn't worry about this. If they expect their shit to keep
working when using new stuff and crappy old interfaces, they get to keep
all the pieces.

Dave's recent rant was a bit special, since userspace is clearly smoking
some strong stuff (-modesetting's atomic is seriously not using atomic
correctly), but it was also affecting too many people, and changing the
boot setup meant you'd get a black screen on boot-up already. Instead of
just on the first modeset with more than 1 screen.

There's also a fairly easy fix for that -modesetting issue: We don't
expose atomic if the compositor has a process name of "Xserver". Brutal,
but gets the job done. Once X is fixed, we can give a new "I'm not totally
broken anymore" interface to get back at atomic.

tldr; I'm not worried at all, at least not more than with anything uapi.
Very rarely we'll have regrets.

> > > Can you explain how one could miss a global hotplug event when
> > > userspace is using netlink to watch for events? I thought the netlink
> > > path through libudev was reliable. And how does userspace realize it
> > > missed an event?  
> > 
> > I thought netlink is supposed to be reliable, but then if you send
> > bazillion of events and userspace is stuck, eventually you will run out of
> > memory. I have no idea how netlink signals that, and how that's forwarded
> > or not in libudev. Also not sure whether we should actually care about
> > this.
> 
> Ok. Let's try to keep the number of events low then, e.g. in the case of
> 
> 	HOTPLUG=1 CONNECTOR=xx PROPERTY=yy
> 
> revert to sending just a single uevent with PROPERTY set to 0 instead
> if there are more than one such no-probe property changes for the same
> connector.

Yeah PROPERTY=0 would be another option. I think indicates clearly "don't
reprobe" but also "update not for any specific prop, but for all of
them".

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the Intel-gfx mailing list