[RFC PATCH 00/37] Modesetting for atomic modesetting

Tue Mar 24 15:49:22 PDT 2015

Hi,
Yikes, I think we're talking past each other a bit. So I thought a v2
might help.

On 24 March 2015 at 08:55, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Mon, Mar 23, 2015 at 04:58:47PM +0000, Daniel Stone wrote:
>> On 23 March 2015 at 08:20, Daniel Vetter <daniel at ffwll.ch> wrote:
>> > Ok this is quite a different beast than what I expected. I think
>> > it's way too intrusive for drivers to land quickly, and there's a big
>> > dependency chain linking everything. I think we need something much simpler.
>>
>> Yeah, I'm uneasy with how invasive it's got: hence the proposal to
>> drop crtc->mode back into being inlined. I don't think we lose
>> anything from doing that (as crtc->state->mode still changes), and
>> that makes the diff much less scary. The constness changes no longer
>> become required, but I think are still pretty useful from the point of
>> view of setting (and enforcing) our existing expectations. I'd still
>> look to push those anyway, but at a much more sedate pace.

As promised, have pulled this out. I've split the series into two
parallel series, which barely intersect (i.e. the merge is trivial):
http://cgit.collabora.com/git/user/daniels/linux.git/log/?h=wip/drm-next/drm-constness
is the change to constify all crtc->mode users, without turning
crtc->mode itself const, or into a pointer. I think these are all
worthwhile cleanups in their own right, but as they're quite invasive,
we can push them separately and much more slowly.

http://cgit.collabora.com/git/user/daniels/linux.git/log/?h=wip/drm-next/drm-modes-ref
is the much less invasive[0] work to refcount modes, expose
CRTC/connector modes to userspace through properties, and allow
user-supplied blob properties. I think this is really quite safe, and
the interactions with drivers are pretty minimal.

For now, I don't intend to really talk much about drm-constness, since
it's neither interesting nor relevant. So this is just about
drm-modes-ref.

>> >>   - as far as possible, modes should be relatable to their source, e.g. if
>> >>     Plymouth pulls a particular mode from the encoder, and you pick up on that
>> >>     mode as part of current configuration during handover, you should be able
>> >>     to work backwards to where Plymouth sourced it, i.e. the encoder list
>> >
>> > With legacy setcrtc we already lose this information and thus far no one
>> > seems to have cared. And I don't see the use-case since simply comparing
>> > it to sources works well enough, in case you want to know where a mode is
>> > from.
>>
>> On the other hand, I wouldn't suggest SetCrtc to really be a model to
>> follow. I really don't mind SetCrtc breaking all these expectations
>> above, mind.
>
> Well setcrtc is bound to stay around, and userspace using it will also
> stick around for quite a bit I think. If we add new semantic guarantees
> and then userspace can't rely on them we bear all the cost without much
> benefit.

Right, if you're using SetCrtc then you can expect to lose, and that's
fine. But I don't think we should hobble new userspace forever to the
terrible interfaces we made previously.

>> Do we make any guarantee on connector->modes lifetimes? If it can't
>> change without a hotplug event, then we know the ID is valid until the
>> connector goes dark, at which point we have to drop any local cache
>> relating to it.
>>
>> Saying that userspace must read every single mode property back every
>> single time is pretty horrible from the point of view of requiring
>> more userspace->kernel->userspace->kernel->[...] trips to do discovery
>> every time, just because we decided not to work out a sensible
>> lifetime strategy to expose to userspace.
>
> Current kernels should send you a hotplug event, so with those that
> strategy works - you can cache until the next hotplug. But on older
> kernels (and iirc we've only fixed this very recently) sometimes hotplug
> events get lost. So there if someone asks for a full reprobe, you have to
> do the full dance.

Right, but this isn't in any kernels yet, so we don't exactly need to
worry about bad interactions with other parts of old kernels.
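
To make the "cache until the next hotplug" lifetime concrete, here is a
minimal userspace-side sketch. Everything named sketch_* is hypothetical
and not part of libdrm or the kernel; it assumes mode IDs stay valid for
a connector until a hotplug event is delivered, which is the guarantee
being proposed rather than one that exists today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_MODES 8

/* Hypothetical per-connector cache of mode object IDs. */
struct sketch_mode_cache {
	bool valid;
	uint32_t connector_id;
	uint32_t mode_ids[SKETCH_MAX_MODES];
	unsigned int count;
};

/* Populate the cache after a full probe of the connector. */
static void sketch_cache_fill(struct sketch_mode_cache *cache,
			      uint32_t connector_id,
			      const uint32_t *ids, unsigned int count)
{
	unsigned int i;

	assert(count <= SKETCH_MAX_MODES);
	cache->connector_id = connector_id;
	cache->count = count;
	for (i = 0; i < count; i++)
		cache->mode_ids[i] = ids[i];
	cache->valid = true;
}

/* On hotplug, every cached ID must be treated as dead. */
static void sketch_cache_on_hotplug(struct sketch_mode_cache *cache)
{
	cache->valid = false;
	cache->count = 0;
}

/* True only if the ID was seen since the last hotplug. */
static bool sketch_cache_lookup(const struct sketch_mode_cache *cache,
				uint32_t mode_id)
{
	unsigned int i;

	if (!cache->valid)
		return false;
	for (i = 0; i < cache->count; i++)
		if (cache->mode_ids[i] == mode_id)
			return true;
	return false;
}
```

A lookup miss after a hotplug simply means "re-probe the connector",
which is the only point where the extra round trips are paid.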

>> > This blew up with edid blob properties where SNA had one clever trick too
>> > many and thought that matching edid blob prop id means the edid is
>> > unchanged. But since we remove the old blob before we add the new one
>> > you're pretty much guaranteed to reuse the same slot.
>>
>> Sure, if you're not paying attention to the defined lifetime, then
>> don't do any caching. But this is about _defining_ such a usable
>> lifetime that userspace can rely on.
>
> Hm let's backtrack a bit: What's the upside of this defined lifetime? Atm
> I see a lot of cost on the kernel side (getting this series in will take a
> few releases probably because there's so many interlocked subsystem-wide
> changes). But I don't see a benefit.

I think the current drm-modes-ref branch should take care of the cost
argument. The upside is that it gets much easier to debug and trace
command streams (i.e. it's pretty easy to remember mode ID 34; much
more tedious to dump out the entire modeline every time), and that
it's much more difficult to screw up in incredibly subtle ways (two
independent mode properties which have to be updated together at all
times, including in error paths? what could possibly go wrong ...).

> The only place we currently read out the current configuration is when
> the compositor starts up so that it can perfectly take over whatever's
> been set up by the firmware/boot splash for fastboot. Afterwards everyone
> just bashes in their own config, even when switching between compositors
> and all that.

They bash in their own config, because right now it's about all you
can do. As we're giving userspace an interface that allows sensible
handovers and incremental updates, why not use it ... ? That we do
full mode reprogramming when moving from splash screen to compositor
is a bug, not a great piece of design. If we can (even subtly)
encourage userspace to be better and smarter, then I don't see why we
wouldn't.

It also makes it a nightmare to debug the userspace side of things,
because when you just have to smash the entire config in all the time,
it becomes much more difficult to separate the intended changes out
from the unintended ones. Which means people are more likely to throw
their hands up and just blame the kernel.

> Without a clear use-case that justifies the work I don't think we should
> do this - someone will come up with clever abuse for it and then we have
> another hard-to-work-around regression at hand. So from my side the only
> requirement is that atomic clients should be able to somehow get at the
> current blob properties like mode, gamma ...

Which of the requirements are you worried about people abusing?

>> > I think we need much less: If your driver supports atomic (and hence
>> > userspace might be asking for the mode blob prop id) then that blob should
>> > survive as long as the mode is in use.
>>
>> Which not only requires the most invasive of the changes anyway
>> (modulo those to crtc->mode, which can regardless be dropped), but
>> also a bunch of 'every time the state changes, go consult some
>> auxiliary property and potentially expire its lifetime'.
>
> Well yeah, but for the mode property the only place it changes (besides
> the atomic ioctl) is in drm_atomic_helper_set_config.

You forgot intel_crtc_set_config. ;)

> We have a lot more
> compat glue already than the few lines you'd need to add in there. The
> crucial bit is that we only need to make this work for atomic drivers,
> since non-atomic drivers won't ever expose the MODE_ID property. And
> non-atomic userspace doesn't care anyway.

That's actually a fair point, and why I don't so much mind rowing back
on the crtc->mode changes. I'm happy to leave never-will-be-atomic
drivers in their own little ghetto. But it does make it harder and
more fiddly for people to implement atomic drivers: it's one more
subtle thing to screw up and get wrong, especially in error paths.
Whereas crtc_state->mode = drm_mode_reference(foo) is pretty hard to
get wrong.
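
As a rough illustration of why that pattern is hard to get wrong, here
is a standalone sketch of the reference/unreference pairing. The sketch_*
types and helpers are hypothetical stand-ins written for this mail, not
the code in the branch; in particular the plain int counter stands in
for whatever the kernel side would really use (a kref, presumably).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted display mode. */
struct sketch_mode {
	int refcount;	/* not atomic: illustration only */
	int hdisplay;
	int vdisplay;
};

/* Hypothetical stand-in for the CRTC state holding a mode reference. */
struct sketch_crtc_state {
	struct sketch_mode *mode;
};

/* Allocate a mode; the caller owns the initial reference. */
static struct sketch_mode *sketch_mode_create(int hdisplay, int vdisplay)
{
	struct sketch_mode *mode = calloc(1, sizeof(*mode));

	if (!mode)
		return NULL;
	mode->refcount = 1;
	mode->hdisplay = hdisplay;
	mode->vdisplay = vdisplay;
	return mode;
}

/* Take a reference and hand the pointer back, so it can be assigned. */
static struct sketch_mode *sketch_mode_reference(struct sketch_mode *mode)
{
	assert(mode->refcount > 0);
	mode->refcount++;
	return mode;
}

/* Drop a reference; returns 1 if this was the last one and mode was freed. */
static int sketch_mode_unreference(struct sketch_mode *mode)
{
	assert(mode->refcount > 0);
	if (--mode->refcount == 0) {
		free(mode);
		return 1;
	}
	return 0;
}

/*
 * Swap the mode held by a state. The new reference is taken before the
 * old one is dropped, so setting the same mode twice is safe.
 */
static void sketch_state_set_mode(struct sketch_crtc_state *state,
				  struct sketch_mode *mode)
{
	struct sketch_mode *old = state->mode;

	state->mode = mode ? sketch_mode_reference(mode) : NULL;
	if (old)
		sketch_mode_unreference(old);
}
```

The point is that the state owns exactly one reference to whatever it
points at, so an error path only ever has to drop the state, not chase a
second object that has to be kept in sync.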

> Also the important part here isn't so much the mode (userspace can get at
> that), but other new blob properties like per-plane gamma or other color
> corrections tables.

Sure.

>> > Somewhat unordered, but here's what I think we need:
>> > - Subtyping blob properties is not needed, at least I can't think of a
>> >   use-case. It will result though in lots of duplicated code for
>> >   duplicated ref/unref functions and atomic prop handling.
>>
>> When you say 'subtyping', do you mean what I've done with getblob?
>
> Not just but yes. The other result is that we'd need to duplicate the
> property decoding code for each type in the atomic ioctl/set_property
> functions. And you need per-type reference/unreference functions.

At least as far as gamma and other LUTs are concerned, I agree.

>> > - Since we already have a getblob ioctl I think we should just extend the
>> >   existing drm_property_blob:
>>
>> That was actually my initial implementation, but didn't like the
>> resulting reference tie-up between drm_display_mode and
>> drm_property_blob (aka drm_mode_modeinfo). I couldn't figure out a
>> really clean way to do it that didn't provably fall down at some
>> point, so abandoned it and went for referencing drm_display_mode
>> instead.
>
> Why refcount drm_display_mode? That's the drm-internal representation, we
> can just keep that as-is everywhere. I think that's the main point of why
> you opted for this much more invasive refactoring, and I don't see why
> we'd need it. The added kref for drm_blob_property is imo really all the
> krefs I think we need.

I think it's now much less invasive. The reason I opted for doing it
like that is because it avoided duplicating the mode structure in such
a way that resulted in non-obvious breakage. I just don't like the
potential for people to break the synchronisation between the two.
(Which was part of the reason I ended up doing the crtc->mode
constness dance in the first place ...)

>> >   Plus ofc changing drm_property_create/destroy_blob into ref/unref
>> >   functions. And doing the same weak reference trick for idrs as we're
>> >   using for framebuffers now.

Indeed.

>> >   For mode properties the data contained would be struct drm_mode_modeinfo
>> >   (i.e. the ABI struct we already use for the setcrtc/getconnector ioctls,
>> >   not the internal one).
>>
>> Right, this is what's there: 25/37 is explicitly always returning
>> modeinfo, and the creation does as well. I don't see any benefit to
>> exposing the internal struct.
>
> Hm, maybe I need to explain my motivation for the blob prop stuff: I want
> to use it for gamma tables (not just the per-crtc one we have, but
> per-plane tables too) and other bigger properties. Using it for the mode
> by packing the existing drm_mode_modeinfo into a blob too was just a bit of
> an afterthought - for the mode we could as well just add properties for
> all the individual members, like we've done with all the other structures.
>
> But loading gamma tables with gamma-0 up to gamma-1024 is way too
> unwieldy, so that's why we imo really need blobs.

Right, that makes total sense. I don't think anything in here
precludes using this, except the precedent of refcounting the mode
rather than the blob property. The reason I did that is because we
already _have_ a drm_mode_object type for modes, and having two
properties tied together in a mutual deathgrip seemed like an
invitation for failure.

So for gamma, either we could take the same approach and generate
wrapper objects around the blobs - which is useful when you have other
representations which are internally widely used (e.g.
drm_display_mode vs. drm_mode_modeinfo), or just refcount the blobs
and let drivers deal with any conversion. My guess is that the latter
would be most useful, as there's nothing we can really standardise on
- and at least currently, gamma _only_ tends to come from userspace.
Not only do modes have an alternate representation which is already
useful enough to have its own object type and ID set, but they
basically all come from the kernel, with the drm_mode_modeinfo being a
conversion, rather than the other way around. Hence why I chose to
keep the current internal representation as the primary.
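
To show the direction of that conversion, here is a deliberately
trimmed-down sketch. Both structs are hypothetical and carry only a few
fields each; they are not the real drm_display_mode or
drm_mode_modeinfo layouts. The internal form stays the primary object
and the fixed-width userspace form is derived from it on demand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed internal representation (the primary object). */
struct sketch_display_mode {
	int clock;	/* kHz */
	int hdisplay;
	int vdisplay;
	char name[32];
};

/* Hypothetical, trimmed fixed-width layout as handed to userspace. */
struct sketch_modeinfo {
	uint32_t clock;
	uint16_t hdisplay;
	uint16_t vdisplay;
	char name[32];
};

/* Derive the userspace layout from the internal mode; never the reverse. */
static void sketch_mode_to_modeinfo(const struct sketch_display_mode *in,
				    struct sketch_modeinfo *out)
{
	/* Zero first so padding and the name tail never leak stale bytes. */
	memset(out, 0, sizeof(*out));
	out->clock = (uint32_t)in->clock;
	out->hdisplay = (uint16_t)in->hdisplay;
	out->vdisplay = (uint16_t)in->vdisplay;
	/* Leave the final byte as NUL so the name is always terminated. */
	strncpy(out->name, in->name, sizeof(out->name) - 1);
}
```

For gamma there is no equivalent internal form to derive from, which is
why just refcounting the blob itself looks like the better fit there.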

So I think a lot of the similarities on there are misleading when you
scratch the surface of it; again, I had started with that exact
approach when I came into this, but later revised it.

>> >   Getting the refcounting right for the atomic ioctl should be simple. For
>> >   the legacy ->set_config entrypoint we need to fix things up in the
>> >   helper when we update the mode, by manually releasing the old mode blob
>> >   and creating a new one (owned by the kernel to avoid leaks because old
>> >   userspace won't clean them up - we've had this bug with compat cursor
>> >   fbs just recently).
>>
>> To be honest I think the fb refcnt bug is pretty apropos here - by
>> keeping these awkwardly split out, independently reference-counted,
>> not sensibly linked to each other, and not even verifying the
>> correctness of the users (const), you're inviting another disaster
>> along the same lines. Especially as drivers start to move away from
>> the helpers and are expected to manage more state themselves.
>
> Which fb refcount bug? We've had piles of them in the past, simply because
> refcounting is apparently hard.

Yeah. So if we're introducing refcounting, why not make it as easy and
as clear as possible, rather than splitting your objects between one
that's refcounted and used for half of what you're doing, and another
that isn't refcounted and is used for the other half? We already know
that people screw this up, so why not make it as clear and as
difficult to get wrong as possible?

>> I agree the initial series was too invasive and there's value in
>> having a much more minimal series, so how about I split out the
>> crtc->mode constness (but not pointer - I don't think I have any
>> desire to ever push that through tbqh) changes into another series we
>> can track separately, and then see if we can get a much more minimal
>> patchset that doesn't make me shudder and think of previous
>> refcnt/state-consistency horrors, nor you of atomic conversions past,
>> together. I'm happy to pull in the other guys doing atomic conversions
>> and get some testing out of them to see if it'll work for them and how
>> well, but having it run on Tegra was about as painless as I could've
>> hoped for.
>
> Hm, how does a const crtc->mode as a non-pointer work? We do need to
> update this one in set_config. Or would you do a forced cast in these
> cases?
>
> But even for that I don't see why we'd need to change crtc->mode really to
> make the atomic mode a blob property.

I agree; I'd initially thought that was necessary, and it was only
when I'd got a reasonable way down the road that I'd realised it
wasn't necessary. By the time I'd done that, it seemed prudent to
finish it off and fix the drivers, if only to convince myself that it
wouldn't be breaking drivers or introducing bugs by doing all this.

I'm not sure we're going to convince each other at this point, so I'll
just skim over the core arguments and leave it at this:
  - this is no longer really invasive at all
  - modes aren't as similar to other blobs as they first seem (else
we'd just be using drm_mode_modeinfo with no mode-object type
everywhere)
  - refcounting modes provides for much easier debuggability and
traceability for both kernel and userspace, and especially across the
boundary where you may not be familiar with the other side
  - giving people another opportunity to screw things up is never a
good thing, especially when (both in refcounting and specifically in
modes) they've repeatedly proven that they do screw it up horrendously

If you have a good-faith look at the patchset (the git URL is on the
cgit page) and still want it re-done to add the blob properties on the
side instead, then I'll do that. But hopefully the much smaller series
(I've also squashed some of the more aggressively-split patches, on
the grounds that they're not that crucial for bisect) will convince
you ... ?

Cheers,
Daniel

[0]:
 drivers/gpu/drm/drm_atomic.c              |  35 +++-
 drivers/gpu/drm/drm_atomic_helper.c       |  69 +++++--
 drivers/gpu/drm/drm_crtc.c                | 325 ++++++++++++++++++++----------
 drivers/gpu/drm/drm_crtc_helper.c         |  23 ++-
 drivers/gpu/drm/drm_fb_helper.c           |  13 +-
 drivers/gpu/drm/drm_fops.c                |   7 +-
 drivers/gpu/drm/drm_ioctl.c               |   2 +
 drivers/gpu/drm/drm_modes.c               | 139 ++++++++++++-
 drivers/gpu/drm/i915/intel_atomic.c       |  13 +-
 drivers/gpu/drm/rcar-du/rcar_du_encoder.c |   2 +-
 drivers/gpu/drm/rcar-du/rcar_du_hdmienc.c |   2 +-
 drivers/gpu/drm/tegra/dc.c                |   7 +
 drivers/gpu/drm/tegra/drm.c               |   2 +-
 drivers/gpu/drm/tegra/dsi.c               |   7 +-
 drivers/gpu/drm/tegra/hdmi.c              |   3 +-
 drivers/gpu/drm/tegra/rgb.c               |   3 +-
 drivers/gpu/drm/tegra/sor.c               |   3 +-
 include/drm/drmP.h                        |   6 +
 include/drm/drm_crtc.h                    |  17 +-
 include/drm/drm_modes.h                   |  12 ++
 include/uapi/drm/drm.h                    |   2 +
 include/uapi/drm/drm_mode.h               |  22 ++
 22 files changed, 556 insertions(+), 158 deletions(-)