[PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

Daniel Vetter daniel at ffwll.ch
Fri Jun 21 11:59:16 UTC 2019

On Fri, Jun 21, 2019 at 1:50 PM Daniel Vetter <daniel at ffwll.ch> wrote:
> On Fri, Jun 21, 2019 at 1:37 PM Christian König
> <ckoenig.leichtzumerken at gmail.com> wrote:
> >
> > Am 21.06.19 um 13:03 schrieb Daniel Vetter:
> > > On Fri, Jun 21, 2019 at 12:31 PM Koenig, Christian
> > > <Christian.Koenig at amd.com> wrote:
> > >> Am 21.06.19 um 12:20 schrieb Emil Velikov:
> > >>> In particular, that user-space will "remove" render nodes.
> > >> Yes, that is my main concern here. You basically make render nodes
> > >> superfluous. That gives userspace all legitimization to also remove
> > >> support for them. That is not stupid or evil or whatever, userspace
> > >> would just follow the kernel design.
> > > This already happened. At least for kms-only display socs we had to
> > > hide the separate render node (and there you really have to deal with
> > > the separate render node, because it's a distinct driver) entirely
> > > within gbm/mesa.
> >
> > Ok, that is something I didn't know before and is rather interesting.
> >
> > > So if you want to deprecate render functionality on primary nodes, you
> > > _have_ to do that hiding in userspace. Or you'll break a lot of
> > > compositors everywhere. Just testing -amdgpu doesn't really prove
> > > anything here. So you won't move the larger ecosystem with this at
> > > all, that ship sailed.
> >
> > So what else can we do? That sounds like you suggest we should
> > completely drop render node functionality.
> >
> > I mean it's not my preferred option, but certainly something that
> > everybody could do.
> >
> > My primary concern is that we somehow need to get rid of things like GEM
> > flink and all that other crufty stuff we still have around on the
> > primary node (we should probably make a list of that).
> >
> > Switching everything over to render nodes just sounded like the best
> > alternative so far to achieve that.
>
> tbh I do like that plan too, but it's a lot more work. And I think to
> have any push whatsoever we probably need to roll it out in gbm as a
> hack to keep things going. But probably not enough.
>
> I also think that at least some compositors will bother to do the
> right thing, and actually bother with render nodes and all that
> correctly. It's just that there's way more which don't.
>
> Also for server rendering it'll be render nodes all the way down I
> hope (or we need to seriously educate cloud people about the
> permissions they set on their default images, and distros on how this
> cloud stuff should work).
>
> > > At that point this all feels like a bikeshed, and for a bikeshed I
> > > don't care what the color is we pick, as long as they're all painted
> > > the same.
> > >
> > > Once we picked a color it's a simple technical matter of how to roll
> > > it out, using Kconfig options, or only enabling on new hw, or "merge
> > > and fix the regressions as they come" or any of the other well proven
> > > paths forward.
> >
> > Yeah, the problem is I don't see an option which currently works for
> > everyone.
> >
> > I absolutely need a grace time of multiple years until we can apply this
> > to amdgpu/radeon.
>
> Yeah that's what I meant with "pick a color, pick a way to roll it
> out". "enable for new hw, roll out years and years later" is one of
> the options for roll out.
>
> > And that is under the prerequisite that we have a plan to somehow enable
> > that functionality now, to make it at least painful for userspace to rely
> > on hacks around that.
> >
> > > So if you want to do this, please start with the mesa side work (as
> > > the biggest userspace, not all of it) with patches to redirect all
> > > primary node rendering to render nodes. The code is there already for
> > > socs, just needs to be rolled out more. Or we go with the DRM_AUTH
> > > horrors. Or a 3rd option, I really don't care which it is, as long as
> > > it's consistent.
> >
> > How about this:
> > 1. We keep DRM_AUTH around for amdgpu/radeon for now.
> > 2. We add a Kconfig option to ignore DRM_AUTH, currently default to N.
> > This also works as a workaround for old RADV.
> > 3. We start to work on further removing old cruft from the primary node.
> > Possibly the Kconfig option I suggested to disable GEM flink.
>
> Hm I'm not worried about flink really. It's gem_open which is the
> security gap, and that one has to stay master-only forever. I guess we
> could try to disable that so people have to deal with dma-buf (and
> once you have that render nodes become a lot easier to use). But then
> I still think we have drivers which don't do dma-buf self-import, so
> again we're stuck. So maybe first step is to just roll out a default
> self-import dma-buf support for every gem driver. Then start ditching
> flink/gem_open. Then start ditching render support on primary nodes.
> Every step in the way taking a few years of prodding userspace plus
> even more years to wait until it's all gone.
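
To make the flink vs. dma-buf distinction above concrete: flink shares a
buffer through a global name that another client opens with gem_open, while
dma-buf shares it by handing over a file descriptor. The sketch below only
illustrates the fd-passing half, and it is a stand-in rather than DRM code:
a temporary file plays the role of the dma-buf, and the helper name
share_buffer_by_fd is made up for illustration. In a real client the fd
would come from the PRIME handle-to-fd ioctl (drmPrimeHandleToFD in libdrm).
It assumes Python 3.9 or newer on a Unix system.

```python
import os
import socket
import tempfile


def share_buffer_by_fd(payload: bytes) -> bytes:
    """Pass an open fd over a Unix socket and read the data through it.

    Hypothetical helper: the temporary file stands in for a dma-buf.
    """
    exporter, importer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with exporter, importer, tempfile.TemporaryFile() as buf:
        buf.write(payload)
        buf.flush()
        # Exporter side: send the fd itself (SCM_RIGHTS ancillary data).
        # Only the peer on this socket gets access; there is no global
        # name that some other client could guess and open.
        socket.send_fds(exporter, [b"buf"], [buf.fileno()])
        # Importer side: receives a new fd that refers to the same file.
        _msg, fds, _flags, _addr = socket.recv_fds(importer, 16, 1)
        received_fd = fds[0]
        try:
            # pread reads at an explicit offset, independent of file position.
            return os.pread(received_fd, len(payload), 0)
        finally:
            os.close(received_fd)
```

The point of the design is that access follows the fd: whoever is handed it
can import the buffer, and nobody else can.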

I forgot one step here: I think we even still have drivers without
render node support. As long as those exist (and are still relevant)
then userspace needs primary node rendering + flink code anyway. And
they're not going to be happy about us telling them to add more. So I
think that would need to be fixed first. Hence rough plan:

- Make sure all gem drivers with rendering have render nodes.
- Roll out self-import of dma-buf for all gem drivers (we can do that
  with 0 driver support, it's like flink).
- Roll out mesa gbm for everyone to ignore primary nodes for rendering
  as much as possible. Maybe needs more gbm work so that compositors can
  ask for the display drmfd and the render drmfd.
- wait. like really long time :-/
- slowly deprecate flink for new hw as additional forcing function to
  get people to move to dma-buf and render nodes
- wait even longer
- ditch rendering on primary nodes.

Lots of work, and I really mean _lots_, but I think this has a chance
of actually working.
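
As a rough illustration of the "display drmfd and render drmfd" step in the
plan above, here is a minimal sketch of how userspace could find the render
node that belongs to a given primary node. It relies on the usual Linux
naming (cardN for primary nodes, renderDN for render nodes) and on the sysfs
layout where both nodes of one device appear under <card>/device/drm/. The
function names node_kind and render_sibling are invented for this example,
and real code would more likely go through libdrm's device enumeration.

```python
import os
import re
from typing import Optional


def node_kind(name: str) -> Optional[str]:
    """Classify a DRM device node name as 'primary' or 'render'."""
    if re.fullmatch(r"card\d+", name):
        return "primary"
    if re.fullmatch(r"renderD\d+", name):
        return "render"
    return None  # connectors such as card0-HDMI-A-1, or anything else


def render_sibling(card: str, sysfs_root: str = "/sys/class/drm") -> Optional[str]:
    """Return the render node name for a primary node, or None.

    None covers both "no such card" and "driver without render node
    support", which is the case the first step of the plan is about.
    """
    drm_dir = os.path.join(sysfs_root, card, "device", "drm")
    try:
        entries = sorted(os.listdir(drm_dir))
    except OSError:
        return None
    for entry in entries:
        if node_kind(entry) == "render":
            return entry
    return None
```

A compositor following this approach would keep the primary node fd for
modesetting and open /dev/dri/<render node> for rendering, falling back to
the primary node only when render_sibling returns None.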

> Another option would be to extract the kms stuff from primary nodes,
> but we've tried that with control nodes. Until I realized a few years
> back that with control nodes it's impossible to get any rendered
> buffer in there at all, so useless, and I removed it. No one ever
> complained.
>
> So yeah would be really nice if we could fix this, but the universe
> conspires against us too much it seems. Hence the fallback of "please
> at least let's aim for a consistent color for this mess, whatever it
> is".
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

