[PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround
emil.l.velikov at gmail.com
Fri Jun 21 12:01:42 UTC 2019
On 2019/06/21, Daniel Vetter wrote:
> On Fri, Jun 21, 2019 at 1:50 PM Daniel Vetter <daniel at ffwll.ch> wrote:
> > On Fri, Jun 21, 2019 at 1:37 PM Christian König
> > <ckoenig.leichtzumerken at gmail.com> wrote:
> > >
> > > Am 21.06.19 um 13:03 schrieb Daniel Vetter:
> > > > On Fri, Jun 21, 2019 at 12:31 PM Koenig, Christian
> > > > <Christian.Koenig at amd.com> wrote:
> > > >> Am 21.06.19 um 12:20 schrieb Emil Velikov:
> > > >>> In particular, that user-space will "remove" render nodes.
> > > >> Yes, that is my main concern here. You basically make render nodes
> > > >> superfluous. That gives userspace every justification to also remove
> > > >> support for them. That is not stupid or evil or whatever, userspace
> > > >> would just follow the kernel design.
> > > > This already happened. At least for kms-only display socs we had to
> > > > hide the separate render node (and there you really have to deal with
> > > > the separate render node, because it's a distinct driver) entirely
> > > > within gbm/mesa.
> > >
> > > Ok, that is something I didn't know before and is rather interesting.
> > >
> > > > So if you want to deprecate render functionality on primary nodes, you
> > > > _have_ to do that hiding in userspace. Or you'll break a lot of
> > > > compositors everywhere. Just testing -amdgpu doesn't really prove
> > > > anything here. So you won't move the larger ecosystem with this at
> > > > all, that ship sailed.
> > >
> > > So what else can we do? That sounds like you suggest we should
> > > completely drop render node functionality.
> > >
> > > I mean it's not my preferred option, but certainly something that
> > > everybody could do.
> > >
> > > My primary concern is that we somehow need to get rid of things like GEM
> > > flink and all that other crufty stuff we still have around on the
> > > primary node (we should probably make a list of that).
> > >
> > > Switching everything over to render nodes just sounded like the best
> > > alternative so far to achieve that.
> > tbh I do like that plan too, but it's a lot more work. And I think to
> > have any push whatsoever we probably need to roll it out in gbm as a
> > hack to keep things going. But probably not enough.
> > I also think that at least some compositors will bother to do the
> > right thing, and actually bother with render nodes and all that
> > correctly. It's just that there's way more which don't.
> > Also for server rendering it'll be render nodes all the way down I
> > hope (or we need to seriously educate cloud people about the
> > permissions they set on their default images, and distros on how this
> > cloud stuff should work).
> > > > At that point this all feels like a bikeshed, and for a bikeshed I
> > > > don't care what the color is we pick, as long as they're all painted
> > > > the same.
> > > >
> > > > Once we picked a color it's a simple technical matter of how to roll
> > > > it out, using Kconfig options, or only enabling on new hw, or "merge
> > > > and fix the regressions as they come" or any of the other well proven
> > > > paths forward.
> > >
> > > Yeah, the problem is I don't see an option which currently works for
> > > everyone.
> > >
> > > I absolutely need a grace period of multiple years until we can apply this
> > > to amdgpu/radeon.
> > Yeah that's what I meant with "pick a color, pick a way to roll it
> > out". "enable for new hw, roll out years and years later" is one of
> > the options for roll out.
> > > And that is under the prerequisite that we have a plan to somehow enable that
> > > functionality now, to make it at least painful for userspace to rely on
> > > hacks around that.
> > >
> > > > So if you want to do this, please start with the mesa side work (as
> > > > the biggest userspace, not all of it) with patches to redirect all
> > > > primary node rendering to render nodes. The code is there already for
> > > > socs, just needs to be rolled out more. Or we go with the DRM_AUTH
> > > > horrors. Or a 3rd option, I really don't care which it is, as long as
> > > > it's consistent.
> > >
> > > How about this:
> > > 1. We keep DRM_AUTH around for amdgpu/radeon for now.
> > > 2. We add a Kconfig option to ignore DRM_AUTH, currently default to N.
> > > This also works as a workaround for old RADV.
> > > 3. We start to work on further removing old cruft from the primary node.
> > > Possibly the Kconfig option I suggested to disable GEM flink.
> > Hm I'm not worried about flink really. It's gem_open which is the
> > security gap, and that one has to stay master-only forever. I guess we
> > could try to disable that so people have to deal with dma-buf (and
> > once you have that render nodes become a lot easier to use). But then
> > I still think we have drivers which don't do dma-buf self-import, so
> > again we're stuck. So maybe first step is to just roll out a default
> > self-import dma-buf support for every gem driver. Then start ditching
> > flink/gem_open. Then start ditching render support on primary nodes.
> > Every step in the way taking a few years of prodding userspace plus
> > even more years to wait until it's all gone.
> I forgot one step here: I think we even still have drivers without
> render node support. As long as those exist (and are still relevant)
> then userspace needs primary node rendering + flink code anyway. And
> they're not going to be happy about us telling them to add more. So I
> think that would need to be fixed first. Hence rough plan:
> - Make sure all gem drivers with rendering have render nodes.
> - Roll out self-import of dma-buf for all gem drivers (we can do that
> with 0 driver support, it's like flink).
> - Roll out mesa gbm for everyone to ignore primary nodes for rendering
> as much as possible. Maybe needs more gbm work so that compositors can
> ask for the display drmfd and the render drmfd.
> - wait. like really long time :-/
> - slowly deprecate flink for new hw as additional forcing function to
> get people to move to dma-buf and render nodes
> - wait even longer
> - ditch rendering on primary nodes.
> Lots of work, and I really mean _lots_, but I think this has a chance
> of actually working.
Thanks for the extensive proposal/list Daniel. I mentioned a, perhaps
too short, version of this a while back.
The above sounds perfectly reasonable IMHO.