[PATCH] drm/amdgpu: fix drm leases being broken on radv

Emil Velikov emil.l.velikov at gmail.com
Thu Apr 18 13:59:17 UTC 2019


On Wed, 17 Apr 2019 at 21:49, Dave Airlie <airlied at gmail.com> wrote:
>
> On Thu, 18 Apr 2019 at 03:30, Koenig, Christian
> <Christian.Koenig at amd.com> wrote:
> >
> > Am 17.04.19 um 17:51 schrieb Emil Velikov:
> > > Hi guys,
> > >
> > > On 2019/04/17, Daniel Vetter wrote:
> > >> On Wed, Apr 17, 2019 at 02:46:06PM +0200, Christian König wrote:
> > >>> Am 17.04.19 um 14:35 schrieb Daniel Vetter:
> > >>>> On Wed, Apr 17, 2019 at 12:06:32PM +0000, Koenig, Christian wrote:
> > >>>>> Am 17.04.19 um 14:00 schrieb Daniel Vetter:
> > >>>>>> On Wed, Apr 17, 2019 at 11:18:35AM +0000, Koenig, Christian wrote:
> > >>>>>>> Am 17.04.19 um 13:06 schrieb Daniel Vetter:
> > >>>>>>>> On Wed, Apr 17, 2019 at 12:29 PM Koenig, Christian
> > >>>>>>>> <Christian.Koenig at amd.com> wrote:
> > >>>>>>>> [SNIP]
> > >>>>>>> Well, what you guys did here is a serious no-go.
> > >>>>>> Not really understanding your reasons for an unconditional nak on all this
> > >>>>>> still.
> > >>>>> Well, let me summarize: You worked around a problem with libva by
> > >>>>> breaking another driver and now proposing a rather ugly and only halve
> > >>>>> backed workaround for that driver.
> > >>>>>
> > >>>>> This is a serious no-go. I've already send out a i915 specific change to
> > >>>>> remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW and revert
> > >>>>> the offending patch.
> > >>>> Oh I'm totally fine with the revert if that's what we go with. But that
> > >>>> still leaves the issue that it would make sense to drop DRM_AUTH from
> > >>>> DRIVER_RENDER capable drivers (not just for libva, but in general). And
> > >>>> there's fundamentally 2 options to do that:
> > >>>>
> > >>>> - This one here (or anything implementing the same idea), to keep radv
> > >>>>     working.
> > >>>>
> > >>>> - We drop DRM_AUTH from all drivers with DRIVER_RENDER except amdgpu. How
> > >>>>     exactly that's doen doesn't matter, but it leaves amdgpu out in the
> > >>>>     cold. It also doesn't matter whether we get there with revert + lots of
> > >>>>     patches, or a special hack for amdgpu, in the end amdgpu would be
> > >>>>     different. Also note that your patch isn't enough, since it doesn't fix
> > >>>>     the core ioctls.
> > >>>>
> > >>>> I think from an overall platform pov, the first is the better option.
> > >>>> After all we try really hard to avoid driver special cases for these kinds
> > >>>> of things.
> > >>> Well, I initially thought that radv is doing something unusual or broken,
> > >>> but after looking deeper into this that is actually not the case.
> > >>>
> > >>> What radv does is calling an IOCTL and expecting an error message when it is
> > >>> not authenticated.
> > >>>
> > >>> It looks like libdrm_amdgpu has switched to using DRM_IOCTL_GET_CLIENT to
> > >>> figure out if it is authenticated or not, but I clearly remember that we
> > >>> haven't done this from the beginning.
> > >> Maybe in the radoen code, or some earlier internal amdgpu libdrm code?
> > >> First public commit has all the bits: getauth, GetVersion, then the accel
> > >> query.
> > >>
> > >>> Thinking more about this I don't think that this problem is actually amdgpu
> > >>> specific. So I wouldn't drop DRM_AUTH from all render node drivers at all,
> > >>> but only from those who have the specific issue with libva.
> > >> libva was the initial motivation, the goal of Emil's patch wasn't to just
> > >> duct-tape over libva. That would be easier to fix in libva. The point was
> > >> that we should be able to allow this in general.
> > >>
> > >> And we can, except for the radv issue uncovered here.
> > >>
> > >> So please don't get 100% hung up on the libva thing, that wasn't really
> > >> the goal, just the initial motivation. And I still thinks it makes for all
> > >> drivers. So again we have:
> > >>
> > >> - radv hack
> > >> - make amdgpu behave different from everyone else
> > >> - keep tilting windmills about "pls use rendernodes, thx"
> > >>
> > >> I neither like tilting windmills nor making drivers behave different for
> > >> no reaason at all.
> > > Allow me to jump-in some emails down the line.
> > >
> > > First and foremost, sincere apologies for upsetting you Christian. If
> > > it's of any consolidation - let me assure you the goal is _not_ to break
> > > amdgpu or any other driver.
> > >
> > > Secondly, I would like to ask you for a list of projects so we can look and
> > > investigate properly. So far you've mentioned libdrm_amdgpu - I'll look into
> > > it.
> >
> > That is rather hard to come by, because you would also need to look at
> > all previously existing driver stacks.
> >
> > E.g. with the classic R300 driver for example.
> >
> > > On the topic of "working around libva" - sadly libva is _not_ the only
> > > offender. We also had bugs in mesa and kmscube.
> > >
> > > Considering those are taken as a prime example of "the right way", it's very
> > > likely for the mistakes to be repeated in many other projects.
> > >
> > > Where the common "fix" seems to be "run as root"...
> > >
> > >
> > > As Daniel pointed out, we could be fighting the windmills or we could have a
> > > small, admittedly ugly, workaround for amdgpu.
> > >
> > > If you don't like that workaround in the driver we could move it to core DRM.
> > >
> > > In either case, I would like to focus on a pragramic solution which works with
> > > both old and new userspace.
> >
> > Well, I seriously think the original committed solution could cause a
> > lot of problems and the issue with radv is only the tip of the iceberg.
> >
> > I mean we just have to ask our self why have we created render nodes in
> > the first place? The obvious alternative was to just removed the
> > authentication restrictions on the primary node which would have been
> > way less code and maintenance burden.
> >
> > I need to dig up the mailing list archive, but I strongly think that one
> > of the main arguments for this approach was to NOT break existing userspace.
> >
> > Even taking into account that we now don't need to deal with UMS and
> > really really old userspace drivers any more it still feels like a to
> > high risk going down that route.
>
> I'm going to revert the original patch in drm-next now. We've been
> getting a bit lax lately on the regressions need to get reverted no
> matter what rule, and we are getting close to rc6 and it's Easter, so
> I don't want to have to care about this, forget about it, remember it
> again, end up at 5.2.
>
Makes sense, thanks Dave.

Feel free to add the following to the revert:
Acked-by: Emil Velikov <emil.velikov at collabora.com>

> Personally I'm rather in favour of cleaning up the driver ioctls since
> I don't like older drivers suddenly having a new ABI without
> consideration of side effects.
>
Fwiw I think we're overloading the meaning of ABI - this is a functional change.
Which admittedly I'll have a loser and harder look that it doesn't
break things ;-)

Thanks
Emil


More information about the dri-devel mailing list