[PATCH] drm/amdgpu: fix drm leases being broken on radv

Dave Airlie airlied at gmail.com
Wed Apr 17 20:49:30 UTC 2019


On Thu, 18 Apr 2019 at 03:30, Koenig, Christian
<Christian.Koenig at amd.com> wrote:
>
> Am 17.04.19 um 17:51 schrieb Emil Velikov:
> > Hi guys,
> >
> > On 2019/04/17, Daniel Vetter wrote:
> >> On Wed, Apr 17, 2019 at 02:46:06PM +0200, Christian König wrote:
> >>> Am 17.04.19 um 14:35 schrieb Daniel Vetter:
> >>>> On Wed, Apr 17, 2019 at 12:06:32PM +0000, Koenig, Christian wrote:
> >>>>> Am 17.04.19 um 14:00 schrieb Daniel Vetter:
> >>>>>> On Wed, Apr 17, 2019 at 11:18:35AM +0000, Koenig, Christian wrote:
> >>>>>>> Am 17.04.19 um 13:06 schrieb Daniel Vetter:
> >>>>>>>> On Wed, Apr 17, 2019 at 12:29 PM Koenig, Christian
> >>>>>>>> <Christian.Koenig at amd.com> wrote:
> >>>>>>>> [SNIP]
> >>>>>>> Well, what you guys did here is a serious no-go.
> >>>>>> Not really understanding your reasons for an unconditional nak on all this
> >>>>>> still.
> >>>>> Well, let me summarize: You worked around a problem with libva by
> >>>>> breaking another driver and now proposing a rather ugly and only halve
> >>>>> backed workaround for that driver.
> >>>>>
> >>>>> This is a serious no-go. I've already send out a i915 specific change to
> >>>>> remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW and revert
> >>>>> the offending patch.
> >>>> Oh I'm totally fine with the revert if that's what we go with. But that
> >>>> still leaves the issue that it would make sense to drop DRM_AUTH from
> >>>> DRIVER_RENDER capable drivers (not just for libva, but in general). And
> >>>> there's fundamentally 2 options to do that:
> >>>>
> >>>> - This one here (or anything implementing the same idea), to keep radv
> >>>>     working.
> >>>>
> >>>> - We drop DRM_AUTH from all drivers with DRIVER_RENDER except amdgpu. How
> >>>>     exactly that's doen doesn't matter, but it leaves amdgpu out in the
> >>>>     cold. It also doesn't matter whether we get there with revert + lots of
> >>>>     patches, or a special hack for amdgpu, in the end amdgpu would be
> >>>>     different. Also note that your patch isn't enough, since it doesn't fix
> >>>>     the core ioctls.
> >>>>
> >>>> I think from an overall platform pov, the first is the better option.
> >>>> After all we try really hard to avoid driver special cases for these kinds
> >>>> of things.
> >>> Well, I initially thought that radv is doing something unusual or broken,
> >>> but after looking deeper into this that is actually not the case.
> >>>
> >>> What radv does is calling an IOCTL and expecting an error message when it is
> >>> not authenticated.
> >>>
> >>> It looks like libdrm_amdgpu has switched to using DRM_IOCTL_GET_CLIENT to
> >>> figure out if it is authenticated or not, but I clearly remember that we
> >>> haven't done this from the beginning.
> >> Maybe in the radoen code, or some earlier internal amdgpu libdrm code?
> >> First public commit has all the bits: getauth, GetVersion, then the accel
> >> query.
> >>
> >>> Thinking more about this I don't think that this problem is actually amdgpu
> >>> specific. So I wouldn't drop DRM_AUTH from all render node drivers at all,
> >>> but only from those who have the specific issue with libva.
> >> libva was the initial motivation, the goal of Emil's patch wasn't to just
> >> duct-tape over libva. That would be easier to fix in libva. The point was
> >> that we should be able to allow this in general.
> >>
> >> And we can, except for the radv issue uncovered here.
> >>
> >> So please don't get 100% hung up on the libva thing, that wasn't really
> >> the goal, just the initial motivation. And I still thinks it makes for all
> >> drivers. So again we have:
> >>
> >> - radv hack
> >> - make amdgpu behave different from everyone else
> >> - keep tilting windmills about "pls use rendernodes, thx"
> >>
> >> I neither like tilting windmills nor making drivers behave different for
> >> no reaason at all.
> > Allow me to jump-in some emails down the line.
> >
> > First and foremost, sincere apologies for upsetting you Christian. If
> > it's of any consolidation - let me assure you the goal is _not_ to break
> > amdgpu or any other driver.
> >
> > Secondly, I would like to ask you for a list of projects so we can look and
> > investigate properly. So far you've mentioned libdrm_amdgpu - I'll look into
> > it.
>
> That is rather hard to come by, because you would also need to look at
> all previously existing driver stacks.
>
> E.g. with the classic R300 driver for example.
>
> > On the topic of "working around libva" - sadly libva is _not_ the only
> > offender. We also had bugs in mesa and kmscube.
> >
> > Considering those are taken as a prime example of "the right way", it's very
> > likely for the mistakes to be repeated in many other projects.
> >
> > Where the common "fix" seems to be "run as root"...
> >
> >
> > As Daniel pointed out, we could be fighting the windmills or we could have a
> > small, admittedly ugly, workaround for amdgpu.
> >
> > If you don't like that workaround in the driver we could move it to core DRM.
> >
> > In either case, I would like to focus on a pragramic solution which works with
> > both old and new userspace.
>
> Well, I seriously think the original committed solution could cause a
> lot of problems and the issue with radv is only the tip of the iceberg.
>
> I mean we just have to ask our self why have we created render nodes in
> the first place? The obvious alternative was to just removed the
> authentication restrictions on the primary node which would have been
> way less code and maintenance burden.
>
> I need to dig up the mailing list archive, but I strongly think that one
> of the main arguments for this approach was to NOT break existing userspace.
>
> Even taking into account that we now don't need to deal with UMS and
> really really old userspace drivers any more it still feels like a to
> high risk going down that route.

I'm going to revert the original patch in drm-next now. We've been
getting a bit lax lately on the regressions need to get reverted no
matter what rule, and we are getting close to rc6 and it's Easter, so
I don't want to have to care about this, forget about it, remember it
again, end up at 5.2.

Personally I'm rather in favour of cleaning up the driver ioctls since
I don't like older drivers suddenly having a new ABI without
consideration of side effects.

Dave.


More information about the dri-devel mailing list