[PATCH] drm/amdgpu: fix drm leases being broken on radv
Koenig, Christian
Christian.Koenig at amd.com
Wed Apr 17 17:30:23 UTC 2019
Am 17.04.19 um 17:51 schrieb Emil Velikov:
> Hi guys,
>
> On 2019/04/17, Daniel Vetter wrote:
>> On Wed, Apr 17, 2019 at 02:46:06PM +0200, Christian König wrote:
>>> Am 17.04.19 um 14:35 schrieb Daniel Vetter:
>>>> On Wed, Apr 17, 2019 at 12:06:32PM +0000, Koenig, Christian wrote:
>>>>> Am 17.04.19 um 14:00 schrieb Daniel Vetter:
>>>>>> On Wed, Apr 17, 2019 at 11:18:35AM +0000, Koenig, Christian wrote:
>>>>>>> Am 17.04.19 um 13:06 schrieb Daniel Vetter:
>>>>>>>> On Wed, Apr 17, 2019 at 12:29 PM Koenig, Christian
>>>>>>>> <Christian.Koenig at amd.com> wrote:
>>>>>>>> [SNIP]
>>>>>>> Well, what you guys did here is a serious no-go.
>>>>>> Not really understanding your reasons for an unconditional nak on all this
>>>>>> still.
>>>>> Well, let me summarize: You worked around a problem with libva by
>>>>> breaking another driver and now proposing a rather ugly and only halve
>>>>> backed workaround for that driver.
>>>>>
>>>>> This is a serious no-go. I've already send out a i915 specific change to
>>>>> remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW and revert
>>>>> the offending patch.
>>>> Oh I'm totally fine with the revert if that's what we go with. But that
>>>> still leaves the issue that it would make sense to drop DRM_AUTH from
>>>> DRIVER_RENDER capable drivers (not just for libva, but in general). And
>>>> there's fundamentally 2 options to do that:
>>>>
>>>> - This one here (or anything implementing the same idea), to keep radv
>>>> working.
>>>>
>>>> - We drop DRM_AUTH from all drivers with DRIVER_RENDER except amdgpu. How
>>>> exactly that's doen doesn't matter, but it leaves amdgpu out in the
>>>> cold. It also doesn't matter whether we get there with revert + lots of
>>>> patches, or a special hack for amdgpu, in the end amdgpu would be
>>>> different. Also note that your patch isn't enough, since it doesn't fix
>>>> the core ioctls.
>>>>
>>>> I think from an overall platform pov, the first is the better option.
>>>> After all we try really hard to avoid driver special cases for these kinds
>>>> of things.
>>> Well, I initially thought that radv is doing something unusual or broken,
>>> but after looking deeper into this that is actually not the case.
>>>
>>> What radv does is calling an IOCTL and expecting an error message when it is
>>> not authenticated.
>>>
>>> It looks like libdrm_amdgpu has switched to using DRM_IOCTL_GET_CLIENT to
>>> figure out if it is authenticated or not, but I clearly remember that we
>>> haven't done this from the beginning.
>> Maybe in the radoen code, or some earlier internal amdgpu libdrm code?
>> First public commit has all the bits: getauth, GetVersion, then the accel
>> query.
>>
>>> Thinking more about this I don't think that this problem is actually amdgpu
>>> specific. So I wouldn't drop DRM_AUTH from all render node drivers at all,
>>> but only from those who have the specific issue with libva.
>> libva was the initial motivation, the goal of Emil's patch wasn't to just
>> duct-tape over libva. That would be easier to fix in libva. The point was
>> that we should be able to allow this in general.
>>
>> And we can, except for the radv issue uncovered here.
>>
>> So please don't get 100% hung up on the libva thing, that wasn't really
>> the goal, just the initial motivation. And I still thinks it makes for all
>> drivers. So again we have:
>>
>> - radv hack
>> - make amdgpu behave different from everyone else
>> - keep tilting windmills about "pls use rendernodes, thx"
>>
>> I neither like tilting windmills nor making drivers behave different for
>> no reaason at all.
> Allow me to jump-in some emails down the line.
>
> First and foremost, sincere apologies for upsetting you Christian. If
> it's of any consolidation - let me assure you the goal is _not_ to break
> amdgpu or any other driver.
>
> Secondly, I would like to ask you for a list of projects so we can look and
> investigate properly. So far you've mentioned libdrm_amdgpu - I'll look into
> it.
That is rather hard to come by, because you would also need to look at
all previously existing driver stacks.
E.g. with the classic R300 driver for example.
> On the topic of "working around libva" - sadly libva is _not_ the only
> offender. We also had bugs in mesa and kmscube.
>
> Considering those are taken as a prime example of "the right way", it's very
> likely for the mistakes to be repeated in many other projects.
>
> Where the common "fix" seems to be "run as root"...
>
>
> As Daniel pointed out, we could be fighting the windmills or we could have a
> small, admittedly ugly, workaround for amdgpu.
>
> If you don't like that workaround in the driver we could move it to core DRM.
>
> In either case, I would like to focus on a pragramic solution which works with
> both old and new userspace.
Well, I seriously think the original committed solution could cause a
lot of problems and the issue with radv is only the tip of the iceberg.
I mean we just have to ask our self why have we created render nodes in
the first place? The obvious alternative was to just removed the
authentication restrictions on the primary node which would have been
way less code and maintenance burden.
I need to dig up the mailing list archive, but I strongly think that one
of the main arguments for this approach was to NOT break existing userspace.
Even taking into account that we now don't need to deal with UMS and
really really old userspace drivers any more it still feels like a to
high risk going down that route.
Christian.
>
> Thanks
> Emil
More information about the dri-devel
mailing list