[PATCH 01/13] drm/amdgpu: introduce and honour DRM_FORCE_AUTH workaround

Koenig, Christian Christian.Koenig at amd.com
Tue May 28 16:22:28 UTC 2019


Am 28.05.19 um 18:10 schrieb Emil Velikov:
> On 2019/05/28, Daniel Vetter wrote:
>> On Tue, May 28, 2019 at 10:03 AM Koenig, Christian
>> <Christian.Koenig at amd.com> wrote:
>>> Am 28.05.19 um 09:38 schrieb Daniel Vetter:
>>>> [SNIP]
>>>>> Might be a good idea looking into reverting it partially, so that at
>>>>> least command submission and buffer allocation is still blocked.
>>>> I thought the issue is a lot more than vainfo, it's pretty much every
>>>> hacked up compositor under the sun getting this wrong one way or
>>>> another. Thinking about this some more, I also have no idea how you'd
>>>> want to deprecate rendering on primary nodes in general. Apparently
>>>> that breaks -modesetting already, and probably lots more compositors.
>>>> And it looks like we're finally achieve the goal kms set out to 10
>>>> years ago, and new compositors are sprouting up all the time. I guess
>>>> we could just break them all (on new hardware) and tell them to all
>>>> suck it up. But I don't think that's a great option. And just
>>>> deprecating this on amdgpu is going to be even harder, since then
>>>> everywhere else it'll keep working, and it's just amdgpu.ko that looks
>>>> broken.
>>>>
>>>> Aside: I'm not supporting Emil's idea here because it fixes any issues
>>>> Intel has - Intel doesn't care. I support it because reality sucks,
>>>> people get this render vs. primary vs. multi-gpu prime wrong all the
>>>> time (that's also why we have hardcoded display+gpu pairs in mesa for
>>>> the various soc combinations out there), and this looks like a
>>>> pragmatic solution. It'd be nice if every compositor and everything
>>>> else would perfectly support multi gpu and only use render nodes for
>>>> rendering, and only primary nodes for display. But reality is that
>>>> people hack on stuff until gears on screen and then move on to more
>>>> interesting things (to them). So I don't think we'll ever win this :-/
>>> Yeah, but this is a classic case of working around user space issues by
>>> making kernel changes instead of fixing user space.
>>>
>>> Having privileged (output control) and unprivileged (rendering control)
>>> functionality behind the same node is a mistake we have made a long time
>>> ago and render nodes finally seemed to be a way to fix that.
>>>
>>> I mean why are compositors using the primary node in the first place?
>>> Because they want to have access to privileged resources I think and in
>>> this case it is perfectly ok to do so.
>>>
>>> Now extending unprivileged access to the primary node actually sounds
>>> like a step into the wrong direction to me.
>>>
>>> I rather think that we should go down the route of completely dropping
>>> command submission and buffer allocation through the primary node for
>>> non master clients. And then as next step at some point drop support for
>>> authentication/flink.
>>>
>>> I mean we have done this with UMS as well and I don't see much other way
>>> to move forward and get rid of those ancient interface in the long term.
>> Well kms had some really good benefits that drove quick adoption, like
>> "suspend/resume actually has a chance of working" or "comes with
>> buffer management so you can run multiple gears".
>>
>> The render node thing is a lot more niche use case (prime, better priv
>> separation), plus "it's cleaner design". And the "cleaner design" part
>> is something that empirically doesn't seem to matter :-/ Just two
>> examples:
>> - KHR_display/leases just iterated display resources on the fd needed
>> for rendering (and iirc there was even a patch to expose that for
>> render nodes too so it works with DRI3), because implementing
>> protocols is too hard. Barely managed to stop that one before it
>> happened.
>> - Various video players use the vblank ioctl on directly to schedule
>> frames, without telling the compositor. I discovered that when I
>> wanted to limite the vblank ioctl to master clients only. Again,
>> apparently too hard to use the existing extensions, or fix the bugs in
>> there, or whatever. One userspace got fixed last year, but it'll
>> probably get copypasted around forever :-/
>>
>> So I don't think we'll ever manage to roll a clean split out, and best
>> we can do is give in and just hand userspace what it wants. As much as
>> that's misguided and unclean and all that. Maybe it'll result in a
>> least fewer stuff getting run as root to hack around this, because
>> fixing properly seems not to be on the table.
>>
>> The beauty of kms is that we've achieved the mission, everyone's
>> writing their own thing. Which is also terrible, and I don't think
>> it'll get better.
>
> With the risk of coming rude I will repeat my earlier comment:
>
> The problem is _neither_ Intel nor libva specific.
>
>
>
> That said, let's step back for a moment and consider:
>
>   - the "block everything but KMS via the primary node" idea is great but
> orthogonal
>
>   - the series does address issues that are vendor-agnostic
>
>   - by default this series does _not_ cause any regression be that for
> new or old userspace
>
>   - there are two trivial solutions, if the AMD team has concerns about
> closed-source/private stack depending on the old behaviour
> If they want I can even write the patches ;-)
>
>
> That said, the notable comments received so far are:
>   - rework patch 13/13 to remove the DRM_AUTH from prime fd to/from
> handle. I'm OK but this will change the return code - from EACCES to
> ENOSYS
>
>   - vmwgfx will need a check on the reference ioctl(s) - IIRC Thomas is
> planning to drop nearly all DRM_AUTH instances in their driver.
>
>
> Christian, as mentioned before - this series does _not_ add
> functionality to render nodes. It effectively paves a way towards
> removing DRM_AUTH.

But it adds functionality to the primary node.

> I understand the series may feel a bit dirty. Yet I would gladly address
> any technical concerns you have.

Well putting compatibility issues aside my concern is that this is 
simply a bad design decision which we can't revert later on.

Because of this I will still reject any changes trying to remove 
DRM_AUTH from AMD drivers.

If that means that AMD drivers will have a behavior difference to other 
drivers I think we are going to live with that.

Regards,
Christian.

>
> Thanks
> Emil



More information about the dri-devel mailing list