[PATCH] drm/amdgpu: fix drm leases being broken on radv

Koenig, Christian Christian.Koenig at amd.com
Wed Apr 17 10:29:22 UTC 2019


Hi guys,

well going back to the beginning once more because something doesn't fit 
together here.

I mean what exactly is the purpose of 8059add0478e "drm: allow render 
capable master with DRM_AUTH ioctls"?

When I read the code correctly the effect is that we ignore the DRM_AUTH 
flag when also the DRM_RENDER_ALLOW flag is set and the driver is a 
render node driver.

Well when this is correct then why the heck aren't we removing the 
DRM_AUTH flag from the affected IOCTLs instead?

The only common IOCTLs affecting this are DRM_IOCTL_PRIME_HANDLE_TO_FD 
and DRM_IOCTL_PRIME_FD_TO_HANDLE and those are protected by capability 
flags separately.

So the patch 8059add0478e doesn't make any sense as far as I can tell. 
What I'm missing?

Regards,
Christian.

Am 17.04.19 um 11:18 schrieb Christian König:
> Am 17.04.19 um 10:10 schrieb Daniel Vetter:
>> On Wed, Apr 17, 2019 at 09:09:27AM +0200, Christian König wrote:
>>> Am 16.04.19 um 23:40 schrieb Andres Rodriguez:
>>>> After a recent commit, access to the DRM_AUTH ioctls become more
>>>> permissive. This resulted in a buggy check for drm_master capabilities
>>>> inside radv stop working.
>>>>
>>>> This commit adds a backwards compatibility workaround so that the radv
>>>> drm_master check keeps working as previously expected.
>>>>
>>>> This fixes SteamVR and other drm lease based apps being broken since
>>>> v5.1-rc1
>>>>
>>>>   From the usermode side, radv will also be fixed to properly test for
>>>> master capabilities:
>>>> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/669
>>>>
>>>> Fixes: 8059add0478e ("drm: allow render capable master with 
>>>> DRM_AUTH ioctls")
>>>> Signed-off-by: Andres Rodriguez <andresx7 at gmail.com>
>>>> Cc: Daniel Vetter <daniel at ffwll.ch>
>>>> Cc: David Airlie <airlied at linux.ie>
>>>> Cc: Emil Velikov <emil.velikov at collabora.com>
>>>> Cc: Alex Deucher <alexander.deucher at amd.com>
>>>> Cc: Keith Packard <keithp at keithp.com>
>>>> Reviewed-by: Keith Packard <keithp at keithp.com>
>>>> Reviewed-by: Daniel Vetter <daniel at ffwll.ch>
>>> Well definitely a NAK. IIRC we have unit tests where the exactly 
>>> first thing
>>> they do is querying AMDGPU_INFO_ACCEL_WORKING.
>>>
>>> And I definitely not going to risk breaking those just to fix buggy 
>>> behavior
>>> in radv.
>> s/buggy/fragile :-)
>>
>> Option B would be to disable 8059add0478e ("drm: allow render capable
>> master with DRM_AUTH ioctls") just for amdgpu.ko, but that's a bit more
>> tricky to implement I guess (since it'd leak all over the place).
>>
>> Option C is to revert 8059add04, but that's a bit silly since it 
>> seems to
>> work everywhere.
>>
>> Breaking radv isn't an option, because no regression. Aside: No one is
>> stopping amd folks from reviewing radv patches and making sure 
>> there's no
>> fragile stuff in there.
>>
>> We discussed this quite a bit on irc with Ben and Keith and others, and
>> figured option A is the most promising to go forward with. Anything 
>> using
>> amdgpu_device_init (which I think are all the umds, but I didn't check)
>> will keep working, as will radv leases/vkdisplay, plus we can keep
>> 8059add04 for everyone (not just everony except amdgpu). If that means
>> breaking a few unit tests ... *shrugh*. Needs patches to fix them 
>> ofc, but
>> shouldn't be that much work really.
>
> I think you miss the point here: This patch will break the render node 
> interface!
>
> E.g. take another look at those lines of code:
>> +        if (fpriv->is_first_ioctl && !filp->is_master)
>> +            return -EACCES;
> This will return -EACCES on the first accel working query which is 
> doesn't comes from the DRM master. And it is irrelevant if its the 
> primary or the render node.
>
> So this patch most likely breaks tons of things and is *definitely* a 
> complete NAK.
>
> Additional to that IIRC we have (rather old) code which behaves 
> differently depending if it is called by the render node or the 
> primary node. In other words it assumes that the primary node is 
> authenticated.
>
> If I understood correctly what 8059add04 does than this is NOT a good 
> to do in general.
>
> Regards,
> Christian.
>
>> -Daniel
>>
>>> Christian.
>>>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  3 +++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  2 ++
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 22 ++++++++++++++++++++++
>>>>    3 files changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index 8d0d7f3dd5fb..e67bfee8d411 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -409,6 +409,9 @@ struct amdgpu_fpriv {
>>>>        struct mutex        bo_list_lock;
>>>>        struct idr        bo_list_handles;
>>>>        struct amdgpu_ctx_mgr    ctx_mgr;
>>>> +
>>>> +    /* Part of a backwards compatibility hack */
>>>> +    bool            is_first_ioctl;
>>>>    };
>>>>    int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv 
>>>> **fpriv);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> index 7419ea8a388b..cd438afa7092 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>> @@ -1128,6 +1128,7 @@ long amdgpu_drm_ioctl(struct file *filp,
>>>>                  unsigned int cmd, unsigned long arg)
>>>>    {
>>>>        struct drm_file *file_priv = filp->private_data;
>>>> +    struct amdgpu_fpriv *fpriv = file_priv->driver_priv;
>>>>        struct drm_device *dev;
>>>>        long ret;
>>>>        dev = file_priv->minor->dev;
>>>> @@ -1136,6 +1137,7 @@ long amdgpu_drm_ioctl(struct file *filp,
>>>>            return ret;
>>>>        ret = drm_ioctl(filp, cmd, arg);
>>>> +    fpriv->is_first_ioctl = false;
>>>>        pm_runtime_mark_last_busy(dev->dev);
>>>>        pm_runtime_put_autosuspend(dev->dev);
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>> index e860412043bb..00c0a2fb2a69 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>> @@ -455,6 +455,7 @@ static int amdgpu_hw_ip_info(struct 
>>>> amdgpu_device *adev,
>>>>    static int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
>>>> struct drm_file *filp)
>>>>    {
>>>>        struct amdgpu_device *adev = dev->dev_private;
>>>> +    struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>>>        struct drm_amdgpu_info *info = data;
>>>>        struct amdgpu_mode_info *minfo = &adev->mode_info;
>>>>        void __user *out = (void __user 
>>>> *)(uintptr_t)info->return_pointer;
>>>> @@ -470,6 +471,26 @@ static int amdgpu_info_ioctl(struct drm_device 
>>>> *dev, void *data, struct drm_file
>>>>        switch (info->query) {
>>>>        case AMDGPU_INFO_ACCEL_WORKING:
>>>> +        /* HACK: radv has a fragile open-coded check for drm master
>>>> +         * The pattern for the call is:
>>>> +         *     open(DRM_NODE_PRIMARY)
>>>> +         *     drmModeCommandWrite( query=AMDGPU_INFO_ACCEL_WORKING )
>>>> +         *
>>>> +         * After commit 8059add04 this check stopped working due to
>>>> +         * operations on a primary node from non-master clients 
>>>> becoming
>>>> +         * more permissive.
>>>> +         *
>>>> +         * This backwards compatibility hack targets the calling 
>>>> pattern
>>>> +         * above to fail radv from thinking it has master access 
>>>> to the
>>>> +         * display system ( without reverting 8059add04 ).
>>>> +         *
>>>> +         * Users of libdrm will not be affected as some other 
>>>> ioctls are
>>>> +         * performed between open() and the AMDGPU_INFO_ACCEL_WORKING
>>>> +         * query.
>>>> +         */
>>>> +        if (fpriv->is_first_ioctl && !filp->is_master)
>>>> +            return -EACCES;
>>>> +
>>>>            ui32 = adev->accel_working;
>>>>            return copy_to_user(out, &ui32, min(size, 4u)) ? -EFAULT 
>>>> : 0;
>>>>        case AMDGPU_INFO_CRTC_FROM_ID:
>>>> @@ -987,6 +1008,7 @@ int amdgpu_driver_open_kms(struct drm_device 
>>>> *dev, struct drm_file *file_priv)
>>>>                goto error_vm;
>>>>        }
>>>> +    fpriv->is_first_ioctl = true;
>>>>        mutex_init(&fpriv->bo_list_lock);
>>>>        idr_init(&fpriv->bo_list_handles);
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>



More information about the dri-devel mailing list