[Intel-gfx] [RFC PATCH] drm/radeon: fixup locking inversion between mmap_sem and reservations

Tue Oct 8 18:58:11 CEST 2013

On 10/08/2013 06:47 PM, Jerome Glisse wrote:
> On Tue, Oct 08, 2013 at 06:29:35PM +0200, Thomas Hellstrom wrote:
>> On 10/08/2013 04:55 PM, Jerome Glisse wrote:
>>> On Tue, Oct 08, 2013 at 04:45:18PM +0200, Christian König wrote:
>>>> Am 08.10.2013 16:33, schrieb Jerome Glisse:
>>>>> On Tue, Oct 08, 2013 at 04:14:40PM +0200, Maarten Lankhorst wrote:
>>>>>> Allocate and copy all kernel memory before doing reservations. This prevents a locking
>>>>>> inversion between mmap_sem and reservation_class, and allows us to drop the trylocking
>>>>>> in ttm_bo_vm_fault without upsetting lockdep.
>>>>>>
>>>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>>>>> I would say NAK. Current code only allocate temporary page in AGP case.
>>>>> So AGP case is userspace -> temp page -> cs checker -> radeon ib.
>>>>>
>>>>> Non AGP is directly memcpy to radeon IB.
>>>>>
>>>>> Your patch allocate memory memcpy userspace to it and it will then be
>>>>> memcpy to IB. Which means you introduce an extra memcpy in the process
>>>>> not something we want.
>>>> Totally agree. Additional to that there is no good reason to provide
>>>> anything else than anonymous system memory to the CS ioctl, so the
>>>> dependency between the mmap_sem and reservations are not really
>>>> clear to me.
>>>>
>>>> Christian.
>>> I think is that in other code path you take mmap_sem first then reserve
>>> bo. But here we reserve bo and then we take mmap_sem because of copy
>> >from user.
>>> Cheers,
>>> Jerome
>>>
>> Actually the log message is a little confusing. I think the mmap_sem
>> locking inversion problem is orthogonal to what's being fixed here.
>>
>> This patch fixes the possible recursive bo::reserve caused by
>> malicious user-space handing a pointer to ttm memory so that the ttm
>> fault handler is called when bos are already reserved. That may
>> cause a (possibly interruptible) livelock.
>>
>> Once that is fixed, we are free to choose the mmap_sem ->
>> bo::reserve locking order. Currently it's bo::reserve->mmap_sem(),
>> but the hack required in the ttm fault handler is admittedly a bit
>> ugly.  The plan is to change the locking order to
>> mmap_sem->bo::reserve
>>
>> I'm not sure if it applies to this particular case, but it should be
>> possible to make sure that copy_from_user_inatomic() will always
>> succeed, by making sure the pages are present using
>> get_user_pages(), and release the pages after
>> copy_from_user_inatomic() is done. That way there's no need for a
>> double memcpy slowpath, but if the copied data is very fragmented I
>> guess the resulting code may look ugly. The get_user_pages()
>> function will return an error if it hits TTM pages.
>>
>> /Thomas
> get_user_pages + copy_from_user_inatomic is overkill. We should just
> do get_user_pages which fails with ttm memory and then use copy_highpage
> helper.
>
> Cheers,
> Jerome
Yeah, it may well be that that's the preferred solution.

/Thomas