Use of copy_from_user in msm_gem_submit.c while holding a spin_lock

Wed Aug 17 18:58:27 UTC 2016

On Wed, Aug 17, 2016 at 2:49 PM, Rob Clark <robdclark at gmail.com> wrote:
> On Wed, Aug 17, 2016 at 1:08 PM, Al Viro <viro at zeniv.linux.org.uk> wrote:
>> On Wed, Aug 17, 2016 at 11:08:46AM -0400, Rob Clark wrote:
>>> On Wed, Aug 17, 2016 at 7:40 AM, Vaishali Thakkar
>>> <vaishali.thakkar at oracle.com> wrote:
>>> > Hello,
>>> >
>>> > I was wondering about the call to copy_from_user in function submit_lookup_objects for drive
>>> > /gpu/drm/msm/msm_gem_submit.c  It calls copy_from_user[1] in a spin_lock, which is not normally
>>> > allowed, due to the possibility of a deadlock.
>>> >
>>> > Is there some reason that I am overlooking why it is OK in this case? Is there some code in the
>>> > same file which ensures that page fault will not occur when we are calling the function holding
>>> > spin_lock?
>>>
>>> hmm, probably just that it isn't typical to use a swap file on these
>>> devices (and that lockdep/etc doesn't warn about it)..  I guess we
>>> probably need some sort of slow-path where we drop the lock and try
>>> again in case there would be a fault..
>>
>> Sigh...  Folks, you don't need swap *at* *all* for copy_from_user() to block.
>>         /* get a zero-filled 64K buffer */
>>         addr = mmap(NULL, 65536, PROT_READ | PROT_WRITE,
>>                     MAP_ANONYMOUS | MAP_SHARED, -1, 0);
>>         if (addr < 0)
>>                 piss off
>>         buffer = (void *)addr;
>>         ....
>>         pass buf to a syscall
>
>
> Sure, I know that.. but if you pass random garbage cmstream to the
> gpu, it will crash (the gpu) too and/or result in corrupt rendering on
> screen, etc.  GPU submit APIs don't exist for random end users, they
> exist for one user that knows what it is doing (ie. mesa).
>
> I'm not saying that I shouldn't fix it (although not quite sure how
> yet.. taking/dropping the spinlock inside the loop is not a good
> option from a performance standpoint).  What I am saying is that this
> is not something that can happen accidentally (as it could in the case
> of swap).  But I agree that I should fix it somehow to avoid issues
> with an intentionally evil userspace.
>
> If there is a copy_from_user() variant that will return an error
> instead of blocking, I think that is really what I want so I can
> implement a slow-path that drops the spin-lock temporarily.

ok, Chris pointed out copy_from_user_atomic() on irc.. that sounds
like what I want.. will put together a patch in a few

BR,
-R

> BR,
> -R
>
>
>> and copy_from_user() in that syscall will have to allocate pages (and possibly
>> page tables as well).  Which can block just fine, no swap involved.  Moreover,
>> if you modify some parts of the buffer first, you will get the pages containing
>> those modifications already present, but anything still untouched will
>>         a) act as if it had been zeroed first and
>>         b) possibly block on the first dereference, be it from kernel or from
>> userland.  Worse yet, there's nothing to stop libc from using the above for
>> calloc() and its ilk, with your application having no way to tell.  As far
>> as application is concerned, it has asked a library function to allocate and
>> zero a piece of memory, got one and yes, it does appear to be properly zeroed.
>>
>> The bottom line is, copy_from_user() can realistically block, without
>> anything fishy going on in the userland setup.