<div class="moz-cite-prefix">On 14/04/2014 15:09, Rob Clark wrote:<br>
</div>
<blockquote
cite="mid:CAF6AEGtjvUcwZFkwtPkb9Q3t8gWE0__Xg0sD6BBvDRGHm+jb5A@mail.gmail.com"
type="cite">
<pre wrap="">On Mon, Apr 14, 2014 at 8:56 AM, Thomas Hellstrom <a class="moz-txt-link-rfc2396E" href="mailto:thellstrom@vmware.com"><thellstrom@vmware.com></a> wrote:
</pre>
>> On 04/14/2014 02:41 PM, One Thousand Gnomes wrote:
>>>> throw out all GPU memory on master drop and block ioctls requiring
>>>> authentication until master becomes active again.
>>>
>>> If you have a per-driver method then the driver can implement whatever is
>>> optimal (possibly including throwing it all out).
>>>
>>>> -1: The driver allows an authenticated client to craft command streams
>>>> that could access any part of system memory. These drivers should be
>>>> kept in staging until they are fixed.
>>>
>>> I am not sure they belong in staging even.
>>>
>>>> 0: Drivers that are vulnerable to any of the above scenarios.
>>>> 1: Drivers that are immune against all of the above scenarios but allow any
>>>> authenticated client with an *active* master to access all GPU memory. Any
>>>> enabled render nodes will be insecure, while primary nodes are secure.
>>>> 2: Drivers that are immune against all of the above scenarios and can protect
>>>> clients from accessing each other's GPU memory:
>>>> render nodes will be secure.
>>>>
>>>> Thoughts?
<pre wrap="">Another magic number to read, another case to get wrong where the OS
isn't providing security by default.
If the driver can be fixed to handle it by flushing out all GPU memory
then the driver should be fixed to do so. Adding magic udev nodes is just
adding complexity that ought to be made to go away before it even becomes
an API.
So I think there are three cases
- insecure junk driver. Shouldn't even be in staging
- hardware isn't as smart enough, or perhaps has a performance problem so
sometimes flushes all buffers away on a switch
- drivers that behave well
Do you then even need a sysfs node and udev hacks (remembering not
everyone even deploys udev on their Linux based products)
For the other cases
- how prevalent are the problem older user space drivers nowdays ?
- the fix for "won't fix" drivers is to move them to staging, and then
if they are not fixed or do not acquire a new maintainer who will,
delete them.
- if we have 'can't fix drivers' then its a bit different and we need to
understand better *why*.
Don't screw the kernel up because there are people who can't be bothered
to fix bugs. Moving them out of the tree is a great incentive to find
someone to fix it.
</pre>
</blockquote>
<pre wrap="">
On second thought I'm dropping this whole issue.
I've brought this and other security issues up before but nobody really
seems to care.
</pre>
</blockquote>
<pre wrap="">
I wouldn't say that.. render-nodes, dri3/prime/dmabuf, etc, wouldn't
exist if we weren't trying to solve these issues.
Like I said earlier, I think we do want some way to expose range of
supported security levels, and in case multiple levels are supported
by driver some way to configure desired level.
Well, "range" may be overkill, I only see two sensible values, either
"gpu can access anyone's gpu memory (but not arbitrary system
memory)", or "we can also do per-process isolation of gpu buffers".
Of course the "I am a root hole" security level has no place in the
kernel.
BR,
-R
</pre>
</blockquote>

I do indeed think that having a standard way of exposing how much security
can be expected from the hw/driver is a good thing!

I have never tried to quantify the performance hit of using a per-process
virtual address space, but I would expect it to be low enough, since the MMU
cannot pagefault (at least, doing so on NVIDIA hw means killing the context).
Maybe the cost would show up at context-switch time, because the TLB would be
reset.

I am interested in knowing the performance impact of PPGTT on Intel IGPs and
whether it could be enabled on a per-process basis. Of course, applications
run without PPGTT would have to be trusted by the user, as they will be able
to access other processes' BOs.

If the performance impact is high AND PPGTT can be deactivated per-process,
then it may make sense to allow libdrm to request this privileged access to
the GPU.

However, how do we keep applications from requesting the faster path all the
time? Bypassing the PPGTT protection should require proof of the user's
intent, and right now we do not have that capability in current desktop
environments (although it is being worked on [1] and [2]). Do you have any
idea how to expose this knob securely? Root could disable PPGTT for all
processes, but I don't see how we could securely handle the authorisation for
an application to disable PPGTT without some serious work...
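
For the "root disables it for everyone" half, the obvious mechanism would be a
load-time, root-only driver knob; a minimal sketch follows (the parameter name,
default and wording are invented purely for illustration):

/* Sketch only: a hypothetical load-time knob in a DRM driver; the name,
 * default and description are made up for illustration. */
#include <linux/module.h>
#include <linux/moduleparam.h>

static bool per_process_gtt = true;
/* 0400: readable by root under /sys/module/<driver>/parameters/, but only
 * settable on the kernel command line or at module load time. */
module_param(per_process_gtt, bool, 0400);
MODULE_PARM_DESC(per_process_gtt,
		 "Give each client its own GPU address space (disable only if every GPU client is trusted)");

That covers the global opt-out, but it does nothing for the per-application
case, which is exactly the part that needs the user-intent work from [1] and
[2].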

As you can see, there is some work to be done before exposing the
security/performance trade-off knob. I am not convinced it is necessary, but I
would definitely reconsider my position when data showing the performance
impact of the graphics MMU turns up.

However, I would really appreciate it if drivers could expose their GPU
process-isolation level. I do not think we should go for a single number; I
would rather go for a bitfield. This should be simple enough to implement in
DRM.
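
To make that a bit more concrete, here is a minimal userspace sketch of what
querying such a bitfield could look like. DRM_CAP_GPU_ISOLATION and the
DRM_ISOLATION_* flags are made-up names purely for illustration; only the
existing drmGetCap()/DRM_IOCTL_GET_CAP plumbing is real, and the actual naming
and numbering would of course be up to the list:

/*
 * Sketch only: the capability id and flag names are invented; the query
 * mechanism (drmGetCap) already exists in libdrm.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <xf86drm.h>

#define DRM_CAP_GPU_ISOLATION        0x20        /* invented capability id */
#define DRM_ISOLATION_NO_SYSTEM_RAM  (1ULL << 0) /* clients cannot reach arbitrary system RAM */
#define DRM_ISOLATION_PER_PROCESS    (1ULL << 1) /* clients cannot access each other's BOs */

int main(void)
{
	uint64_t isolation = 0;
	int fd = open("/dev/dri/card0", O_RDWR);

	if (fd < 0)
		return 1;

	/* A single GET_CAP call returns the whole bitfield. */
	if (drmGetCap(fd, DRM_CAP_GPU_ISOLATION, &isolation) == 0) {
		printf("system RAM protected:  %s\n",
		       (isolation & DRM_ISOLATION_NO_SYSTEM_RAM) ? "yes" : "no");
		printf("per-process isolation: %s\n",
		       (isolation & DRM_ISOLATION_PER_PROCESS) ? "yes" : "no");
	}

	close(fd);
	return 0;
}

Compared to a single ordered level, a bitfield lets a driver advertise the two
properties Rob listed ("cannot reach arbitrary system memory" and "per-process
isolation of GPU buffers") independently, and leaves spare bits for whatever
comes up later.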

Martin

[1] http://mupuf.org/blog/2014/02/19/wayland-compositors-why-and-how-to-handle/
[2] http://mupuf.org/blog/2014/03/18/managing-auth-ui-in-linux/