[PATCH v2] drm: enable render-nodes by default

Thomas Hellstrom thellstrom at vmware.com
Thu Mar 20 03:28:18 PDT 2014


On 03/20/2014 10:43 AM, David Herrmann wrote:
> Hi
>
> On Thu, Mar 20, 2014 at 10:27 AM, Thomas Hellstrom
> <thellstrom at vmware.com> wrote:
>> A user logs in to a system where DRI clients use render nodes. The
>> system grants rw permission on the render nodes for the console user.
>> User starts editing a secret document, starts some GPGPU structural FEM
>> computations of the Pentagon building. Locks the screen and goes for lunch.
>>
>> A malicious user logs in using fast user switching and becomes the owner
>> of the render node. Tries to map a couple of random offsets, but that
>> fails, due to security checks. Now crafts a malicious command stream to
>> dump all GPU memory to a file. Steals the first user's secret document
>> and the intermediate Pentagon data. Logs out and starts data mining.
>>
>> Now if we require drivers to block these malicious command streams this
>> can never happen, and distros can reliably grant rw access to the render
>> nodes to the user currently logged into the console.
>>
>> I guess what I'm basically trying to say is that with the legacy
>> concept it was OK to access all GPU memory, because an authenticated X
>> user basically had the same permissions.
>>
>> With render nodes we're allowing multiple users into the GPU at the same
>> time, and it's not OK anymore for a client to access another client's
>> GPU buffers through a malicious command stream.
> Yes, I understand the attack scenario, but that's not related to
> render-nodes at all. The exact same races exist on the legacy node:

I was under the impression that render nodes were designed to fix these
issues?

>
> 1) If you can do fast-user switching, you can spawn your own X-server,
> get authenticated on your own server and you are allowed into the GPU.
> You cannot map other user's buffers because they're on a different
> master-object, but you _can_ craft malicious GPU streams and access
> the other user's buffer.

But with legacy nodes, drivers can (and IMO should) throw out all data
from GPU memory on master drop, and then block clients authenticated
against the dropped master from the GPU until their master becomes
active again or dies (in which case those clients are killed), in line
with a previous discussion we had. You can't do this with render nodes,
so yes, they do open up a new race that requires command stream
validation.

>
> 2) If you can do fast-user switching, switch to an empty VT, open the
> legacy node and you automatically become DRM-Master because there is
> no active master. Now you can do anything on the DRM node, including
> crafting malicious GPU streams.

I believe the above solution should work for this case as well.

>
> Given that the legacy node is always around and _always_ has these
> races, why should we prevent render-nodes from appearing just because
> the _driver_ is racy? I mean, there is no gain in that... if it opens
> a new race, as you assumed, then yes, we should avoid it. But at least
> for all drivers supporting render-nodes so far, they are either
> entirely safe or the just-described races exist on both nodes.

My suggestion is actually not to prevent render nodes from appearing,
but rather that we should restrict them to drivers with command stream
verification and/or per-process virtual memory, and I also think we
should plug the above races on legacy nodes. That way legacy nodes would
use the old "master owns it all" model, while render nodes could allow
multiple users at the same time.


/Thomas

