[PATCH RFC 003/111] staging: etnaviv: add drm driver

Tue Apr 7 16:56:01 PDT 2015

On Tue, Apr 07, 2015 at 02:52:31PM +0200, Lucas Stach wrote:
> Am Dienstag, den 07.04.2015, 11:46 +0100 schrieb Russell King - ARM
> Linux:
> > 
> > For both Vivante and Etnaviv, it's already the accepted way that 2D
> > cores need the full context loaded for each operation, and the few
> > userspace bits we have comply with that today.
> > 
> > With Etnaviv DRM, we already must ensure that the command buffer
> > submitted to the GPU contains all references to buffer objects to be
> > operated on by that command block - or to put it another way, we
> > need to ensure that each GPU operation is complete inside the submitted
> > command buffer.
> > 
> Right that's one thing that I really hadn't thought through until now.
> So this means we must at least emit all states that contain relocs,
> which may further reduce the possibility to do minimal state updates.
> Urghs.

Before trying hard to minimize the amount of state emitted, I would like
to encourage you to actually benchmark this and see whether it really makes
a difference. I was once convinced it would be useful, but a simple
benchmark proved me wrong. For example, you could draw a simple VBO over
and over, re-emitting a bunch of states with each draw, and compare that
against submitting the same VBO over and over with the states submitted
only once.

It turned out that on other hardware the cost of tracking dirty state (CPU
overhead) outweighed the very small performance improvement (I think it
was barely significant relative to the standard deviation).
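
A rough sketch of the CPU side of such a comparison, in plain userspace C.
Everything here is made up for illustration (struct emitter, NUM_STATES,
the (index, value) stream layout); it is not etnaviv or radeon code, and it
only measures command stream building, not what the GPU does with it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define NUM_STATES 64  /* hypothetical number of state registers */

struct emitter {
    uint32_t shadow[NUM_STATES];   /* last value emitted for each state */
    uint32_t out[NUM_STATES * 2];  /* fake command stream: (index, value) pairs */
    size_t len;                    /* words currently in out[] */
};

static void push_state(struct emitter *e, size_t idx, uint32_t val)
{
    assert(e->len + 2 <= NUM_STATES * 2);
    e->out[e->len++] = (uint32_t)idx;
    e->out[e->len++] = val;
}

/* Strategy A: re-emit every state on every submit, no tracking at all. */
static size_t emit_all(struct emitter *e, const uint32_t *want)
{
    e->len = 0;
    for (size_t i = 0; i < NUM_STATES; i++)
        push_state(e, i, want[i]);
    return e->len;
}

/* Strategy B: compare against a shadow copy and emit only what changed. */
static size_t emit_dirty(struct emitter *e, const uint32_t *want)
{
    e->len = 0;
    for (size_t i = 0; i < NUM_STATES; i++) {
        if (e->shadow[i] == want[i])
            continue;
        e->shadow[i] = want[i];
        push_state(e, i, want[i]);
    }
    return e->len;
}

typedef size_t (*emit_fn)(struct emitter *, const uint32_t *);

/* CPU seconds spent building `iters` command streams with one strategy. */
static double time_emit(emit_fn fn, struct emitter *e,
                        const uint32_t *want, int iters)
{
    clock_t start = clock();
    for (int i = 0; i < iters; i++)
        fn(e, want);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_emit() with emit_all and then with emit_dirty over a realistic
sequence of state changes gives the CPU half of the picture. The GPU half
(whether the shorter stream actually runs faster) has to be measured on the
hardware, over enough runs to compare against the standard deviation.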

[...]

> > 
> > An important question is whether the context buffer, built by userspace,
> > should be submitted as one of these command buffers, or kept separate so
> > the kernel can keep track of it and decide whether or not to use it
> > according to the state it's tracking.
> > 
> > Another point to bring up here is about how command buffers are submitted.
> > 
> > Consider this scenario:
> > 
> > - Userspace creates a command buffer, and arranges for the initial
> >   commands to be time consuming (eg, long WAIT commands.)  It fills the
> >   rest of the buffer with dummy LOAD STATE commands.
> > - Userspace submits this, the kernel validates the command buffer, and
> >   submits it to the GPU.  The GPU starts executing the buffer.
> > - Userspace, which still has access to the command buffer, overwrites
> >   the LOAD STATE commands with malicious GPU commands.
> > - GPU executes malicious GPU commands.
> > 
> > This brings up several questions:
> > 
> > 1. Do we care about this?
> > 2. If we do care, should we insist that a command buffer is not mapped
> >    in userspace when it is submitted, and prevent an in-use command
> >    buffer being mapped?
> > 3. If we don't care, what's the point of validating the supplied command
> >    buffer?
> > 
> > (2) would be quite an API change over what we have today, and introduce
> > an amount of overhead, though something which could be handled in the
> > userspace library (eg, if we're modelling on etnaviv's five command
> > buffer model, we could copy the command buffer immediately before
> > submission.)
> > 
> > Given this, I think (3) has some value irrespective of the outcome of
> > (1) as it gives us a way to catch silly errors from userspace before
> > they hit the GPU and become a problem.
> > 
> I think we should care.
> I fail to see how this would have to be an API change. Why can't we just
> hand out buffers to userspace like we do now and copy their contents
> into an internal buffer as we validate and apply relocs?
> This model may be beneficial even without the security benefits, as we
> could hand out cached buffers to userspace, so we can read them more
> efficiently for validation and stuff things into an internal
> write-combined buffer.

You should definitely care about that. For instance, in the radeon driver,
for GPUs we cannot trust (i.e. GPUs where userspace could access physical
memory through the GPU), we copy the userspace command buffer while
validating it inside the kernel. Yes, there is an overhead for doing that,
but it is the only way to have security on such GPUs.
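
To make the ordering concrete, here is a minimal userspace sketch of "copy,
then validate the copy". It is not the radeon or etnaviv code: the opcode
encoding, word_is_allowed() and copy_and_validate() are hypothetical, every
word is treated as a command header for simplicity, and memcpy() stands in
for what would be copy_from_user() in a kernel driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoding for illustration: opcode in the top 5 bits. */
#define OP_SHIFT      27
#define OP_LOAD_STATE 0x01u
#define OP_WAIT       0x07u

/* Whitelist check: only the two opcodes above are accepted. */
static int word_is_allowed(uint32_t word)
{
    uint32_t op = word >> OP_SHIFT;

    return op == OP_LOAD_STATE || op == OP_WAIT;
}

/*
 * Snapshot the user buffer into kernel-owned memory first, then validate
 * the snapshot. The GPU would only ever be pointed at kbuf, so a later
 * write to ubuf cannot change what gets executed.
 * Returns 0 if every word passes, -1 if the buffer is rejected.
 */
static int copy_and_validate(uint32_t *kbuf, const uint32_t *ubuf,
                             size_t nwords)
{
    assert(kbuf && ubuf);

    /* Stand-in for copy_from_user() in real kernel code. */
    memcpy(kbuf, ubuf, nwords * sizeof(*kbuf));

    for (size_t i = 0; i < nwords; i++)
        if (!word_is_allowed(kbuf[i]))
            return -1;

    return 0;
}
```

The point is only the order of operations: validation reads kbuf, never
ubuf, so there is no window between check and use for userspace to exploit.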

If you have a virtual address space and userspace cannot reprogram it
from the command buffer, then yes, you can execute the user command buffer
directly without copying or checking it.

I would strongly advise against giving up on security.

Cheers,
Jérôme