[PATCH RFC 003/111] staging: etnaviv: add drm driver

Tue Apr 7 05:52:31 PDT 2015

Am Dienstag, den 07.04.2015, 11:46 +0100 schrieb Russell King - ARM
Linux:
> On Tue, Apr 07, 2015 at 11:20:10AM +0200, Lucas Stach wrote:
> > Am Dienstag, den 07.04.2015, 11:04 +0200 schrieb Christian Gmeiner:
> > > What role does GPU power management plays here? For the context switching
> > > it could make sense. But for the 2d core the context is so small that
> > > it does not
> > > hurt to send it with every command stream. For the 3d core it is much
> > > bigger, but
> > > this could be done completely in the kernel. Or I am wrong here?
> >
> > If you power down the GPU you loose the context. You are right that we
> > could save/restore the context from kernel space, but that is really
> > taking a toll on CPU time. It is much better to have userspace provide a
> > context buffer to get the GPU in the expected state, as you then only
> > need to splice this into the execution stream to restore the context
> > instead of pushing it with the CPU. Reading back the context on every
> > switch will kill any performance.
> 
> For both Vivante and Etnaviv, it's already the accepted way that 2D
> cores need the full context loaded for each operation, and the few
> userspace bits we have comply with that today.
> 
> With Etnaviv DRM, we already must ensure that the command buffer
> submitted to the GPU contains all references to buffer objects to be
> operated on by that command block - or to put it another way, we
> need to ensure that each GPU operation is complete inside the command
> submitted buffer.
> 
Right that's one thing that I really hadn't thought through until now.
So this means we must at least emit all states that contain relocs,
which may further reduce the possibility to do minimal state updates.
Urghs.

> The 2D core is rather messy as far as which bits of state need to be
> preserved, especially when you consider that you have the 2D drawing
> and blit ops, as well as a video rasteriser which shares some state
> registers for the destination, but uses different state registers for
> the source.  It quickly becomes rather messy to keep track of the
> GPU state.
> 
> In any case, the amount of state which needs to be loaded for 2D
> operations is small, so I think it really makes sense to require
> userspace to only submit complete, fully described 2D operations
> within a single command buffer.
> 
> > > I tend to get things up and running and do the optimization step if it
> > > is really worth.
> > > Also I like stuff to be stupid simple. There is an other interesting
> > > fact: flushing the
> > > iommuv2 is done via command stream and we need to reserve more space
> > > for the tail of the used bo. So if we reserve some space in the
> > > command buffer, we have other space limits for the tail depending on
> > > used hardware.
> 
> I would much rather we only appended LINK commands to the submitted
> command BO, and added whatever GPU management commands to either the
> ring buffer (which is easy) or a separate GPU management command
> buffer.  Given that I'm already doing this to flush the V1 MMU in
> the kernel ring buffer, this is the option I prefer.
> 
> > You may be aware that once this is upstream there is no easy way to
> > change the userspace interface anymore. So whatever is left out now is
> > likely to be very hard to reintroduce later.
> 
> Indeed - we need to agree what the submited command buffer will contain,
> and how much space to enforce at the end of the command buffer.
> 
> The minimum space is one LINK command, which is 8 bytes.  As long as we
> can add a LINK command, we can redirect the GPU's execution elsewhere to
> do whatever other operations we want to do.
> 
> I think the only danger there is if Vivante produce a GPU with 64-bit
> addressing for the command stream - if they do, commands like LINK will
> most likely change format, and possibly would be a different number of
> bits.
> 
> The simple solution to this would be to introduce into the API a
> property (like is done for the feature bits) which tells userspace the
> minimum number of bytes which must be reserved at the end of the command
> buffer.  If we need to change that in the future, we have the flexibility
> to do so.
> 

Yes, that seems to be the straight forward solution. Export a property
from the kernel to signal the userspace how much free space is needed at
the end of the buffer and reject any buffer violating this.

Though I agree that we should not overuse this and try to do as much as
possible outside of the user command streams.

> > What' the problem with having a command buffer in the kernel to flush
> > the MMUv2? Why do you need to insert those commands into the userspace
> > command stream?
> 
> That's certainly where I would put it.
> 
> > > >> > +#define ETNA_SUBMIT_BO_READ             0x0001
> > > >> > +#define ETNA_SUBMIT_BO_WRITE            0x0002
> > > >> > +struct drm_etnaviv_gem_submit_bo {
> > > >> > +       uint32_t flags;          /* in, mask of ETNA_SUBMIT_BO_x */
> > > >> > +       uint32_t handle;         /* in, GEM handle */
> > > >> > +       uint64_t presumed;       /* in/out, presumed buffer address */
> > > >>
> > > >> presumed support should never hit etnaviv driver.
> > > >>
> > > > As stated in the cover letter I think presumed support will become
> > > > possible with MMUv2 and may provide a good optimization there. So I
> > > > would rather leave this in here and just ignore it for now.
> 
> Could we rename this member 'reserved' if it's something that we think
> we're going to implement in the near future - but also please add a
> flag which indicates whether the presumed address is present or not.
> Zero _can_ be a valid address too!
> 
> > A presumed address can not be a physical address, but is an address in
> > the VM context of that process. Nouveau uses the same thing on NV50+
> > where you have a proper MMU to protect all GPU accesses. I would expect
> > the same thing to be true for Vivante MMUv2.
> 
> I know that there are very strong opinions about exposing _physical_
> addresses to userspace (David, for example, doesn't like it one bit.)
> If it's a GPU address, then that's less useful to userspace.
> 
A GPU address with per-process pagetables and full translation support
on the GPU MMU is as good as a CPU virtual address. I wouldn't expect
any objections against those. MMUv1 is more like a GART window and
doesn't provide full translation, so we shouldn't trust userspace with
MMUv1.

> However, that data has to be treated as suspect coming from userspace.
> 
> You still need to look up the buffer object in the kernel, so that you
> can manage the buffer object's state.  If you have the buffer object's
> state, then you most likely have easy access to its GPU mapping, which
> means you can retrieve the GPU address, which you can then use to
> validate the address passed from userspace... but if you've found the
> GPU address via this method, you haven't saved anything.
> 
> An alternative approach would be to lookup the presumed address in
> (eg) a rbtree to locate the buffer object's state, which would save
> the lookup by object ID, but does this save anything by doing that,
> does it add complexity and additional kernel processing?
> 
> I'm not sure.
> 
I'm not sure about this. With MMUv2 and per-process pagetables there
really is no need to validate the addresses from userspace as each
process is only able to shoot itself in the foot.

> > > Keep it stupid simple. In my libdrm repo, which you hopefully know, I
> > > have implemented the buffer handling from the original libetnaviv. We
> > > allocate 5 command buffers of a defined size and rotate through them.
> > > During command buffer building we reserve space in the stream. if
> > > there is not enough space we flush the current buffer stream and
> > > switch to the next and us it. Then there is a way to explicit flush
> > > a command buffer.
> > > 
> > > For more details see:
> > > https://github.com/laanwj/etna_viv/tree/master/src/etnaviv
> > > https://github.com/austriancoder/libdrm
> > 
> > Same argument as above really. We need at least the context buffer.
> 
> An important question is whether the context buffer, built by userspace,
> should be submitted as one of these command buffers, or kept separate so
> the kernel can keep track of it and decide whether or not to use it
> according to the state it's tracking.
> 
> Another point to bring up here is about how command buffers are submitted.
> 
> Consider this scenario:
> 
> - Userspace creates a command buffer, and arranges for the initial
>   commands to be time consuming (eg, long WAIT commands.)  It fills the
>   rest of the buffer with dummy LOAD STATE commands.
> - Userspace submits this, the kernel validates the command buffer, and
>   submits it to the GPU.  The GPU starts executing the buffer.
> - Userspace, which still has access to the command buffer, overwrites
>   the LOAD STATE commands with malicious GPU commands.
> - GPU executes malicious GPU commands.
> 
> This brings up several questions:
> 
> 1. Do we care about this?
> 2. If we do care, should we insist that a command buffer is not mapped
>    in userspace when it is submitted, and prevent an in-use command
>    buffer being mapped?
> 3. If we don't care, what's the point of validating the supplied command
>    buffer?
> 
> (2) would be quite an API change over what we have today, and introduce
> an amount of overhead, though something which could be handled in the
> userspace library (eg, if we're modelling on etnaviv's five command
> buffer model, we could copy the command buffer immediately before
> submission.)
> 
> Given this, I think (3) has some value irrespective of the outcome of
> (1) as it gives us a way to catch silly errors from userspace before
> they hit the GPU and become a problem.
> 
I think we should care.
I fail to see how this would have to be an API change. Why can't we just
hand out buffers to userspace like we do now and copy their contents
into an internal buffer as we validate and apply relocs?
This model may be beneficial even without the security benefits, as we
could hand out cached buffers to userspace, so we can read them more
efficiently for validation and stuff things into an internal
write-combined buffer.

Regards,
Lucas
-- 
Pengutronix e.K.             | Lucas Stach                 |
Industrial Linux Solutions   | http://www.pengutronix.de/  |