[PATCH RFC 003/111] staging: etnaviv: add drm driver

Russell King - ARM Linux linux at arm.linux.org.uk
Tue Apr 7 03:46:02 PDT 2015


On Tue, Apr 07, 2015 at 11:20:10AM +0200, Lucas Stach wrote:
> Am Dienstag, den 07.04.2015, 11:04 +0200 schrieb Christian Gmeiner:
> > What role does GPU power management play here? For the context switching
> > it could make sense. But for the 2D core the context is so small that it
> > does not hurt to send it with every command stream. For the 3D core it is
> > much bigger, but this could be done completely in the kernel. Or am I
> > wrong here?
>
> If you power down the GPU you lose the context. You are right that we
> could save/restore the context from kernel space, but that is really
> taking a toll on CPU time. It is much better to have userspace provide a
> context buffer to get the GPU in the expected state, as you then only
> need to splice this into the execution stream to restore the context
> instead of pushing it with the CPU. Reading back the context on every
> switch will kill any performance.

For both Vivante and Etnaviv, it's already accepted that 2D cores need
the full context loaded for each operation, and the few userspace bits
we have comply with that today.

With Etnaviv DRM, we already must ensure that the command buffer
submitted to the GPU contains all references to buffer objects to be
operated on by that command block - or to put it another way, we
need to ensure that each GPU operation is complete inside the
submitted command buffer.

The 2D core is rather messy as far as which bits of state need to be
preserved, especially when you consider that you have the 2D drawing
and blit ops, as well as a video rasteriser which shares some state
registers for the destination, but uses different state registers for
the source.  It quickly becomes difficult to keep track of the GPU
state.

In any case, the amount of state which needs to be loaded for 2D
operations is small, so I think it really makes sense to require
userspace to only submit complete, fully described 2D operations
within a single command buffer.

> > I tend to get things up and running and do the optimization step if it
> > is really worth it. Also I like stuff to be stupid simple. There is
> > another interesting fact: flushing the iommuv2 is done via the command
> > stream and we need to reserve more space for the tail of the used bo.
> > So if we reserve some space in the command buffer, we have different
> > space limits for the tail depending on the hardware in use.

I would much rather we only appended LINK commands to the submitted
command BO, and added whatever GPU management commands are needed to
either the ring buffer (which is easy) or a separate GPU management
command buffer.  Given that I'm already doing this to flush the V1 MMU
in the kernel ring buffer, this is the option I prefer.
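For illustration only, here is a minimal sketch of appending such a LINK
at the tail of a command BO, assuming the usual two-word front-end
encoding (a header word carrying the LINK opcode plus a prefetch count in
64-bit units, then the target GPU address).  The helper name and the
opcode value are assumptions, not taken from the posted patch:

#include <linux/kernel.h>
#include <linux/types.h>
#include <asm/barrier.h>

#define EXAMPLE_FE_LINK		0x40000000	/* assumed LINK opcode bits */

/* Append a LINK so the front end continues in a kernel-owned buffer
 * (eg the ring buffer).  Illustrative sketch only.
 */
static void example_append_link(u32 *cmd_vaddr, unsigned int tail_dword,
				u32 target_addr, unsigned int target_dwords)
{
	/* prefetch is measured in 64-bit units on this assumption */
	u32 prefetch = DIV_ROUND_UP(target_dwords, 2);

	cmd_vaddr[tail_dword + 0] = EXAMPLE_FE_LINK | prefetch;
	cmd_vaddr[tail_dword + 1] = target_addr;

	/* make sure both words are visible before the GPU fetches them */
	wmb();
}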

> You may be aware that once this is upstream there is no easy way to
> change the userspace interface anymore. So whatever is left out now is
> likely to be very hard to reintroduce later.

Indeed - we need to agree what the submitted command buffer will contain,
and how much space to enforce at the end of the command buffer.

The minimum space is one LINK command, which is 8 bytes.  As long as we
can add a LINK command, we can redirect the GPU's execution elsewhere to
do whatever other operations we want to do.

I think the only danger there is if Vivante produce a GPU with 64-bit
addressing for the command stream - if they do, commands like LINK will
most likely change format, and possibly be a different size.

The simple solution to this would be to introduce into the API a
property (as is done for the feature bits) which tells userspace the
minimum number of bytes which must be reserved at the end of the command
buffer.  If we need to change that in the future, we have the flexibility
to do so.
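As a sketch of what that could look like from the userspace side -
patterned on the existing get-param style feature-bit query, but with the
parameter id, struct layout and command index all made up for
illustration rather than taken from the posted uapi:

#include <stdint.h>
#include <xf86drm.h>

#define EXAMPLE_PARAM_CMDBUF_TAIL_RESERVE	0x20	/* assumed id */
#define EXAMPLE_CMD_GET_PARAM			0x00	/* assumed command index */

struct example_etnaviv_param {
	uint32_t pipe;		/* which GPU core is being asked */
	uint32_t param;		/* EXAMPLE_PARAM_x */
	uint64_t value;		/* out: minimum tail bytes to reserve */
};

static uint32_t query_tail_reserve(int fd, uint32_t pipe)
{
	struct example_etnaviv_param req = {
		.pipe  = pipe,
		.param = EXAMPLE_PARAM_CMDBUF_TAIL_RESERVE,
	};

	/* Fall back to today's fixed 8-byte LINK reservation if the
	 * kernel does not know the parameter.
	 */
	if (drmCommandWriteRead(fd, EXAMPLE_CMD_GET_PARAM, &req, sizeof(req)))
		return 8;

	return (uint32_t)req.value;
}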

> What' the problem with having a command buffer in the kernel to flush
> the MMUv2? Why do you need to insert those commands into the userspace
> command stream?

That's certainly where I would put it.

> > >> > +#define ETNA_SUBMIT_BO_READ             0x0001
> > >> > +#define ETNA_SUBMIT_BO_WRITE            0x0002
> > >> > +struct drm_etnaviv_gem_submit_bo {
> > >> > +       uint32_t flags;          /* in, mask of ETNA_SUBMIT_BO_x */
> > >> > +       uint32_t handle;         /* in, GEM handle */
> > >> > +       uint64_t presumed;       /* in/out, presumed buffer address */
> > >>
> > >> presumed support should never hit etnaviv driver.
> > >>
> > > As stated in the cover letter I think presumed support will become
> > > possible with MMUv2 and may provide a good optimization there. So I
> > > would rather leave this in here and just ignore it for now.

Could we rename this member to 'reserved' if it's something that we think
we're going to implement in the near future - but also please add a
flag which indicates whether the presumed address is present or not.
Zero _can_ be a valid address too!
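Something along these lines - the struct and the first two flags are from
the quoted patch, the third flag is the suggested addition and its value
is only illustrative:

#include <stdint.h>

#define ETNA_SUBMIT_BO_READ             0x0001
#define ETNA_SUBMIT_BO_WRITE            0x0002
/* suggested addition: set only when 'presumed' carries a valid address,
 * so a presumed address of zero is not ambiguous */
#define ETNA_SUBMIT_BO_PRESUMED         0x0004

struct drm_etnaviv_gem_submit_bo {
	uint32_t flags;          /* in, mask of ETNA_SUBMIT_BO_x */
	uint32_t handle;         /* in, GEM handle */
	uint64_t presumed;       /* in/out, only meaningful with _PRESUMED set */
};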

> A presumed address cannot be a physical address, but is an address in
> the VM context of that process. Nouveau uses the same thing on NV50+
> where you have a proper MMU to protect all GPU accesses. I would expect
> the same thing to be true for Vivante MMUv2.

I know that there are very strong opinions about exposing _physical_
addresses to userspace (David, for example, doesn't like it one bit.)
If it's a GPU address, then that's less useful to userspace.

However, that data has to be treated as suspect coming from userspace.

You still need to look up the buffer object in the kernel, so that you
can manage the buffer object's state.  If you have the buffer object's
state, then you most likely have easy access to its GPU mapping, which
means you can retrieve the GPU address, which you can then use to
validate the address passed from userspace... but if you've found the
GPU address via this method, you haven't saved anything.

An alternative approach would be to look up the presumed address in
(eg) an rbtree to locate the buffer object's state, which would save
the lookup by object ID - but does that actually gain us anything, or
does it just add complexity and additional kernel processing?

I'm not sure.
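For concreteness, the straightforward "look up by handle, then check"
path might look roughly like this, using current DRM GEM helper names for
brevity; the gpu-address helper is an assumed driver-specific function,
and the whole thing is a sketch rather than anything in the posted patch:

#include <linux/errno.h>
#include <linux/types.h>
#include <drm/drm_gem.h>
#include <drm/drm_file.h>

/* Assumed driver helper returning the BO's GPU mapping address. */
u64 example_gem_gpu_addr(struct drm_gem_object *obj);

/* The handle lookup is needed anyway to manage the BO's state, so
 * checking the presumed address here costs little - but it also means
 * the presumed address saved no work on this path.
 */
static int example_check_presumed(struct drm_file *file, u32 handle,
				  u64 presumed, bool presumed_valid)
{
	struct drm_gem_object *obj;
	int ret = 0;

	obj = drm_gem_object_lookup(file, handle);	/* needed regardless */
	if (!obj)
		return -ENOENT;

	if (presumed_valid && presumed != example_gem_gpu_addr(obj))
		ret = -EINVAL;	/* or patch the stream with the real address */

	drm_gem_object_put(obj);
	return ret;
}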

> > Keep it stupid simple. In my libdrm repo, which you hopefully know, I
> > have implemented the buffer handling from the original libetnaviv. We
> > allocate 5 command buffers of a defined size and rotate through them.
> > During command buffer building we reserve space in the stream. If
> > there is not enough space we flush the current buffer stream, switch
> > to the next and use it. Then there is a way to explicitly flush a
> > command buffer.
> > 
> > For more details see:
> > https://github.com/laanwj/etna_viv/tree/master/src/etnaviv
> > https://github.com/austriancoder/libdrm
> 
> Same argument as above really. We need at least the context buffer.

An important question is whether the context buffer, built by userspace,
should be submitted as one of these command buffers, or kept separate so
the kernel can keep track of it and decide whether or not to use it
according to the state it's tracking.

Another point to bring up here is about how command buffers are submitted.

Consider this scenario:

- Userspace creates a command buffer, and arranges for the initial
  commands to be time consuming (eg, long WAIT commands.)  It fills the
  rest of the buffer with dummy LOAD STATE commands.
- Userspace submits this, the kernel validates the command buffer, and
  submits it to the GPU.  The GPU starts executing the buffer.
- Userspace, which still has access to the command buffer, overwrites
  the LOAD STATE commands with malicious GPU commands.
- GPU executes malicious GPU commands.

This brings up several questions:

1. Do we care about this?
2. If we do care, should we insist that a command buffer is not mapped
   in userspace when it is submitted, and prevent an in-use command
   buffer being mapped?
3. If we don't care, what's the point of validating the supplied command
   buffer?

(2) would be quite an API change over what we have today, and would
introduce some overhead, though that is something which could be handled
in the userspace library (eg, if we're modelling on etnaviv's five
command buffer scheme, we could copy the command buffer immediately
before submission, as sketched below.)
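If we went the library route, the copy could be as dumb as this; a sketch
only, with all the buffer-management names made up:

#include <stdint.h>
#include <string.h>

/* The application builds commands in a private buffer; the library
 * copies them into a submit-only BO mapping immediately before the
 * submit ioctl, so later application writes cannot change what the
 * kernel validated.
 */
struct example_stream {
	uint32_t *build_vaddr;	/* buffer the application writes commands into */
	uint32_t  size;		/* bytes of valid commands */
};

static int example_flush(struct example_stream *stream,
			 uint32_t *submit_vaddr, uint32_t submit_bytes)
{
	if (stream->size > submit_bytes)
		return -1;

	/* snapshot the stream; only submit_vaddr is handed to the kernel */
	memcpy(submit_vaddr, stream->build_vaddr, stream->size);

	/* ...followed by the submit ioctl against the BO backing
	 * submit_vaddr, which userspace does not write to while the GPU
	 * may still read it.
	 */
	return 0;
}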

Given this, I think (3) has some value irrespective of the outcome of
(1) as it gives us a way to catch silly errors from userspace before
they hit the GPU and become a problem.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

