[PATCH RFC 003/111] staging: etnaviv: add drm driver
Lucas Stach
l.stach at pengutronix.de
Tue Apr 7 03:27:31 PDT 2015
Am Dienstag, den 07.04.2015, 11:58 +0200 schrieb Christian Gmeiner:
> Hi Lucas.
>
> 2015-04-07 11:47 GMT+02:00 Lucas Stach <l.stach at pengutronix.de>:
> > Am Dienstag, den 07.04.2015, 11:40 +0200 schrieb Christian Gmeiner:
> >> 2015-04-07 11:20 GMT+02:00 Lucas Stach <l.stach at pengutronix.de>:
> >> > Am Dienstag, den 07.04.2015, 11:04 +0200 schrieb Christian Gmeiner:
> >> >> Hi Lucas
> >> >>
> >> >> 2015-04-07 9:35 GMT+02:00 Lucas Stach <l.stach at pengutronix.de>:
> >> >> > Am Sonntag, den 05.04.2015, 21:26 +0200 schrieb Christian Gmeiner:
> >> >> >> 2015-04-02 17:29 GMT+02:00 Lucas Stach <l.stach at pengutronix.de>:
> >> >> >> > From: Christian Gmeiner <christian.gmeiner at gmail.com>
> >> >> >> >
> >> >> >> > This is a consolidation by Russell King of Christian's drm work.
> >> >> >> >
> >> >> >> > Signed-off-by: Christian Gmeiner <christian.gmeiner at gmail.com>
> >> >> >> > Signed-off-by: Russell King <rmk+kernel at arm.linux.org.uk>
> >> >> >
> >> >> >> > ---
> >> >> > [...]
> >> >> >
> >> >> >> > +#endif /* STATE_HI_XML */
> >> >> >> > diff --git a/include/uapi/drm/etnaviv_drm.h b/include/uapi/drm/etnaviv_drm.h
> >> >> >> > new file mode 100644
> >> >> >> > index 000000000000..f7b5ac6f3842
> >> >> >> > --- /dev/null
> >> >> >> > +++ b/include/uapi/drm/etnaviv_drm.h
> >> >> >> > @@ -0,0 +1,225 @@
> >> >> >> > +/*
> >> >> >> > + * Copyright (C) 2013 Red Hat
> >> >> >> > + * Author: Rob Clark <robdclark at gmail.com>
> >> >> >> > + *
> >> >> >> > + * This program is free software; you can redistribute it and/or modify it
> >> >> >> > + * under the terms of the GNU General Public License version 2 as published by
> >> >> >> > + * the Free Software Foundation.
> >> >> >> > + *
> >> >> >> > + * This program is distributed in the hope that it will be useful, but WITHOUT
> >> >> >> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> >> >> >> > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> >> >> >> > + * more details.
> >> >> >> > + *
> >> >> >> > + * You should have received a copy of the GNU General Public License along with
> >> >> >> > + * this program. If not, see <http://www.gnu.org/licenses/>.
> >> >> >> > + */
> >> >> >> > +
> >> >> >> > +#ifndef __ETNAVIV_DRM_H__
> >> >> >> > +#define __ETNAVIV_DRM_H__
> >> >> >> > +
> >> >> >> > +#include <stddef.h>
> >> >> >> > +#include <drm/drm.h>
> >> >> >> > +
> >> >> >> > +/* Please note that modifications to all structs defined here are
> >> >> >> > + * subject to backwards-compatibility constraints:
> >> >> >> > + * 1) Do not use pointers, use uint64_t instead for 32 bit / 64 bit
> >> >> >> > + * user/kernel compatibility
> >> >> >> > + * 2) Keep fields aligned to their size
> >> >> >> > + * 3) Because of how drm_ioctl() works, we can add new fields at
> >> >> >> > + * the end of an ioctl if some care is taken: drm_ioctl() will
> >> >> >> > + * zero out the new fields at the tail of the ioctl, so a zero
> >> >> >> > + * value should have a backwards compatible meaning. And for
> >> >> >> > + * output params, userspace won't see the newly added output
> >> >> >> > + * fields.. so that has to be somehow ok.
> >> >> >> > + */
> >> >> >> > +
> >> >> >> > +#define ETNA_PIPE_3D 0x00
> >> >> >> > +#define ETNA_PIPE_2D 0x01
> >> >> >> > +#define ETNA_PIPE_VG 0x02
> >> >> >> > +
> >> >> >> > +#define ETNA_MAX_PIPES 3
> >> >> >> > +
> >> >> >> > +/* timeouts are specified in clock-monotonic absolute times (to simplify
> >> >> >> > + * restarting interrupted ioctls). The following struct is logically the
> >> >> >> > + * same as 'struct timespec' but 32/64b ABI safe.
> >> >> >> > + */
> >> >> >> > +struct drm_etnaviv_timespec {
> >> >> >> > + int64_t tv_sec; /* seconds */
> >> >> >> > + int64_t tv_nsec; /* nanoseconds */
> >> >> >> > +};
> >> >> >> > +
> >> >> >> > +#define ETNAVIV_PARAM_GPU_MODEL 0x01
> >> >> >> > +#define ETNAVIV_PARAM_GPU_REVISION 0x02
> >> >> >> > +#define ETNAVIV_PARAM_GPU_FEATURES_0 0x03
> >> >> >> > +#define ETNAVIV_PARAM_GPU_FEATURES_1 0x04
> >> >> >> > +#define ETNAVIV_PARAM_GPU_FEATURES_2 0x05
> >> >> >> > +#define ETNAVIV_PARAM_GPU_FEATURES_3 0x06
> >> >> >> > +#define ETNAVIV_PARAM_GPU_FEATURES_4 0x07
> >> >> >> > +
> >> >> >> > +#define ETNAVIV_PARAM_GPU_STREAM_COUNT 0x10
> >> >> >> > +#define ETNAVIV_PARAM_GPU_REGISTER_MAX 0x11
> >> >> >> > +#define ETNAVIV_PARAM_GPU_THREAD_COUNT 0x12
> >> >> >> > +#define ETNAVIV_PARAM_GPU_VERTEX_CACHE_SIZE 0x13
> >> >> >> > +#define ETNAVIV_PARAM_GPU_SHADER_CORE_COUNT 0x14
> >> >> >> > +#define ETNAVIV_PARAM_GPU_PIXEL_PIPES 0x15
> >> >> >> > +#define ETNAVIV_PARAM_GPU_VERTEX_OUTPUT_BUFFER_SIZE 0x16
> >> >> >> > +#define ETNAVIV_PARAM_GPU_BUFFER_SIZE 0x17
> >> >> >> > +#define ETNAVIV_PARAM_GPU_INSTRUCTION_COUNT 0x18
> >> >> >> > +#define ETNAVIV_PARAM_GPU_NUM_CONSTANTS 0x19
> >> >> >> > +
> >> >> >> > +//#define MSM_PARAM_GMEM_SIZE 0x02
> >> >> >> > +
> >> >> >> > +struct drm_etnaviv_param {
> >> >> >> > + uint32_t pipe; /* in, ETNA_PIPE_x */
> >> >> >> > + uint32_t param; /* in, ETNAVIV_PARAM_x */
> >> >> >> > + uint64_t value; /* out (get_param) or in (set_param) */
> >> >> >> > +};
> >> >> >> > +
> >> >> >> > +/*
> >> >> >> > + * GEM buffers:
> >> >> >> > + */
> >> >> >> > +
> >> >> >> > +#define ETNA_BO_CMDSTREAM 0x00000001
> >> >> >> > +#define ETNA_BO_CACHE_MASK 0x000f0000
> >> >> >> > +/* cache modes */
> >> >> >> > +#define ETNA_BO_CACHED 0x00010000
> >> >> >> > +#define ETNA_BO_WC 0x00020000
> >> >> >> > +#define ETNA_BO_UNCACHED 0x00040000
> >> >> >> > +
> >> >> >> > +struct drm_etnaviv_gem_new {
> >> >> >> > + uint64_t size; /* in */
> >> >> >> > + uint32_t flags; /* in, mask of ETNA_BO_x */
> >> >> >> > + uint32_t handle; /* out */
> >> >> >> > +};
> >> >> >> > +
> >> >> >> > +struct drm_etnaviv_gem_info {
> >> >> >> > + uint32_t handle; /* in */
> >> >> >> > + uint32_t pad;
> >> >> >> > + uint64_t offset; /* out, offset to pass to mmap() */
> >> >> >> > +};
> >> >> >> > +
> >> >> >> > +#define ETNA_PREP_READ 0x01
> >> >> >> > +#define ETNA_PREP_WRITE 0x02
> >> >> >> > +#define ETNA_PREP_NOSYNC 0x04
> >> >> >> > +
> >> >> >> > +struct drm_etnaviv_gem_cpu_prep {
> >> >> >> > + uint32_t handle; /* in */
> >> >> >> > + uint32_t op; /* in, mask of ETNA_PREP_x */
> >> >> >> > + struct drm_etnaviv_timespec timeout; /* in */
> >> >> >> > +};
> >> >> >> > +
> >> >> >> > +struct drm_etnaviv_gem_cpu_fini {
> >> >> >> > + uint32_t handle; /* in */
> >> >> >> > +};
> >> >> >> > +
> >> >> >> > +/*
> >> >> >> > + * Cmdstream Submission:
> >> >> >> > + */
> >> >> >> > +
> >> >> >> > +/* The value written into the cmdstream is logically:
> >> >> >> > + *
> >> >> >> > + * ((relocbuf->gpuaddr + reloc_offset) << shift) | or
> >> >> >> > + *
> >> >> >> > + * When we have GPU's w/ >32bit ptrs, it should be possible to deal
> >> >> >> > + * with this by emit'ing two reloc entries with appropriate shift
> >> >> >> > + * values. Or a new ETNA_SUBMIT_CMD_x type would also be an option.
> >> >> >> > + *
> >> >> >> > + * NOTE that reloc's must be sorted by order of increasing submit_offset,
> >> >> >> > + * otherwise EINVAL.
> >> >> >> > + */
> >> >> >> > +struct drm_etnaviv_gem_submit_reloc {
> >> >> >> > + uint32_t submit_offset; /* in, offset from submit_bo */
> >> >> >> > + uint32_t or; /* in, value OR'd with result */
> >> >> >> > + int32_t shift; /* in, amount of left shift (can be negative) */
> >> >> >> > + uint32_t reloc_idx; /* in, index of reloc_bo buffer */
> >> >> >> > + uint64_t reloc_offset; /* in, offset from start of reloc_bo */
> >> >> >> > +};
> >> >> >> > +
> >> >> >> > +/* submit-types:
> >> >> >> > + * BUF - this cmd buffer is executed normally.
> >> >> >> > + * IB_TARGET_BUF - this cmd buffer is an IB target. Reloc's are
> >> >> >> > + * processed normally, but the kernel does not setup an IB to
> >> >> >> > + * this buffer in the first-level ringbuffer
> >> >> >> > + * CTX_RESTORE_BUF - only executed if there has been a GPU context
> >> >> >> > + * switch since the last SUBMIT ioctl
> >> >> >> > + */
> >> >> >> > +#define ETNA_SUBMIT_CMD_BUF 0x0001
> >> >> >> > +#define ETNA_SUBMIT_CMD_IB_TARGET_BUF 0x0002
> >> >> >> > +#define ETNA_SUBMIT_CMD_CTX_RESTORE_BUF 0x0003 is a
> >> >> >> > +struct drm_etnaviv_gem_submit_cmd {
> >> >> >> > + uint32_t type; /* in, one of ETNA_SUBMIT_CMD_x */
> >> >> >>
> >> >> >> Do we need different types? I did not use this in my kernel tree.
> >> >> >>
> >> >> >
> >> >> > Please also not the commit cleaning this API, which changes this a bit.
> >> >> >
> >> >>
> >> >> Ah yes.. I see it.
> >> >>
> >> >> > But yes we need different types. At least the context restore buffer
> >> >> > type is needed to properly implement GPU power management and context
> >> >> > switching.
> >> >> >
> >> >>
> >> >> What role does GPU power management plays here? For the context switching
> >> >> it could make sense. But for the 2d core the context is so small that
> >> >> it does not
> >> >> hurt to send it with every command stream. For the 3d core it is much
> >> >> bigger, but
> >> >> this could be done completely in the kernel. Or I am wrong here?
> >> >>
> >> > If you power down the GPU you loose the context. You are right that we
> >> > could save/restore the context from kernel space, but that is really
> >> > taking a toll on CPU time. It is much better to have userspace provide a
> >> > context buffer to get the GPU in the expected state, as you then only
> >> > need to splice this into the execution stream to restore the context
> >> > instead of pushing it with the CPU. Reading back the context on every
> >> > switch will kill any performance.
> >>
> >> And for this you need a own command type? The context is nothing special. Only
> >> load state commands in the command buffer. You can have an internal
> >> representation
> >> of the context in the user space (as libetnaviv does it right now )and
> >> work with it.
> >> Then if you want to submit some render calls etc you can look if the
> >> state is dirty and submit
> >> the whole or the changes values. So I am not sure if there is a need
> >> for a context
> >> buffer type as it is nothing special.
> >
> > How would the userspace know if another process / a GPU hang / a power
> > management event has dirtied the state of the GPU? It is only the kernel
> > who knows if the state of the GPU has changed since the last submit of
> > this process.
> >
>
> Okay got it.
>
> > In the common case when nothing has disturbed the context we don't want
> > to insert the context buffer, as we really want minimal state updates in
> > that case. We need a way to tell the kernel which command buffer is the
> > context buffer, so the kernel only splices this buffer into the stream
> > if the context is dirty.
>
> So the context buffer holds the full GPU context and the kernel does the partial
> update of the current hardware context. This makes the user space a lot simpler
> as we can send the whole context and do not need take care of partial updates.
>
> I like the idea.
>
I still think your understanding is not completely what I wanted to say
with that.
You are right that I want to kick out all this state tracking on
individual registers. But I don't want to move it into the kernel, but
scrap it altogether.
Let me try to explain this in a bit more detail:
First let's postulate that we already have pretty good dirty state
tracking on the gallium state object level. I don't think it buys us
anything to do more fine grained state tracking.
The gallium userspace driver only pushes state _changes_ to a normal
command buffer. For example this means that if nothing has changed since
the last submit of this process except some vertex buffers the only
thing that will be contained in the command buffer are a few SET_STATEs
for the vertex buffer addresses and the DRAW call.
The context buffer in contrast holds the full GPU context as of the last
flush. So if you flush the stream MESA dumps the full Gallium state into
the context buffer.
Now if you submit both buffers together the kernel can check if your
context is still valid (nothing other has touched the GPU since your
last submit) and in that case only splice the normal command bo into the
stream. If something has changed since your last submit the kernel will
splice the context buffer first, then the command buffer. This way you
always get a predictable state without tracking any GPU state on a
register level.
Regards,
Lucas
--
Pengutronix e.K. | Lucas Stach |
Industrial Linux Solutions | http://www.pengutronix.de/ |
More information about the dri-devel
mailing list