[Mesa-dev] [PATCH 2/5] vc4: Switch back to using a local copy of vc4_drm.h.

Emil Velikov emil.l.velikov at gmail.com
Thu Jun 29 12:17:16 UTC 2017


Hi Eric,

On 29 June 2017 at 02:15, Eric Anholt <eric at anholt.net> wrote:
> Needing to get our uapi header from libdrm has only complicated things.
> Follow intel's lead and drop our requirement for it.
>
Humble question: when you get the chance please describe the
complications that you've seen.

That aside, there's a couple of fixes which you'd want to address.

> --- /dev/null
> +++ b/include/drm-uapi/vc4_drm.h

> +struct drm_vc4_submit_rcl_surface {
> +       __u32 hindex; /* Handle index, or ~0 if not present. */
> +       __u32 offset; /* Offset to start of buffer. */
> +       /*
> +        * Bits for either render config (color_write) or load/store packet.
> +        * Bits should all be 0 for MSAA load/stores.
> +        */
> +       __u16 bits;
> +
> +#define VC4_SUBMIT_RCL_SURFACE_READ_IS_FULL_RES                (1 << 0)
> +       __u16 flags;
> +};
UAPI: I think I may have noticed some compat breakage here.

The sizeof above struct will differ across 32/64bit builds and since
it's embedded below I doubt one can change it now.

> +
> +/**
> + * struct drm_vc4_submit_cl - ioctl argument for submitting commands to the 3D
> + * engine.
> + *
> + * Drivers typically use GPU BOs to store batchbuffers / command lists and
> + * their associated state.  However, because the VC4 lacks an MMU, we have to
> + * do validation of memory accesses by the GPU commands.  If we were to store
> + * our commands in BOs, we'd need to do uncached readback from them to do the
> + * validation process, which is too expensive.  Instead, userspace accumulates
> + * commands and associated state in plain memory, then the kernel copies the
> + * data to its own address space, and then validates and stores it in a GPU
> + * BO.
> + */
> +struct drm_vc4_submit_cl {
> +       /* Pointer to the binner command list.
> +        *
> +        * This is the first set of commands executed, which runs the
> +        * coordinate shader to determine where primitives land on the screen,
> +        * then writes out the state updates and draw calls necessary per tile
> +        * to the tile allocation BO.
> +        */
> +       __u64 bin_cl;
> +
> +       /* Pointer to the shader records.
> +        *
> +        * Shader records are the structures read by the hardware that contain
> +        * pointers to uniforms, shaders, and vertex attributes.  The
> +        * reference to the shader record has enough information to determine
> +        * how many pointers are necessary (fixed number for shaders/uniforms,
> +        * and an attribute count), so those BO indices into bo_handles are
> +        * just stored as __u32s before each shader record passed in.
> +        */
> +       __u64 shader_rec;
> +
> +       /* Pointer to uniform data and texture handles for the textures
> +        * referenced by the shader.
> +        *
> +        * For each shader state record, there is a set of uniform data in the
> +        * order referenced by the record (FS, VS, then CS).  Each set of
> +        * uniform data has a __u32 index into bo_handles per texture
> +        * sample operation, in the order the QPU_W_TMUn_S writes appear in
> +        * the program.  Following the texture BO handle indices is the actual
> +        * uniform data.
> +        *
> +        * The individual uniform state blocks don't have sizes passed in,
> +        * because the kernel has to determine the sizes anyway during shader
> +        * code validation.
> +        */
> +       __u64 uniforms;
> +       __u64 bo_handles;
> +
> +       /* Size in bytes of the binner command list. */
> +       __u32 bin_cl_size;
> +       /* Size in bytes of the set of shader records. */
> +       __u32 shader_rec_size;
> +       /* Number of shader records.
> +        *
> +        * This could just be computed from the contents of shader_records and
> +        * the address bits of references to them from the bin CL, but it
> +        * keeps the kernel from having to resize some allocations it makes.
> +        */
> +       __u32 shader_rec_count;
> +       /* Size in bytes of the uniform state. */
> +       __u32 uniforms_size;
> +
> +       /* Number of BO handles passed in (size is that times 4). */
> +       __u32 bo_handle_count;
> +
> +       /* RCL setup: */
> +       __u16 width;
> +       __u16 height;
> +       __u8 min_x_tile;
> +       __u8 min_y_tile;
> +       __u8 max_x_tile;
> +       __u8 max_y_tile;
> +       struct drm_vc4_submit_rcl_surface color_read;
> +       struct drm_vc4_submit_rcl_surface color_write;
> +       struct drm_vc4_submit_rcl_surface zs_read;
> +       struct drm_vc4_submit_rcl_surface zs_write;
> +       struct drm_vc4_submit_rcl_surface msaa_color_write;
> +       struct drm_vc4_submit_rcl_surface msaa_zs_write;
> +       __u32 clear_color[2];
> +       __u32 clear_z;
> +       __u8 clear_s;
> +
> +       __u32 pad:24;
> +
> +#define VC4_SUBMIT_CL_USE_CLEAR_COLOR                  (1 << 0)
> +       __u32 flags;
> +
> +       /* Returned value of the seqno of this render job (for the
> +        * wait ioctl).
> +        */
> +       __u64 seqno;
> +};
Regardless of the drm_vc4_submit_rcl_surface issue mentioned above, I
think you're _u32 short here.
Most likely you've missed the [2] - I know I did the first time I've read this.

If that's intentional and/or kernel has respective compat ioctl,
please add a small comment. This way the next time/person that skims
through won't nag you about it.

I haven't looked through the rest of the file, but I would encourage
you to take a second look.


> --- a/src/gallium/drivers/vc4/Makefile.sources
> +++ b/src/gallium/drivers/vc4/Makefile.sources
> @@ -14,6 +14,7 @@ C_SOURCES := \
>         vc4_context.c \
>         vc4_context.h \
>         vc4_draw.c \
> +       vc4_drm.h \
File is not in this folder, so should drop this.

-Emil


More information about the mesa-dev mailing list