[Lima] [PATCH v2 6/8] gallium: add lima driver

Thu Mar 28 03:29:10 UTC 2019

V3 driver could be reviewed at:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/465

Regards,
Qiang

On Tue, Mar 26, 2019 at 2:38 PM Qiang Yu <yuq825 at gmail.com> wrote:
>
> On Tue, Mar 26, 2019 at 11:33 AM Alyssa Rosenzweig <alyssa at rosenzweig.io> wrote:
> >
> > > +   [PPIR_INSTR_SLOT_ALU_VEC_MUL] = ppir_codegen_encode_vec_mul,
> > > +   [PPIR_INSTR_SLOT_ALU_SCL_MUL] = ppir_codegen_encode_scl_mul,
> > > +   [PPIR_INSTR_SLOT_ALU_VEC_ADD] = ppir_codegen_encode_vec_add,
> > > +   [PPIR_INSTR_SLOT_ALU_SCL_ADD] = ppir_codegen_encode_scl_add,
> >
> > vmul/smul/vadd/sadd are the (official?) names for this on Midgard, fwiw.
> >
> > > +static const int ppir_codegen_field_size[] = {
> > > +   34, 62, 41, 43, 30, 44, 31, 30, 41, 73
> > > +};
> >
> > What is this?
> PP instruction field (i.e. vadd part length) length in bit.
>
> >
> > > +static inline int align_to_word(int size)
> > > +{
> > > +   return ((size + 0x1f) >> 5);
> > > +}
> >
> > Maybe use the align() macro (it's in mesa/src/util/) for clarity?
> >
> > > +   ppir_codegen_combine_scalar_op_atan  = 8, /* Arc Tangent Part 1 */
> > > +   ppir_codegen_combine_scalar_op_atan2 = 9, /* Arc Tangent 2 Part 1 */
> >
> > Any plan to use these ops? Midgard has them too, but inverse trig gets
> > lowered away anyway...
> >
> Not yet.
>
> > > +            if (node->op == ppir_op_neg)
> > > +               src->negate = !src->negate;
> > > +            else if (node->op == ppir_op_abs)
> > > +               src->absolute = true;
> >
> > I'm confused. NIR has neg/abs modifiers natively. You can get all
> > fneg/fabs instructions lowered in NIR to these modifiers via
> > the lower_source_mods pass. If you don't want them, you'll never have
> > neg/abs ops at all; they'll all be modifiers. Why duplicate this in
> > ppir?
>
> Didn't know this pass, I can use it, thanks.
>
> >
> > > +   uint32_t va;
> >
> > I don't know the Utgard MMU, but will this work on 64-bit?
> >
> It's fine as Utgard MMU only support 32bit address.
>
> >
> > > +   uint32_t dubya;
> >
> > ???
> Don't know for me either, just copy from lima-ng.
>
> >
> > > +   uint32_t mrt_bits;
> > > +   uint32_t mrt_pitch;
> >
> > Utgard supports MRT?!
> >
> Utgard can render to 3 FB (WB0~2) in one pass.
>
> >
> > > +   uint32_t width = fui(fb->width);
> > > +   uint32_t height = fui(fb->height);
> > > +   uint32_t reload_gl_pos[] = {
> > > +      width,      0x00000000, 0x00000000, 0x3f800000, /* 0x00000000 */
> > > +      0x00000000, 0x00000000, 0x00000000, 0x3f800000, /* 0x00000010 */
> > > +      0x00000000, height,     0x00000000, 0x3f800000, /* 0x00000020 */
> > > +   };
> > > +   memcpy(cpu + lima_reload_gl_pos_offset, reload_gl_pos,
> > > +          sizeof(reload_gl_pos));
> > > +
> > > +   uint32_t reload_varying[] = {
> > > +      width,      0x00000000, 0x00000000, 0x00000000, /* 0x00000000 */
> > > +      0x00000000, height,     0x00000000, 0x00000000, /* 0x00000010 */
> > > +   };
> > > +   memcpy(cpu + lima_reload_varying_offset, reload_varying,
> > > +          sizeof(reload_varying));
> > > +
> >
> > Why not actual float arrays?
> >
> Right.
>
> >
> > > +   uint32_t clear_shader[] = {
> > > +      0x00021025, 0x0000000c,
> > > +      (ctx->clear.color_16pc << 12) | 0x000007cf,
> > > +      ctx->clear.color_16pc >> 12,
> > > +      ctx->clear.color_16pc >> 44,
> > > +   };
> >
> > Please don't include shader blobs. This should maybe be generated
> > from NIR, but at minimum, please include the corresponding annotated
> > assembly.
> >
> I think no need to compile this kind of util shader, will add some
> comments about it.
>
> > > +      clear->color_16pc =
> > > +         ((uint64_t)float_to_ushort(color->f[3]) << 48) |
> > > +         ((uint64_t)float_to_ushort(color->f[2]) << 32) |
> > > +         ((uint64_t)float_to_ushort(color->f[1]) << 16) |
> > > +         float_to_ubyte(color->f[0]);
> >
> > Should that last line be float_to_ushort?
> >
> Right, will fix it.
>
> >
> > > +enum lima_attrib_type {
> > > +   LIMA_ATTRIB_FLOAT = 0x000,
> > > +   /* todo: find out what lives here. */
> >
> > LIMA_ATTRIB_FP16 maybe.
> > LIMA_ATTRIB_I32/LIMA_ATTRIB_U32?.
> >
> >
> > > +static bool
> > > +is_modifier_ok(const uint64_t *modifiers, int count, uint64_t mod)
> >
> > drm_find_modifier in util/u_drm.h
> >
> > > +   /* fs program for clear buffer? */
> > > +   static const uint32_t pp_clear_program[] = {
> > > +      0x00020425, 0x0000000c, 0x01e007cf, 0xb0000000, /* 0x00000000 */
> > > +      0x000005f5, 0x00000000, 0x00000000, 0x00000000, /* 0x00000010 */
> > > +   };
> > > +   memcpy(lima_bo_map(screen->pp_buffer) + pp_clear_program_offset,
> > > +          pp_clear_program, sizeof(pp_clear_program));
> > > +
> > > +   /* copy texture to framebuffer, used to reload gpu tile buffer */
> > > +   static const uint32_t pp_reload_program[] = {
> > > +      0x000005e6, 0xf1003c20, 0x00000000, 0x39001000, /* 0x00000000 */
> > > +      0x00000e4e, 0x000007cf, 0x00000000, 0x00000000, /* 0x00000010 */
> > > +   };
> > > +   memcpy(lima_bo_map(screen->pp_buffer) + pp_reload_program_offset,
> > > +          pp_reload_program, sizeof(pp_reload_program));
> > > +
> > > +   /* 0/1/2 vertex index for reload/clear draw */
> > > +   static const uint32_t pp_shared_index[] = {
> > > +      0x00020100, 0x00000000, 0x00000000, 0x00000000, /* 0x00000000 */
> > > +   };
> > > +   memcpy(lima_bo_map(screen->pp_buffer) + pp_shared_index_offset,
> > > +          pp_shared_index, sizeof(pp_shared_index));
> > > +
> > > +   /* 4096x4096 gl pos used for partial clear */
> > > +   static const uint32_t pp_clear_gl_pos[] = {
> > > +      0x45800000, 0x00000000, 0x3f800000, 0x3f800000, /* 0x00000000 */
> > > +      0x00000000, 0x00000000, 0x3f800000, 0x3f800000, /* 0x00000010 */
> > > +      0x00000000, 0x45800000, 0x3f800000, 0x3f800000, /* 0x00000020 */
> > > +   };
> > > +   memcpy(lima_bo_map(screen->pp_buffer) + pp_clear_gl_pos_offset,
> > > +          pp_clear_gl_pos, sizeof(pp_clear_gl_pos));
> >
> > Again, please no shader blobs.
> >
> > > +
> > > +   /* reverse calculate the parameter of glViewport */
> > > +   ctx->viewport.x = viewport->translate[0] - viewport->scale[0];
> > > +   ctx->viewport.y = fabsf(viewport->translate[1] - fabsf(viewport->scale[1]));
> > > +   ctx->viewport.width = viewport->scale[0] * 2;
> > > +   ctx->viewport.height = fabsf(viewport->scale[1] * 2);
> >
> > Careful about flipped-y viewports...?
> >
> Right.
>
> > > +++ b/src/gallium/drivers/lima/lima_tiling.c
> >
> > We'll maybe want to share this routine somewhere... Also, what's your
> > plan for autogenerating mipmaps?
>
> Arno did the mipmap implementation, so far no further work from him.