GC7000L - comparison to GC3000

Mon Oct 16 18:47:10 UTC 2017

L.S.

As you might know I'm currently working on GC7000L ("HALTI5") support in
etnaviv. I have done much of the initial reverse-engineering, so here
are some findings:

\/\/\/\/\/\/\

- A significant new addition is the BLT engine. This is a flexible copy-engine, certainly
  compared to the old RS (which it completely replaces). It can do:

    - Fill/Clear image (glClear), also arbitrary subsets of the image
    - Copy image (texture uploading, framebuffer blit)
    - Copy buffer (glBuffer(Sub)Data)
    - In-place resolve
    - Compute mipmap chains in one go (glGenerateMipmap)
    - YUV tiling, converting various planar YUV formats to tiled YUV for
      texturing (glTexDirectVIV)

  I have low-level tests for the BLT in: 
  https://github.com/etnaviv/etnaviv_gpu_tests/blob/master/src/blttest_gc7000.c
  https://github.com/etnaviv/etnaviv_gpu_tests/blob/master/src/blttest2_gc7000.c
  https://github.com/etnaviv/etnaviv_gpu_tests/blob/master/src/bltdemo_gc7000.c

- Texturing is done through texture descriptors. This means that most of the
  texture parameters are in a structure in memory, and the state in the
  registers only points there. This adds a layer of indirection which I suppose
  is to accommodate Vulkan.

  I've documented the layout of a texture descriptor here:
  https://github.com/etnaviv/etna_viv/blob/master/rnndb/texdesc_3d.xml#L34

- TS (Tile Status) has two "cache modes": 128 bytes per tile, 256 bytes per
  tile. This affects the number of TS bits used for a surface, as well as the number
  of tiles for auto-disable and in-place resolve, but from what I've seen not
  the actual texture tiling format.
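
To make the effect of the cache mode on TS size concrete, here is a rough
sketch. It only assumes that TS tracks one entry per tile of 128 or 256 bytes;
the number of TS bits per tile is passed in as a parameter because I'm not
stating it here, and the function names are made up for illustration.

```c
#include <stddef.h>

/* Number of tiles TS has to track for a surface, given the tile size
 * implied by the cache mode (128 or 256 bytes). Rounds up. */
static size_t ts_tile_count(size_t surface_bytes, size_t tile_bytes)
{
    return (surface_bytes + tile_bytes - 1) / tile_bytes;
}

/* Size of the TS buffer in bytes. bits_per_tile is an ASSUMPTION supplied
 * by the caller, not a value taken from the hardware documentation. */
static size_t ts_size_bytes(size_t surface_bytes, size_t tile_bytes,
                            unsigned bits_per_tile)
{
    size_t bits = ts_tile_count(surface_bytes, tile_bytes) * bits_per_tile;
    return (bits + 7) / 8;
}
```

With the same bits per tile, the 256-byte mode needs half the TS storage of
the 128-byte mode for a given surface.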

- The 32x32 supertiled format appears to be used universally by the blob: for
  textures, for render targets, even for mipmapped textures. This is
  interesting as it means the lower mipmaps will still be rounded up to 32x32 in
  memory, wasting some space. But apparently it's deemed worth it.
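
To get a feel for the overhead, the sketch below counts texels in a full mip
chain with and without rounding each level up to 32x32. It assumes padding is
simply to the next multiple of 32 in both dimensions and ignores any further
alignment the hardware may want; the helper names are made up.

```c
#include <stddef.h>

#define SUPERTILE_DIM 32u

/* Round v up to the next multiple of a. */
static unsigned align_up(unsigned v, unsigned a)
{
    return (v + a - 1) / a * a;
}

/* Total texels in a full mip chain down to 1x1. If pad is non-zero, every
 * level is rounded up to a multiple of 32 in both dimensions.
 * width and height must be at least 1. */
static size_t mip_chain_texels(unsigned width, unsigned height, int pad)
{
    size_t total = 0;

    for (;;) {
        unsigned w = pad ? align_up(width, SUPERTILE_DIM) : width;
        unsigned h = pad ? align_up(height, SUPERTILE_DIM) : height;

        total += (size_t)w * h;
        if (width == 1 && height == 1)
            break;
        width = width > 1 ? width / 2 : 1;
        height = height > 1 ? height / 2 : 1;
    }
    return total;
}
```

For a 64x64 texture this gives 5461 texels unpadded versus 10240 padded, so
the relative overhead is largest for small textures.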

- Shaders use ICACHE only - this means they are always loaded from memory, not
  written in-line to a state range by the driver. With GC3000 this was only
  used for large shaders which didn't fit in available state.
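
In driver terms the upload decision could look roughly like the sketch below.
The struct, its fields and the function name are hypothetical and not taken
from etnaviv; the inline limit is a placeholder for however much shader state
the GPU has.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability record, for illustration only. */
struct gpu_caps {
    bool icache_only;         /* GC7000L: shaders always come from memory */
    size_t max_inline_instrs; /* instructions that fit in the state range */
};

/* True if the shader has to be fetched from memory through ICACHE,
 * false if it can be written in-line to the state range. */
static bool shader_uses_icache(const struct gpu_caps *caps, size_t num_instrs)
{
    if (caps->icache_only)
        return true;
    /* GC3000-style behaviour: ICACHE only when the shader does not fit. */
    return num_instrs > caps->max_inline_instrs;
}
```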

- Vertex attribute setup has changed: the state moved and some new state was
  added. I haven't figured out the new state yet, but I think I have what is
  needed for basic rendering:
  https://github.com/etnaviv/etna_viv/blob/master/rnndb/state.xml#L392
  There is a test replaying the command stream for a cube (interleaved vertex buffer) here:
  https://github.com/etnaviv/etnaviv_gpu_tests/blob/master/src/cube_gc7000.c

- Drawing is always done with DRAW_INSTANCED (same as GC3000). There is also a
  new DRAW_INDIRECT command to handle glDraw*Indirect. See
  https://github.com/etnaviv/etna_viv/blob/master/rnndb/cmdstream.xml#L331

- For the command stream there is a new SNAPPAGES command. I don't know what
  it does, but the kernel driver issues it at flushes. There is also
  WAIT_FENCE, which, as the name implies, blocks the frontend waiting for a
  fence.

- Something strange is going on with pixel shaders. I haven't investigated in
  detail yet, but at least texture instructions are sometimes duplicated:

   void main()
   {
       gl_FragColor = 3.0 * vVaryingColor * texture2D(in_texture, coord);
   }

Gets assembled to:

   0: texld      t3, tex0, ?4?1.xyyy, void, void ; !bit_3_13=1!
   1: texld      t3, tex0, ?4?1.zwww, void, void ; !bit_3_24=1!
   2: mul        t1, u0.xxxx, t2, void   ; tex not used but fields non-zero (id=0,amode=5,swiz=0)
   3: mul        t1, t1, t3, void        ; tex not used but fields non-zero (id=0,amode=5,swiz=0)

Could it be that two invocations of the pixel shader are somehow combined into one shader?
It might be that this is only done for simple pixel shaders.

- There is support for rendering to R8 and R8I (yay?).

\/\/\/\/\/\/\

Let me know if you have any questions!

Now that I've mostly figured out the necessary state, next I'm going to look
at more specific shader ISA differences. I'm also going to work on Mesa
integration (on a separate branch for now).

Regards,
Wladimir