GC7000L - comparison to GC3000
Wladimir J. van der Laan
laanwj at gmail.com
Mon Oct 16 18:47:10 UTC 2017
L.S.
As you might know I'm currently working on GC7000L ("HALTI5") support in
etnaviv. I have done much of the initial reverse-engineering, so here
are some findings:
\/\/\/\/\/\/\
- A significant new addition is the BLT engine. This is a copy engine that is
far more flexible than the old RS (which it completely replaces). It can do:
  - Fill/Clear image (glClear), also arbitrary subsets of the image
  - Copy image (texture uploading, framebuffer blit)
  - Copy buffer (glBuffer(Sub)Data)
  - In-place resolve
  - Compute mipmap chains in one go (glGenerateMipmap)
  - YUV tiling, converting various planar YUV formats to tiled YUV for
    texturing (glTexDirectVIV)
I have low-level tests for the BLT in:
https://github.com/etnaviv/etnaviv_gpu_tests/blob/master/src/blttest_gc7000.c
https://github.com/etnaviv/etnaviv_gpu_tests/blob/master/src/blttest2_gc7000.c
https://github.com/etnaviv/etnaviv_gpu_tests/blob/master/src/bltdemo_gc7000.c
- Texturing is done through texture descriptors. This means that most of the
texture parameters are in a structure in memory, and the state in the
registers only points there. This adds a layer of indirection which I suppose
is to accommodate Vulkan.
I've documented the layout of a texture descriptor here:
https://github.com/etnaviv/etna_viv/blob/master/rnndb/texdesc_3d.xml#L34
- TS (Tile Status) has two "cache modes": 128 bytes per tile and 256 bytes per
tile. This affects the number of TS bits used for a surface, as well as the number
of tiles for auto-disable and in-place resolve, but from what I've seen not
the actual texture tiling format.
- The 32x32 supertiled format appears to be used universally by the blob: for
textures, for render targets, even for mipmapped textures. This is
interesting as it means the lower mipmaps will still be rounded up to 32x32 in
memory, wasting some space. But apparently it's deemed worth it.
- Shaders use ICACHE only - this means they are always loaded from memory, not
written in-line to a state range by the driver. With GC3000 this was only
used for large shaders which didn't fit in available state.
- Vertex attribute setup has changed: the state has moved and some new state
was added. Haven't figured out the new state yet, but I think I have what is
needed for basic rendering:
https://github.com/etnaviv/etna_viv/blob/master/rnndb/state.xml#L392
There is a test replaying the command stream for a cube (interleaved vertex buffer) here:
https://github.com/etnaviv/etnaviv_gpu_tests/blob/master/src/cube_gc7000.c
- Drawing is always done with DRAW_INSTANCED (same as GC3000). There is a new
DRAW_INDIRECT command to handle glDraw*Indirect. See
https://github.com/etnaviv/etna_viv/blob/master/rnndb/cmdstream.xml#L331
- For the command stream there is a new SNAPPAGES command. I don't know what
this does, but the kernel driver emits it at flushes. There is also WAIT_FENCE,
which, as the name implies, blocks the frontend waiting for a fence.
- Something strange seems to be going on with pixel shaders. I haven't
investigated in detail yet, but at least texture instructions are sometimes
duplicated:
void main()
{
    gl_FragColor = 3.0 * vVaryingColor * texture2D(in_texture, coord);
}
Gets assembled to:
0: texld t3, tex0, ?4?1.xyyy, void, void ; !bit_3_13=1!
1: texld t3, tex0, ?4?1.zwww, void, void ; !bit_3_24=1!
2: mul t1, u0.xxxx, t2, void ; tex not used but fields non-zero (id=0,amode=5,swiz=0)
3: mul t1, t1, t3, void ; tex not used but fields non-zero (id=0,amode=5,swiz=0)
Could it be that two invocations of the pixel shader are somehow combined into one shader?
It might be that this is only done for simple pixel shaders.
- There is support for rendering to R8 and R8I (yay?).
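As a rough illustration of the two size-related points above (the TS cache
modes and the 32x32 supertiling of mipmaps), here is a small C sketch of the
arithmetic. The helper names are made up for this example. bits_per_tile is
left as a caller-supplied parameter because it is not a value stated above,
and the sketch assumes that each TS entry covers one tile of 128 or 256
surface bytes and that both dimensions of every mip level are rounded up to a
multiple of 32 texels.

```c
#include <assert.h>
#include <stdint.h>

/* Edge length of a supertile in texels (the 32x32 format from the text). */
#define SUPERTILE_DIM 32u

/* Number of TS tiles covering a surface, for a cache mode of 128 or 256
 * bytes per tile. Hypothetical helper, not a driver function. */
static uint64_t ts_tile_count(uint64_t surface_bytes, uint32_t tile_bytes)
{
    assert(tile_bytes == 128 || tile_bytes == 256);
    return (surface_bytes + tile_bytes - 1) / tile_bytes;
}

/* TS buffer size in bytes. bits_per_tile is an assumption supplied by the
 * caller, not a value read from the hardware. */
static uint64_t ts_size_bytes(uint64_t surface_bytes, uint32_t tile_bytes,
                              uint32_t bits_per_tile)
{
    uint64_t bits = ts_tile_count(surface_bytes, tile_bytes) * bits_per_tile;
    return (bits + 7) / 8;
}

/* Size of one dimension at a given mip level, clamped to 1 texel. */
static uint32_t mip_dim(uint32_t base, uint32_t level)
{
    uint32_t d = level < 32 ? base >> level : 0;
    return d ? d : 1;
}

/* Round a dimension up to a whole number of supertiles (assumed padding). */
static uint32_t supertile_align(uint32_t dim)
{
    return (dim + SUPERTILE_DIM - 1) & ~(SUPERTILE_DIM - 1);
}

/* Texels in a full mip chain. padded != 0 selects the in-memory footprint
 * with every level rounded up to 32x32; padded == 0 counts only the texels
 * that are actually used. */
static uint64_t mip_chain_texels(uint32_t width, uint32_t height, int padded)
{
    uint64_t total = 0;
    for (uint32_t level = 0;; level++) {
        uint32_t w = mip_dim(width, level);
        uint32_t h = mip_dim(height, level);
        if (padded) {
            w = supertile_align(w);
            h = supertile_align(h);
        }
        total += (uint64_t)w * h;
        /* Stop after the 1x1 level. */
        if (mip_dim(width, level) == 1 && mip_dim(height, level) == 1)
            break;
    }
    return total;
}
```

Under these assumptions, switching a surface from 128-byte to 256-byte tiles
halves the tile count and so the TS size. For a 256x256 texture,
mip_chain_texels() gives 92160 padded texels against 87381 used ones, an
overhead of roughly 5%, which may be why the padding is considered acceptable.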
\/\/\/\/\/\/\
Let me know if you have any questions!
Now that I've mostly figured out the necessary state, next I'm going to look
at more specific shader ISA differences. I'm also going to work on Mesa integration
(on a separate branch for now).
Regards,
Wladimir