[Mesa-dev] [PATCH 00/13] RadeonSI: Reduce user SGPR usage
nhaehnle at gmail.com
Tue Feb 20 16:46:33 UTC 2018
With a small comment on patch 6, patches 1-8:
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
However, I'm unhappy about how complex this is all getting. Between
32- vs. 64-bit, merged vs. non-merged, monolithic vs. non-monolithic,
and the special user SGPR uses such as blits and soon VBO descriptors,
it feels like it's becoming too much.
The problem is I don't have a good answer to it all :)
Perhaps some of it could be helped by having an explicit userdata
staging area, i.e.
uint32_t userdata_XX; // or 32
Then si_upload_descriptors would write its pointers into userdata_XX in
the right location and set the appropriate dirty bit(s), and a separate
emit_userdata function would use the contiguous bit scan to actually
emit all the userdata together -- this would include VS state bits, tess
state info, and blit shader SGPRs.
I do think this would be cleaner, especially compared to the current
si_emit_shader_pointer_* code, and it would coalesce more SH reg writes
as a side bonus. What do you think?
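To make the idea a bit more concrete, here is a rough standalone sketch of such a staging area. It is not actual radeonsi code: the names userdata_stage, userdata_set, userdata_emit, emit_range and NUM_USER_SGPRS are all made up for illustration, the size of 16 is an assumption, and emit_range just prints where the real driver would write a run of consecutive SH registers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_USER_SGPRS 16 /* assumption; could equally be 32 */

/* Hypothetical per-stage staging area: values plus one dirty bit per SGPR. */
struct userdata_stage {
    uint32_t sgprs[NUM_USER_SGPRS];
    uint32_t dirty; /* bit i set => sgprs[i] still has to be emitted */
};

/* Writers (descriptor upload, VS state bits, blit SGPRs, ...) only stage. */
static void userdata_set(struct userdata_stage *s, unsigned index,
                         uint32_t value)
{
    assert(index < NUM_USER_SGPRS);
    s->sgprs[index] = value;
    s->dirty |= 1u << index;
}

/* Pop the lowest run of consecutive set bits from *mask (mask must be != 0). */
static void scan_consecutive_range(uint32_t *mask, unsigned *start,
                                   unsigned *count)
{
    unsigned s = 0, c = 0;

    while (!((*mask >> s) & 1u))
        s++;
    while (s + c < 32 && ((*mask >> (s + c)) & 1u))
        c++;
    for (unsigned i = 0; i < c; i++)
        *mask &= ~(1u << (s + i));

    *start = s;
    *count = c;
}

/* Stand-in for a single register-write packet covering [start, start+count). */
static void emit_range(const struct userdata_stage *s, unsigned start,
                       unsigned count)
{
    printf("write %u SGPR(s) starting at %u:", count, start);
    for (unsigned i = 0; i < count; i++)
        printf(" 0x%08x", (unsigned)s->sgprs[start + i]);
    printf("\n");
}

/* Emit everything that is dirty, one packet per contiguous run.
 * Returns the number of packets so the coalescing is observable. */
static unsigned userdata_emit(struct userdata_stage *s)
{
    unsigned packets = 0;

    while (s->dirty) {
        unsigned start, count;
        scan_consecutive_range(&s->dirty, &start, &count);
        emit_range(s, start, count);
        packets++;
    }
    return packets;
}
```

With this shape, staging SGPRs 2, 3, 4 and 9 and then calling userdata_emit once produces two writes (one for 2..4, one for 9) instead of four, which is where the extra coalescing would come from.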
The other half of it is how the LLVM functions are created.
On 17.02.2018 20:43, Marek Olšák wrote:
> This series has the following effect on user SGPRs:
> 64-bit pointers:
> TCS: 14 -> 12
> Merged VS-TCS: 24 -> 20
> Merged VS-GS: 18 -> 16
> Merged TES-GS: 18 -> 14
> 32-bit pointers:
> TCS: 10 -> 8
> Merged VS-TCS: 16 -> 12
> Merged VS-GS: 11 -> 9
> Merged TES-GS: 11 -> 6
> I tested both monolithic and non-monolithic shaders, and both 64-bit
> and 32-bit pointers. (4 combinations)
> This series is a prerequisite for VBO descriptors in user SGPRs.
> Note that merged LS-HS and ES-GS don't even use s[6:7] input SGPRs
> yet. Those only provide 40 bits of scalar data (not 64 bits like
> Please review.
Learn how the world really is,
but never forget how it should be.