[Mesa-dev] [PATCH 00/13] RadeonSI: Reduce user SGPR usage
maraeo at gmail.com
Thu Feb 22 16:17:29 UTC 2018
I don't think that adding "uint32_t userdata_XX;" would simplify anything.
The bottom line is, patches 9-13 are prerequisites for VBO descriptors
in user SGPRs, so they block that optimization as long as they sit on
the mailing list.
On Tue, Feb 20, 2018 at 8:51 PM, Marek Olšák <maraeo at gmail.com> wrote:
> The user SGPRs for blits are kinda a separate thing where the standard
> emit paths are disabled. 64-bit pointers are a short-term issue and
> will be removed in 2 years (or 1.5 years or when we want to kill off
> old LLVM support). VBO descriptors in user SGPRs will require 32-bit
> pointers. Next-gen will also require 32-bit pointers. The number of
> codepaths will be reduced to merged/non-merged and mono/non-mono
> again. For gfx9 and later, the only codepaths will be mono/non-mono.
> There will just be a transitory period when both 32-bit and 64-bit
> pointers will be supported, and both the old and new way of setting up
> VBO descriptors will be supported. However, next-gen will only support
> one way - the newer way.
> Overall, I don't see an increase in complexity other than the transitory period.
> On Tue, Feb 20, 2018 at 5:46 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>> With a small comment on patch 6, patches 1-8:
>> Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
>> for now.
>> However, I'm unhappy about how complex this is all getting. Between 32- vs.
>> 64-bit, merged vs. non-merged, monolithic vs. non-monolithic, and then special
>> user SGPR uses like for blits and soon VBO descriptors, it feels like it's
>> becoming too much.
>> The problem is I don't have a good answer to it all :)
>> Perhaps some of it could be helped by having an explicit userdata staging
>> area, i.e.
>> uint32_t userdata_XX; // or 32
>> uint32_t userdata_XX_dirty;
>> Then si_upload_descriptors would write its pointers into userdata_XX in the
>> right location and set the appropriate dirty bit(s), and a separate
>> emit_userdata function would use the contiguous bit scan to actually emit
>> all the userdata together -- this would include VS state bits, tess state
>> info, and blit shader SGPRs.
>> I do think this would be cleaner, especially compared to the current
>> si_emit_shader_pointer_* code, and it would coalesce more SH reg writes as a
>> side bonus. What do you think?
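For illustration, the staging area described above could look roughly like the following. This is a minimal sketch under stated assumptions, not radeonsi code: the struct layout, the 32-slot size, and the names stage_set, stage_emit and log_range are all hypothetical, and the callback merely stands in for the actual SH register writes.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_USERDATA 32 /* assumption: one dirty bit per slot in a uint32_t */

/* Hypothetical staging area: CPU-side copy of the user SGPR values. */
struct userdata_stage {
    uint32_t values[NUM_USERDATA];
    uint32_t dirty; /* bit i set => values[i] must be re-emitted */
};

/* Hypothetical emit callback: one call per run of consecutive slots. */
typedef void (*emit_range_fn)(unsigned first, unsigned count,
                              const uint32_t *values, void *ctx);

/* Write a value into the staging area and mark its slot dirty. */
static void stage_set(struct userdata_stage *s, unsigned slot, uint32_t value)
{
    assert(slot < NUM_USERDATA);
    s->values[slot] = value;
    s->dirty |= 1u << slot;
}

/* Emit every dirty slot, coalescing consecutive dirty slots into a single
 * callback invocation. Returns the number of ranges emitted. */
static unsigned stage_emit(struct userdata_stage *s, emit_range_fn emit,
                           void *ctx)
{
    uint32_t mask = s->dirty;
    unsigned ranges = 0;

    while (mask) {
        unsigned first = 0;
        unsigned count = 0;

        /* Find the lowest dirty slot. */
        while (!((mask >> first) & 1u))
            first++;
        /* Extend the range while the following slots are dirty too. */
        while (first + count < NUM_USERDATA &&
               ((mask >> (first + count)) & 1u))
            count++;

        emit(first, count, &s->values[first], ctx);
        ranges++;

        /* Clear the bits that were just emitted. */
        mask &= (count == 32) ? 0u : ~(((1u << count) - 1u) << first);
    }

    s->dirty = 0;
    return ranges;
}

/* Example callback that only records which ranges were emitted. */
struct emit_log {
    unsigned count;
    unsigned first[32];
    unsigned len[32];
};

static void log_range(unsigned first, unsigned count,
                      const uint32_t *values, void *ctx)
{
    struct emit_log *log = ctx;
    (void)values;
    if (log->count < 32) {
        log->first[log->count] = first;
        log->len[log->count] = count;
        log->count++;
    }
}
```

With slots 2, 3 and 7 dirty, stage_emit invokes the callback twice: once for the two-slot range starting at 2 and once for slot 7. That is the coalescing of register writes the proposal mentions as a side bonus.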
>> The other half of it is how the LLVM functions are created.
>> On 17.02.2018 20:43, Marek Olšák wrote:
>>> This series has the following effect on user SGPRs:
>>> 64-bit pointers:
>>> TCS: 14 -> 12
>>> Merged VS-TCS: 24 -> 20
>>> Merged VS-GS: 18 -> 16
>>> Merged TES-GS: 18 -> 14
>>> 32-bit pointers:
>>> TCS: 10 -> 8
>>> Merged VS-TCS: 16 -> 12
>>> Merged VS-GS: 11 -> 9
>>> Merged TES-GS: 11 -> 6
>>> I tested both monolithic and non-monolithic shaders, and both 64-bit
>>> and 32-bit pointers. (4 combinations)
>>> This series is a prerequisite for VBO descriptors in user SGPRs.
>>> Note that merged LS-HS and ES-GS don't even use s[6:7] input SGPRs
>>> yet. Those only provide 40 bits of scalar data (not 64 bits like
>>> Please review.
>> Learn how the world really is,
>> but never forget how it should be.