[Mesa-dev] [PATCH 1/2] tgsi: texture types
Roland Scheidegger
sroland at vmware.com
Tue Jun 9 06:24:20 PDT 2015
Am 09.06.2015 um 11:01 schrieb Jose Fonseca:
> On 09/06/15 04:03, Rob Clark wrote:
>> On Mon, Jun 8, 2015 at 10:50 PM, Roland Scheidegger
>> <sroland at vmware.com> wrote:
>>> Am 09.06.2015 um 04:40 schrieb Rob Clark:
>>>> On Mon, Jun 8, 2015 at 10:36 PM, Roland Scheidegger
>>>> <sroland at vmware.com> wrote:
>>>>> Am 09.06.2015 um 04:20 schrieb Rob Clark:
>>>>>> On Mon, Jun 8, 2015 at 9:40 PM, Roland Scheidegger
>>>>>> <sroland at vmware.com> wrote:
>>>>>>> Am 09.06.2015 um 03:20 schrieb Rob Clark:
>>>>>>>> On Mon, Jun 8, 2015 at 8:35 PM, Roland Scheidegger
>>>>>>>> <sroland at vmware.com> wrote:
>>>>>>>>> Am 08.06.2015 um 20:15 schrieb Rob Clark:
>>>>>>>>>> From: Rob Clark <robclark at freedesktop.org>
>>>>>>>>>>
>>>>>>>>>> Freedreno needs sampler type information to deal with int/uint
>>>>>>>>>> textures.
>>>>>>>>>> To accomplish this, start creating sampler-view declarations, as
>>>>>>>>>> suggested here:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> create a sampler-view with index matching the sampler, to
>>>>>>>>>> encode the
>>>>>>>>>> texture type (ie. SINT/UINT/FLOAT). Ie:
>>>>>>>>>>
>>>>>>>>>> DCL SVIEW[n], 2D, UINT
>>>>>>>>>> DCL SAMP[n]
>>>>>>>>>> TEX OUT[1], IN[1], SAMP[n]
>>>>>>>>>>
>>>>>>>>>> For tgsi texture instructions which do not take an explicit SVIEW
>>>>>>>>>> argument, the SVIEW index is implied by the SAMP index.
>>>>>>>>> I wonder if there should be some doc update somewhere.
>>>>>>>>>
>>>>>>>>
>>>>>>>> yeah, perhaps.. I wasn't quite sure where, tgsi.rst only talks
>>>>>>>> about
>>>>>>>> SVIEW in terms of the SAMPLE and related opc's (ie. the ones
>>>>>>>> which do
>>>>>>>> take an SVIEW arg), and the way things are with this patch, other
>>>>>>>> drivers simply need to ignore the extra SVIEW decl's and not
>>>>>>>> randomly
>>>>>>>> assert for things to continue working as before..
>>>>>>>>
>>>>>>>> hypothetically if something in tree actually created SVIEW decl
>>>>>>>> currently, and the shader also had TEX* style instructions, it
>>>>>>>> would
>>>>>>>> have to take care to not have conflicting SVIEW's. But afaict
>>>>>>>> nothing
>>>>>>>> in-tree creates sampler view decl's currently.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Rob Clark <robclark at freedesktop.org>
>>>>>>>>>> ---
>>>>>>>>>> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29
>>>>>>>>>> +++++++++++++++++++++++++++++
>>>>>>>>>> 1 file changed, 29 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>> index 0e60d95..ce8f2ea 100644
>>>>>>>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>> @@ -239,6 +239,7 @@ public:
>>>>>>>>>> st_src_reg sampler; /**< sampler register */
>>>>>>>>>> int sampler_array_size; /**< 1-based size of sampler
>>>>>>>>>> array, 1 if not array */
>>>>>>>>>> int tex_target; /**< One of TEXTURE_*_INDEX */
>>>>>>>>>> + glsl_base_type tex_type;
>>>>>>>>>> GLboolean tex_shadow;
>>>>>>>>>>
>>>>>>>>>> st_src_reg tex_offsets[MAX_GLSL_TEXTURE_OFFSET];
>>>>>>>>>> @@ -345,6 +346,8 @@ public:
>>>>>>>>>>
>>>>>>>>>> int num_address_regs;
>>>>>>>>>> int samplers_used;
>>>>>>>>>> + glsl_base_type sampler_types[PIPE_MAX_SAMPLERS];
>>>>>>>>>> + int sampler_targets[PIPE_MAX_SAMPLERS]; /**< One of
>>>>>>>>>> TGSI_TEXTURE_* */
>>>>>>>>>> bool indirect_addr_consts;
>>>>>>>>>> int wpos_transform_const;
>>>>>>>>>>
>>>>>>>>>> @@ -3323,6 +3326,8 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
>>>>>>>>>> assert(!"Should not get here.");
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> + inst->tex_type = ir->type->base_type;
>>>>>>>>>> +
>>>>>>>>>> this->result = result_src;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> @@ -3471,6 +3476,11 @@ count_resources(glsl_to_tgsi_visitor
>>>>>>>>>> *v, gl_program *prog)
>>>>>>>>>> for (int i = 0; i < inst->sampler_array_size; i++) {
>>>>>>>>>> v->samplers_used |= 1 << (inst->sampler.index + i);
>>>>>>>>>>
>>>>>>>>>> + debug_assert(i < (int)ARRAY_SIZE(v->sampler_types));
>>>>>>>>>> + v->sampler_types[i] = inst->tex_type;
>>>>>>>>>> + v->sampler_targets[i] =
>>>>>>>>>> + st_translate_texture_target(inst->tex_target,
>>>>>>>>>> inst->tex_shadow);
>>>>>>>>>> +
>>>>>>>>>> if (inst->tex_shadow) {
>>>>>>>>>> prog->ShadowSamplers |= 1 <<
>>>>>>>>>> (inst->sampler.index + i);
>>>>>>>>>> }
>>>>>>>>>> @@ -5529,7 +5539,26 @@ st_translate_program(
>>>>>>>>>> /* texture samplers */
>>>>>>>>>> for (i = 0; i <
>>>>>>>>>> ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits;
>>>>>>>>>> i++) {
>>>>>>>>>> if (program->samplers_used & (1 << i)) {
>>>>>>>>>> + unsigned type;
>>>>>>>>>> +
>>>>>>>>>> t->samplers[i] = ureg_DECL_sampler(ureg, i);
>>>>>>>>>> +
>>>>>>>>>> + switch (program->sampler_types[i]) {
>>>>>>>>>> + case GLSL_TYPE_INT:
>>>>>>>>>> + type = TGSI_RETURN_TYPE_SINT;
>>>>>>>>>> + break;
>>>>>>>>>> + case GLSL_TYPE_UINT:
>>>>>>>>>> + type = TGSI_RETURN_TYPE_UINT;
>>>>>>>>>> + break;
>>>>>>>>>> + case GLSL_TYPE_FLOAT:
>>>>>>>>>> + type = TGSI_RETURN_TYPE_FLOAT;
>>>>>>>>>> + break;
>>>>>>>>>> + default:
>>>>>>>>>> + unreachable("not reached");
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + ureg_DECL_sampler_view( ureg, i,
>>>>>>>>>> program->sampler_targets[i],
>>>>>>>>>> + type, type, type, type );
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This indeed seems like a consistent solution. Hopefully drivers
>>>>>>>>> don't
>>>>>>>>> assert when they see sampler view dcls...
>>>>>>>>
>>>>>>>> well, tgsi_to_nir did before I fixed it :-(
>>>>>>>>
>>>>>>>> (mostly just because of liberal debug_assert() usage)
>>>>>>>>
>>>>>>>> it's at least not something I'd land right before a release branch
>>>>>>>> point. But I guess the fix is easy enough (ie. remove a bogus
>>>>>>>> assert), and 10.7 branch point is quite some time from now..
>>>>>>>>
>>>>>>>>> I am also somewhat worried that this only changes the
>>>>>>>>> glsl-to-tgsi path,
>>>>>>>>> not other paths. llvmpipe and draw rely on either having sview
>>>>>>>>> declarations for all (sample) instructions or for none, and
>>>>>>>>> with draw
>>>>>>>>> possibly inserting tex opcodes (for aalines etc.) on its own
>>>>>>>>> this could
>>>>>>>>> probably break (these draw paths aren't used with d3d10 sample
>>>>>>>>> opcodes
>>>>>>>>> as it's more or less illegal to use tex and sample opcodes in
>>>>>>>>> the same
>>>>>>>>> shader, but that doesn't really matter here). I think using sview
>>>>>>>>> declarations is the right thing to do but it probably should be
>>>>>>>>> illegal
>>>>>>>>> then to NOT have them in some places.
>>>>>>>>
>>>>>>>> fwiw, the way I implemented it in tgsi_to_nir (and what I'd
>>>>>>>> recommend
>>>>>>>> for other drivers) is that for the "tex style" opcodes, if there is
>>>>>>>> not a matching SVIEW decl, assume float. That seemed more sane
>>>>>>>> than
>>>>>>>> fixing up *all* the other state trackers..
>>>>>>>>
>>>>>>>
>>>>>>> Yeah but for llvmpipe/draw it's not just the type. The problem is
>>>>>>> how
>>>>>>> they are parsing this stuff - if there's at least one sview decl
>>>>>>> they'll
>>>>>>> construct the static texture state based on the defined sampler
>>>>>>> views,
>>>>>>> but if there's none (up to now for gl state tracker) they'll do this
>>>>>>> based on the defined samplers, and for some there simply won't be
>>>>>>> any
>>>>>>> such definitions (if draw inserted some stage). llvmpipe can
>>>>>>> handle tex
>>>>>>> opcodes without sview dcls, it should also be able to handle them
>>>>>>> if you
>>>>>>> have them (but I wouldn't bet on it without testing as that's
>>>>>>> definitely
>>>>>>> going to hit some code paths never seen before), but it can't handle
>>>>>>> having "some" sview dcls (see for example lp_state_fs.c, line
>>>>>>> 3050 and
>>>>>>> following why it can't work). This is fixable but only with some
>>>>>>> awkwardness (since due to d3d10 sample opcodes not having 1:1
>>>>>>> mapping
>>>>>>> between samplers and sampler views). Hence my proposal of making it
>>>>>>> mandatory to have sview decls instead of having them optionally.
>>>>>>> That
>>>>>>> way could also ditch that code which distinguishes having sviews
>>>>>>> or not.
>>>>>>
>>>>>> hmm, I could squash in something like:
>>>>>>
>>>>>> ----
>>>>>> diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>> b/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>> index b5ce868..2878c49 100644
>>>>>> --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>> +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>> @@ -3059,27 +3059,11 @@ make_variant_key(struct llvmpipe_context *lp,
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> - /*
>>>>>> - * XXX If TGSI_FILE_SAMPLER_VIEW exists assume all texture
>>>>>> opcodes
>>>>>> - * are dx10-style? Can't really have mixed opcodes, at least not
>>>>>> - * if we want to skip the holes here (without rescanning tgsi).
>>>>>> - */
>>>>>> - if (shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1) {
>>>>>> - key->nr_sampler_views =
>>>>>> shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1;
>>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) {
>>>>>> - if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] &
>>>>>> (1 << i)) {
>>>>>> -
>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state,
>>>>>> -
>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
>>>>>> - }
>>>>>> - }
>>>>>> - }
>>>>>> - else {
>>>>>> - key->nr_sampler_views = key->nr_samplers;
>>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) {
>>>>>> - if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1
>>>>>> << i)) {
>>>>>> -
>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state,
>>>>>> -
>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
>>>>>> - }
>>>>>> + key->nr_sampler_views = key->nr_samplers;
>>>>>> + for(i = 0; i < key->nr_sampler_views; ++i) {
>>>>>> + if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1 <<
>>>>>> i)) {
>>>>>> +
>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state,
>>>>>> +
>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
>>>>>> }
>>>>>> }
>>>>>> }
>>>>> No that won't work for d3d10 state trackers, where you typically
>>>>> have more
>>>>> views than samplers (though the opposite is possible as well and
>>>>> you can
>>>>> have holes in both). I think it really would look nicer if you could
>>>>> just rely on sampler views being present.
>>>>
>>>> what d3d10 state tracker?
>>>>
>>>
>>> Make that "some non-public code which uses the sample opcodes"...
>>
>> well, not to be rude, but I can't really be expected to fix /
>> avoid-breaking code that doesn't exist (in upstream mesa)..
>
> Rob,
>
> Of course. Nobody expects or asked you to do such thing.
>
> Roland's just explaining why your proposed llvmpipe workaround is not
> sustainable. Breaking our closed state tracker is OK and happens all
> the time FYI. What we can't have is irreparable breakage. In short, we
> just need to workout a solution that doesn't leave us with our hands
> tied to fix it.
>
> And in fact, Roland's suggestion of requiring all TGSI producers
> including SVIEW dcl does not require fixing any closed-source state
> tracker but rather just fix TGSI generators in Mesa to put the SVIEW.
>
> Anyway I really like your approach, appreciate your perseverance so far,
> and want to see it through. So please just bear with us a bit more.
>
>
> Roland, would it be possible to swap the
> `shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1` condition
> with something else? Just temporarily, so that we can decouple things,
> allowing to commit this as is, then enforce the SVIEW everywhere on a
> 2nd round.
I think it should work pretty easily when just going through the max of
nr_samplers and nr_sampler_views. I think mostly the reason why actually
it even checks the sampler and sampler view file for specific entries to
be present before deciding to fill in the key is not correctness (unused
sampler views should not matter), but performance - unused views in the
key may cause the shader to be different unnecessarily. I don't think
though we've actually measured this.
(The same should be true for the draw path I believe.)
Roland
>
> Specially because enforcing this everywhere will require some work,
> sepcially on u_blit / u_mipmap_gen helpers, which need yet another key.
>
>
> For example, what about actually using the TGSI_OPCODE_SAMPLE_* &
> friends opcode count? In fact, we could go even farther, and compute
> the opcount for TGSI_OPCODE_TEX & friends and ensure that they are never
> non-zero simultanouesly.
>
>
> Jose
>
More information about the mesa-dev
mailing list