[Mesa-dev] [PATCH 1/2] tgsi: texture types

Roland Scheidegger sroland at vmware.com
Wed Jun 10 15:32:40 PDT 2015


Am 10.06.2015 um 23:31 schrieb Rob Clark:
> On Wed, Jun 10, 2015 at 5:13 PM, Jose Fonseca <jfonseca at vmware.com> wrote:
>> I think it make sense for us to start using SVIEW regardless, and uniformize
>> things.
>>
>>
>> Even if GLSL will never support independent texture/samplers, D3D10, OpenCL,
>> Metal, and potential SPIR-V all do.
>>
>> Roland, could you prepare a patch for llvmpipe, so that it infers from
>> SAMPLE_* opcode count instead of SVIEW for, and therefore unblock Rob?
>>
> 
> fwiw, I've got the three tgsi_transform_shader() cases which insert a
> SAMP updated to insert also an SVIEW if the shader is already using
> SVIEW's.. this should be enough to keep llvmpipe happy without
> changes.  I'll send this plus tgsi.rst update shortly.
Sounds like a plan.


> Going forward, I will have to make u_blitter keep track of
> int/unsigned/float variants (which it seems to do already for msaa
> resolve but not other cases), and similar treatment for some freedreno
> internally generated shaders (used to restore the tile buffer).
> 
>>
>> BTW, even if you avoid intermediate TGSI on GLSL -> NIR, don't you still
>> need to handle TGSI generated by the state tracker? (For things like blits,
>> and mipmap generation, identity shaders, clear shaders, etc)
>>
> 
> yeah..  rough idea was add a 'const void *preferred_ir' to
> pipe_shader_state, which could be NIR or whatever based on
> PIPE_SHADER_CAP_PREFERRED_IR.  Or it could be NULL in which case you
> get tgsi tokens ptr instead and have to call tgsi_to_nir yourself.  I
> didn't see an easy way around that, given u_blitter, other state
> trackers, etc.
> 
> The biggest reason that I expect to want to go straight glsl->nir in
> the future is when nir gains mediump support, since that can be a bit
> perf win on mobile (gles oriented) hw, and it didn't seem like fun to
> have to plumb that through tgsi and tgsi_to_nir..
Yes the untyped register file seems to start to look a bit limiting...
That said, d3d11.1 also supports "true" lower precision values (half
floats were supported in hlsl before but in the low-level assembly that
wasn't visible at all, like all the other types which were just using
untyped 32bit wide registers). I have no idea though how this is
actually represented in the assembly, but this shows it should be
possible to do in some hopefully sane way.

Roland


> 
> BR,
> -R
> 
>>
>> Jose
>>
>>
>>
>> On 10/06/15 21:07, Rob Clark wrote:
>>>
>>> that is starting to look more attractive, mostly just because
>>> tgsi_transform stuff is so cumbersome..
>>>
>>> (I did start thinking about just adding type to decl's in general,
>>> since really it would be better to have for type information for IN's
>>> and OUT's too.. but then decided I'd probably rather spend my time on
>>> support in mesa st to go straight from glsl to nir and bypass tgsi,
>>> rather than going down that rabbit hole)
>>>
>>> BR,
>>> -R
>>>
>>> On Wed, Jun 10, 2015 at 3:51 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>
>>>> There is also the option of adding the sampler type to either the SAMP
>>>> declaration or texture instructions. This will move us further away
>>>> from adopting SVIEW, but I don't see that happening for OpenGL anyway.
>>>>
>>>> Marek
>>>>
>>>> On Wed, Jun 10, 2015 at 8:59 PM, Rob Clark <robdclark at gmail.com> wrote:
>>>>>
>>>>> So, afaiu, everything that might insert a sampler is using
>>>>> tgsi_transform_shader()?  There aren't too many of those, and I think
>>>>> I can fix them up to not violate that constraint.
>>>>>
>>>>> (It does occur to me that I may end up needing to fix u_blitter to
>>>>> differentiate between blitting float vs int to avoid some regressions
>>>>> in freedreno, since I'd no longer be using shader variants based on
>>>>> bound samplers.. but I guess that is unrelated and a separate patch)
>>>>>
>>>>> BR,
>>>>> -R
>>>>>
>>>>> On Wed, Jun 10, 2015 at 2:55 PM, Roland Scheidegger <sroland at vmware.com>
>>>>> wrote:
>>>>>>
>>>>>> My biggest problem with that is the initial case I saw as a problem:
>>>>>> draw is going to modify these shaders in some cases (aaline stage for
>>>>>> example), adding its own sampler, and it doesn't know anything about
>>>>>> distinguishing shaders with sampler views or without.
>>>>>> The same goes for any other potential code which may modify shaders
>>>>>> similarly - needs to be modified not just to always use sampler views
>>>>>> but use them based on if the incoming shader already uses them or not.
>>>>>> Which conceptually looks worse to me. But otherwise I agree this should
>>>>>> work.
>>>>>>
>>>>>> Roland
>>>>>>
>>>>>>
>>>>>> Am 10.06.2015 um 20:30 schrieb Rob Clark:
>>>>>>>
>>>>>>> Hmm, at least tgsi_text_translate() doesn't appear to use tgsi_ureg..
>>>>>>> and there are still a number of users of tgsi_text_translate().. I
>>>>>>> guess handling this in tgsi_ureg would avoid fixing all the tgsi_ureg
>>>>>>> users, but that still leaves a lot of others.  Changing them all still
>>>>>>> seems to be too intrusive to me.
>>>>>>>
>>>>>>> (And also, I have a large collection of saved tgsi shaders that I use
>>>>>>> for standalone testing of my shader compiler and don't really like the
>>>>>>> idea of fixing up 700 or 800 tgsi shaders by hand :-P)
>>>>>>>
>>>>>>> That said, looking at the code like llvmpipe where Roland/Jose where
>>>>>>> thinking we might have problems..  by making the assumption that we
>>>>>>> never mix TEX* and SAMPLE* opc's, I think we can loosen the
>>>>>>> restriction to:
>>>>>>>
>>>>>>>    for TEX* instructions, the tgsi must either *not* include SVIEW, or
>>>>>>> *must* include a matching SVIEW[idx] for every SAMP[idx]
>>>>>>>
>>>>>>> Which is a restriction that glsl_to_tgsi follows.
>>>>>>>
>>>>>>> If you follow this restriction, then for TEX* shaders which have
>>>>>>> SVIEW's, file_max[TGSI_FILE_SAMPLER_VIEW] ==
>>>>>>> file_max[TGSI_FILE_SAMPLER] and file_mask[TGSI_FILE_SAMPLER_VIEW] ==
>>>>>>> file_mask[TGSI_FILE_SAMPLER].. so code takes a different but
>>>>>>> equivalent path.   And for TEX* shaders which don't have SVIEW's
>>>>>>> everything continues to work as before.
>>>>>>>
>>>>>>> With this approach, we don't have to fix up everything to create
>>>>>>> SVIEW[idx] for every SAMP[idx], as long as glsl_to_tgsi always creates
>>>>>>> SVIEW[idx] for each SAMP[idx], and any other tgsi generator that later
>>>>>>> adds SVIEW support for TEX* instructions follows the same pattern.
>>>>>>>
>>>>>>> So, tl;dr: I think really all I need to add to this patch is add blurb
>>>>>>> in tgsi.rst to explain this restriction and usage of SVIEW for TEX*
>>>>>>>
>>>>>>> Thoughts?
>>>>>>>
>>>>>>> BR,
>>>>>>> -R
>>>>>>>
>>>>>>> On Tue, Jun 9, 2015 at 1:20 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>>>
>>>>>>>> If you only want to modify TGSI and not all the users, you only have
>>>>>>>> to fix tgsi_ureg. tgsi_ureg is a layer that can hide a lot of small
>>>>>>>> ugly details if needed, including sampler view declarations when the
>>>>>>>> users don't even know about them.
>>>>>>>>
>>>>>>>> Marek
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 9, 2015 at 6:01 PM, Rob Clark <robdclark at gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On Tue, Jun 9, 2015 at 9:32 AM, Roland Scheidegger
>>>>>>>>> <sroland at vmware.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Am 09.06.2015 um 15:00 schrieb Rob Clark:
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jun 9, 2015 at 5:01 AM, Jose Fonseca <jfonseca at vmware.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 09/06/15 04:03, Rob Clark wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jun 8, 2015 at 10:50 PM, Roland Scheidegger
>>>>>>>>>>>>> <sroland at vmware.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Am 09.06.2015 um 04:40 schrieb Rob Clark:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jun 8, 2015 at 10:36 PM, Roland Scheidegger
>>>>>>>>>>>>>>> <sroland at vmware.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Am 09.06.2015 um 04:20 schrieb Rob Clark:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jun 8, 2015 at 9:40 PM, Roland Scheidegger
>>>>>>>>>>>>>>>>> <sroland at vmware.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Am 09.06.2015 um 03:20 schrieb Rob Clark:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Jun 8, 2015 at 8:35 PM, Roland Scheidegger
>>>>>>>>>>>>>>>>>>> <sroland at vmware.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Am 08.06.2015 um 20:15 schrieb Rob Clark:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> From: Rob Clark <robclark at freedesktop.org>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Freedreno needs sampler type information to deal with
>>>>>>>>>>>>>>>>>>>>> int/uint
>>>>>>>>>>>>>>>>>>>>> textures.
>>>>>>>>>>>>>>>>>>>>> To accomplish this, start creating sampler-view
>>>>>>>>>>>>>>>>>>>>> declarations, as
>>>>>>>>>>>>>>>>>>>>> suggested here:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> create a sampler-view with index matching the sampler,
>>>>>>>>>>>>>>>>>>>>> to encode
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> texture type (ie. SINT/UINT/FLOAT).  Ie:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>      DCL SVIEW[n], 2D, UINT
>>>>>>>>>>>>>>>>>>>>>      DCL SAMP[n]
>>>>>>>>>>>>>>>>>>>>>      TEX OUT[1], IN[1], SAMP[n]
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For tgsi texture instructions which do not take an
>>>>>>>>>>>>>>>>>>>>> explicit SVIEW
>>>>>>>>>>>>>>>>>>>>> argument, the SVIEW index is implied by the SAMP index.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I wonder if there should be some doc update somewhere.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> yeah, perhaps.. I wasn't quite sure where, tgsi.rst only
>>>>>>>>>>>>>>>>>>> talks about
>>>>>>>>>>>>>>>>>>> SVIEW in terms of the SAMPLE and related opc's (ie. the
>>>>>>>>>>>>>>>>>>> ones which
>>>>>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>>> take an SVIEW arg), and the way things are with this
>>>>>>>>>>>>>>>>>>> patch, other
>>>>>>>>>>>>>>>>>>> drivers simply need to ignore the extra SVIEW decl's and
>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>> randomly
>>>>>>>>>>>>>>>>>>> assert for things to continue working as before..
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> hypothetically if something in tree actually created SVIEW
>>>>>>>>>>>>>>>>>>> decl
>>>>>>>>>>>>>>>>>>> currently, and the shader also had TEX* style
>>>>>>>>>>>>>>>>>>> instructions, it would
>>>>>>>>>>>>>>>>>>> have to take care to not have conflicting SVIEW's.  But
>>>>>>>>>>>>>>>>>>> afaict
>>>>>>>>>>>>>>>>>>> nothing
>>>>>>>>>>>>>>>>>>> in-tree creates sampler view decl's currently.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Signed-off-by: Rob Clark <robclark at freedesktop.org>
>>>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>>>    src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29
>>>>>>>>>>>>>>>>>>>>> +++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>>>>>>>    1 file changed, 29 insertions(+)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>>>>>>>>>>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>>>>>>>>>>>>> index 0e60d95..ce8f2ea 100644
>>>>>>>>>>>>>>>>>>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>>>>>>>>>>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>>>>>>>>>>>>> @@ -239,6 +239,7 @@ public:
>>>>>>>>>>>>>>>>>>>>>       st_src_reg sampler; /**< sampler register */
>>>>>>>>>>>>>>>>>>>>>       int sampler_array_size; /**< 1-based size of
>>>>>>>>>>>>>>>>>>>>> sampler array, 1
>>>>>>>>>>>>>>>>>>>>> if not array */
>>>>>>>>>>>>>>>>>>>>>       int tex_target; /**< One of TEXTURE_*_INDEX */
>>>>>>>>>>>>>>>>>>>>> +   glsl_base_type tex_type;
>>>>>>>>>>>>>>>>>>>>>       GLboolean tex_shadow;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>       st_src_reg tex_offsets[MAX_GLSL_TEXTURE_OFFSET];
>>>>>>>>>>>>>>>>>>>>> @@ -345,6 +346,8 @@ public:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>       int num_address_regs;
>>>>>>>>>>>>>>>>>>>>>       int samplers_used;
>>>>>>>>>>>>>>>>>>>>> +   glsl_base_type sampler_types[PIPE_MAX_SAMPLERS];
>>>>>>>>>>>>>>>>>>>>> +   int sampler_targets[PIPE_MAX_SAMPLERS];   /**< One
>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>> TGSI_TEXTURE_* */
>>>>>>>>>>>>>>>>>>>>>       bool indirect_addr_consts;
>>>>>>>>>>>>>>>>>>>>>       int wpos_transform_const;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> @@ -3323,6 +3326,8 @@
>>>>>>>>>>>>>>>>>>>>> glsl_to_tgsi_visitor::visit(ir_texture *ir)
>>>>>>>>>>>>>>>>>>>>>          assert(!"Should not get here.");
>>>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> +   inst->tex_type = ir->type->base_type;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>       this->result = result_src;
>>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> @@ -3471,6 +3476,11 @@
>>>>>>>>>>>>>>>>>>>>> count_resources(glsl_to_tgsi_visitor *v,
>>>>>>>>>>>>>>>>>>>>> gl_program *prog)
>>>>>>>>>>>>>>>>>>>>>             for (int i = 0; i <
>>>>>>>>>>>>>>>>>>>>> inst->sampler_array_size; i++) {
>>>>>>>>>>>>>>>>>>>>>                v->samplers_used |= 1 <<
>>>>>>>>>>>>>>>>>>>>> (inst->sampler.index + i);
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> +            debug_assert(i <
>>>>>>>>>>>>>>>>>>>>> (int)ARRAY_SIZE(v->sampler_types));
>>>>>>>>>>>>>>>>>>>>> +            v->sampler_types[i] = inst->tex_type;
>>>>>>>>>>>>>>>>>>>>> +            v->sampler_targets[i] =
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> st_translate_texture_target(inst->tex_target,
>>>>>>>>>>>>>>>>>>>>> inst->tex_shadow);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>                if (inst->tex_shadow) {
>>>>>>>>>>>>>>>>>>>>>                   prog->ShadowSamplers |= 1 <<
>>>>>>>>>>>>>>>>>>>>> (inst->sampler.index
>>>>>>>>>>>>>>>>>>>>> + i);
>>>>>>>>>>>>>>>>>>>>>                }
>>>>>>>>>>>>>>>>>>>>> @@ -5529,7 +5539,26 @@ st_translate_program(
>>>>>>>>>>>>>>>>>>>>>       /* texture samplers */
>>>>>>>>>>>>>>>>>>>>>       for (i = 0; i <
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits; i++) {
>>>>>>>>>>>>>>>>>>>>>          if (program->samplers_used & (1 << i)) {
>>>>>>>>>>>>>>>>>>>>> +         unsigned type;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>             t->samplers[i] = ureg_DECL_sampler(ureg, i);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +         switch (program->sampler_types[i]) {
>>>>>>>>>>>>>>>>>>>>> +         case GLSL_TYPE_INT:
>>>>>>>>>>>>>>>>>>>>> +            type = TGSI_RETURN_TYPE_SINT;
>>>>>>>>>>>>>>>>>>>>> +            break;
>>>>>>>>>>>>>>>>>>>>> +         case GLSL_TYPE_UINT:
>>>>>>>>>>>>>>>>>>>>> +            type = TGSI_RETURN_TYPE_UINT;
>>>>>>>>>>>>>>>>>>>>> +            break;
>>>>>>>>>>>>>>>>>>>>> +         case GLSL_TYPE_FLOAT:
>>>>>>>>>>>>>>>>>>>>> +            type = TGSI_RETURN_TYPE_FLOAT;
>>>>>>>>>>>>>>>>>>>>> +            break;
>>>>>>>>>>>>>>>>>>>>> +         default:
>>>>>>>>>>>>>>>>>>>>> +            unreachable("not reached");
>>>>>>>>>>>>>>>>>>>>> +         }
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +         ureg_DECL_sampler_view( ureg, i,
>>>>>>>>>>>>>>>>>>>>> program->sampler_targets[i],
>>>>>>>>>>>>>>>>>>>>> +                                 type, type, type, type
>>>>>>>>>>>>>>>>>>>>> );
>>>>>>>>>>>>>>>>>>>>>          }
>>>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This indeed seems like a consistent solution. Hopefully
>>>>>>>>>>>>>>>>>>>> drivers
>>>>>>>>>>>>>>>>>>>> don't
>>>>>>>>>>>>>>>>>>>> assert when they see sampler view dcls...
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> well, tgsi_to_nir did before I fixed it :-(
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (mostly just because of liberal debug_assert() usage)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> it's at least not something I'd land right before a
>>>>>>>>>>>>>>>>>>> release branch
>>>>>>>>>>>>>>>>>>> point.  But I guess the fix is easy enough (ie. remove a
>>>>>>>>>>>>>>>>>>> bogus
>>>>>>>>>>>>>>>>>>> assert), and 10.7 branch point is quite some time from
>>>>>>>>>>>>>>>>>>> now..
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am also somewhat worried that this only changes the
>>>>>>>>>>>>>>>>>>>> glsl-to-tgsi
>>>>>>>>>>>>>>>>>>>> path,
>>>>>>>>>>>>>>>>>>>> not other paths. llvmpipe and draw rely on either having
>>>>>>>>>>>>>>>>>>>> sview
>>>>>>>>>>>>>>>>>>>> declarations for all (sample) instructions or for none,
>>>>>>>>>>>>>>>>>>>> and with
>>>>>>>>>>>>>>>>>>>> draw
>>>>>>>>>>>>>>>>>>>> possibly inserting tex opcodes (for aalines etc.) on its
>>>>>>>>>>>>>>>>>>>> own this
>>>>>>>>>>>>>>>>>>>> could
>>>>>>>>>>>>>>>>>>>> probably break (these draw paths aren't used with d3d10
>>>>>>>>>>>>>>>>>>>> sample
>>>>>>>>>>>>>>>>>>>> opcodes
>>>>>>>>>>>>>>>>>>>> as it's more or less illegal to use tex and sample
>>>>>>>>>>>>>>>>>>>> opcodes in the
>>>>>>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>>>> shader, but that doesn't really matter here). I think
>>>>>>>>>>>>>>>>>>>> using sview
>>>>>>>>>>>>>>>>>>>> declarations is the right thing to do but it probably
>>>>>>>>>>>>>>>>>>>> should be
>>>>>>>>>>>>>>>>>>>> illegal
>>>>>>>>>>>>>>>>>>>> then to NOT have them in some places.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> fwiw, the way I implemented it in tgsi_to_nir (and what
>>>>>>>>>>>>>>>>>>> I'd
>>>>>>>>>>>>>>>>>>> recommend
>>>>>>>>>>>>>>>>>>> for other drivers) is that for the "tex style" opcodes, if
>>>>>>>>>>>>>>>>>>> there is
>>>>>>>>>>>>>>>>>>> not a matching SVIEW decl, assume float.  That seemed more
>>>>>>>>>>>>>>>>>>> sane than
>>>>>>>>>>>>>>>>>>> fixing up *all* the other state trackers..
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yeah but for llvmpipe/draw it's not just the type. The
>>>>>>>>>>>>>>>>>> problem is how
>>>>>>>>>>>>>>>>>> they are parsing this stuff - if there's at least one sview
>>>>>>>>>>>>>>>>>> decl
>>>>>>>>>>>>>>>>>> they'll
>>>>>>>>>>>>>>>>>> construct the static texture state based on the defined
>>>>>>>>>>>>>>>>>> sampler
>>>>>>>>>>>>>>>>>> views,
>>>>>>>>>>>>>>>>>> but if there's none (up to now for gl state tracker)
>>>>>>>>>>>>>>>>>> they'll do this
>>>>>>>>>>>>>>>>>> based on the defined samplers, and for some there simply
>>>>>>>>>>>>>>>>>> won't be any
>>>>>>>>>>>>>>>>>> such definitions (if draw inserted some stage). llvmpipe
>>>>>>>>>>>>>>>>>> can handle
>>>>>>>>>>>>>>>>>> tex
>>>>>>>>>>>>>>>>>> opcodes without sview dcls, it should also be able to
>>>>>>>>>>>>>>>>>> handle them if
>>>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>> have them (but I wouldn't bet on it without testing as
>>>>>>>>>>>>>>>>>> that's
>>>>>>>>>>>>>>>>>> definitely
>>>>>>>>>>>>>>>>>> going to hit some code paths never seen before), but it
>>>>>>>>>>>>>>>>>> can't handle
>>>>>>>>>>>>>>>>>> having "some" sview dcls (see for example lp_state_fs.c,
>>>>>>>>>>>>>>>>>> line 3050
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> following why it can't work). This is fixable but only with
>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>> awkwardness (since due to d3d10 sample opcodes not having
>>>>>>>>>>>>>>>>>> 1:1 mapping
>>>>>>>>>>>>>>>>>> between samplers and sampler views). Hence my proposal of
>>>>>>>>>>>>>>>>>> making it
>>>>>>>>>>>>>>>>>> mandatory to have sview decls instead of having them
>>>>>>>>>>>>>>>>>> optionally. That
>>>>>>>>>>>>>>>>>> way could also ditch that code which distinguishes having
>>>>>>>>>>>>>>>>>> sviews or
>>>>>>>>>>>>>>>>>> not.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> hmm, I could squash in something like:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ----
>>>>>>>>>>>>>>>>> diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>>>>>>>>>>>>> b/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>>>>>>>>>>>>> index b5ce868..2878c49 100644
>>>>>>>>>>>>>>>>> --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>>>>>>>>>>>>> +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>>>>>>>>>>>>> @@ -3059,27 +3059,11 @@ make_variant_key(struct
>>>>>>>>>>>>>>>>> llvmpipe_context *lp,
>>>>>>>>>>>>>>>>>          }
>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -   /*
>>>>>>>>>>>>>>>>> -    * XXX If TGSI_FILE_SAMPLER_VIEW exists assume all
>>>>>>>>>>>>>>>>> texture opcodes
>>>>>>>>>>>>>>>>> -    * are dx10-style? Can't really have mixed opcodes, at
>>>>>>>>>>>>>>>>> least not
>>>>>>>>>>>>>>>>> -    * if we want to skip the holes here (without rescanning
>>>>>>>>>>>>>>>>> tgsi).
>>>>>>>>>>>>>>>>> -    */
>>>>>>>>>>>>>>>>> -   if (shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW]
>>>>>>>>>>>>>>>>> != -1) {
>>>>>>>>>>>>>>>>> -      key->nr_sampler_views =
>>>>>>>>>>>>>>>>> shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1;
>>>>>>>>>>>>>>>>> -      for(i = 0; i < key->nr_sampler_views; ++i) {
>>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>> if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1
>>>>>>>>>>>>>>>>> << i)) {
>>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state,
>>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
>>>>>>>>>>>>>>>>> -         }
>>>>>>>>>>>>>>>>> -      }
>>>>>>>>>>>>>>>>> -   }
>>>>>>>>>>>>>>>>> -   else {
>>>>>>>>>>>>>>>>> -      key->nr_sampler_views = key->nr_samplers;
>>>>>>>>>>>>>>>>> -      for(i = 0; i < key->nr_sampler_views; ++i) {
>>>>>>>>>>>>>>>>> -         if(shader->info.base.file_mask[TGSI_FILE_SAMPLER]
>>>>>>>>>>>>>>>>> & (1 <<
>>>>>>>>>>>>>>>>> i)) {
>>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state,
>>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
>>>>>>>>>>>>>>>>> -         }
>>>>>>>>>>>>>>>>> +   key->nr_sampler_views = key->nr_samplers;
>>>>>>>>>>>>>>>>> +   for(i = 0; i < key->nr_sampler_views; ++i) {
>>>>>>>>>>>>>>>>> +      if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] &
>>>>>>>>>>>>>>>>> (1 << i)) {
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state,
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
>>>>>>>>>>>>>>>>>          }
>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> No that won't work for d3d10 state trackers, where you
>>>>>>>>>>>>>>>> typically have
>>>>>>>>>>>>>>>> more
>>>>>>>>>>>>>>>> views than samplers (though the opposite is possible as well
>>>>>>>>>>>>>>>> and you
>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>> have holes in both). I think it really would look nicer if
>>>>>>>>>>>>>>>> you could
>>>>>>>>>>>>>>>> just rely on sampler views being present.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> what d3d10 state tracker?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Make that "some non-public code which uses the sample
>>>>>>>>>>>>>> opcodes"...
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> well, not to be rude, but I can't really be expected to fix /
>>>>>>>>>>>>> avoid-breaking code that doesn't exist (in upstream mesa)..
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Rob,
>>>>>>>>>>>>
>>>>>>>>>>>> Of course.  Nobody expects or asked you to do such thing.
>>>>>>>>>>>>
>>>>>>>>>>>> Roland's just explaining why your proposed llvmpipe workaround is
>>>>>>>>>>>> not
>>>>>>>>>>>> sustainable.  Breaking our closed state tracker is OK and happens
>>>>>>>>>>>> all the
>>>>>>>>>>>> time FYI.  What we can't have is irreparable breakage. In short,
>>>>>>>>>>>> we just
>>>>>>>>>>>> need to workout a solution that doesn't leave us with our hands
>>>>>>>>>>>> tied to fix
>>>>>>>>>>>> it.
>>>>>>>>>>>>
>>>>>>>>>>>> And in fact, Roland's suggestion of requiring all TGSI producers
>>>>>>>>>>>> including
>>>>>>>>>>>> SVIEW dcl does not require fixing any closed-source state tracker
>>>>>>>>>>>> but rather
>>>>>>>>>>>> just fix TGSI generators in Mesa to put the SVIEW.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hmm, well there are a *lot* of things in mesa which generate tgsi
>>>>>>>>>>> (vl,
>>>>>>>>>>> gallium9, etc)..  fixing them all, esp. when I don't necessarily
>>>>>>>>>>> know
>>>>>>>>>>> all the other state trackers well enough to know what they
>>>>>>>>>>> *should* be
>>>>>>>>>>> doing, didn't seem reasonable.
>>>>>>>>>>>
>>>>>>>>>>> I was hoping we could just leave SVIEW decl as optional to avoid
>>>>>>>>>>> having to fix up everything.  It just means that something which
>>>>>>>>>>> does
>>>>>>>>>>> use SVIEW/SAMPLE* needs to avoid having SVIEW index's which
>>>>>>>>>>> conflict
>>>>>>>>>>> with SAMP decl indexes.
>>>>>>>>>>>
>>>>>>>>>>> Although maybe w/ the draw stuff, if it is potentially inserting
>>>>>>>>>>> SAMP/TEX* into shaders already using SVIEW/SAMPLE*, maybe it needs
>>>>>>>>>>> to
>>>>>>>>>>> take care to insert SAMP decl's with #'s that start above max
>>>>>>>>>>> SVIEW
>>>>>>>>>>> index.  I'm not sure, I haven't used the aux/draw stuff yet, so
>>>>>>>>>>> I'm
>>>>>>>>>>> not terribly familiar with it.
>>>>>>>>>>>
>>>>>>>>>>> Not sure if it would help, but I could add an shader cap and for
>>>>>>>>>>> now
>>>>>>>>>>> only insert the SVIEW decl for drivers that want them.  Which I
>>>>>>>>>>> think,
>>>>>>>>>>> at the moment, is just tgsi_to_nir (and currently the two
>>>>>>>>>>> tgsi_to_nir
>>>>>>>>>>> users, vc4 and freedreno, use any of the aux/draw stuff).
>>>>>>>>>>>
>>>>>>>>>>>> Anyway I really like your approach, appreciate your perseverance
>>>>>>>>>>>> so far, and
>>>>>>>>>>>> want to see it through.  So please just bear with us a bit more.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> fair enough.. I was mostly just trying to avoid having to fix
>>>>>>>>>>> massive
>>>>>>>>>>> amounts of state trackers, etc, which I am not necessarily
>>>>>>>>>>> terribly
>>>>>>>>>>> familiar with or even have a way to test.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Roland, would it be possible to swap the
>>>>>>>>>>>> `shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1`
>>>>>>>>>>>> condition with
>>>>>>>>>>>> something else?  Just temporarily, so that we can decouple
>>>>>>>>>>>> things, allowing
>>>>>>>>>>>> to commit this as is, then enforce the SVIEW everywhere on a 2nd
>>>>>>>>>>>> round.
>>>>>>>>>>>>
>>>>>>>>>>>> Specially because enforcing this everywhere will require some
>>>>>>>>>>>> work,
>>>>>>>>>>>> sepcially on u_blit / u_mipmap_gen helpers, which need yet
>>>>>>>>>>>> another key.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> For example, what about actually using the TGSI_OPCODE_SAMPLE_* &
>>>>>>>>>>>> friends
>>>>>>>>>>>> opcode count?  In fact, we could go even farther, and compute the
>>>>>>>>>>>> opcount
>>>>>>>>>>>> for TGSI_OPCODE_TEX & friends and ensure that they are never
>>>>>>>>>>>> non-zero
>>>>>>>>>>>> simultanouesly.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> hmm, possibly something I didn't understand properly..  if you
>>>>>>>>>>> never
>>>>>>>>>>> have both TEX* and SAMPLE* at same time, this sounds much easier..
>>>>>>>>>>> Ie. just use # of opcodes instead of file_max.  Could even just
>>>>>>>>>>> add a
>>>>>>>>>>> trivial thing to tgsi_scan to count up # of opcodes of each
>>>>>>>>>>> category
>>>>>>>>>>> (ie. TEX* and SAMPLE*)
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> That doesn't tell you how many dcls (either samplers or sampler
>>>>>>>>>> views)
>>>>>>>>>> you actually have. Even with GL reusing the same texture unit with
>>>>>>>>>> a
>>>>>>>>>> different instruction is of course possible (and quite frequently
>>>>>>>>>> done).
>>>>>>>>>> Though you are right that because both sample and tex opcodes can't
>>>>>>>>>> be
>>>>>>>>>> used in the same shader, scanning the shader could determine if
>>>>>>>>>> it's
>>>>>>>>>> either tex or sample and then use the appropriate shader file if
>>>>>>>>>> we'd
>>>>>>>>>> want to keep unused views out of the shader. That may also be ok as
>>>>>>>>>> an
>>>>>>>>>> interim measure.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> yeah, that was basically my idea.. not so much caring about the
>>>>>>>>> count,
>>>>>>>>> as much of which one is non-zero..
>>>>>>>>>
>>>>>>>>> BR,
>>>>>>>>> -R
>>>>>>>>>
>>>>>>>>>> Roland
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>



More information about the mesa-dev mailing list