[Mesa-dev] [PATCH 1/2] tgsi: texture types
Rob Clark
robdclark at gmail.com
Tue Jun 9 09:01:08 PDT 2015
On Tue, Jun 9, 2015 at 9:32 AM, Roland Scheidegger <sroland at vmware.com> wrote:
> Am 09.06.2015 um 15:00 schrieb Rob Clark:
>> On Tue, Jun 9, 2015 at 5:01 AM, Jose Fonseca <jfonseca at vmware.com> wrote:
>>> On 09/06/15 04:03, Rob Clark wrote:
>>>>
>>>> On Mon, Jun 8, 2015 at 10:50 PM, Roland Scheidegger <sroland at vmware.com>
>>>> wrote:
>>>>>
>>>>> Am 09.06.2015 um 04:40 schrieb Rob Clark:
>>>>>>
>>>>>> On Mon, Jun 8, 2015 at 10:36 PM, Roland Scheidegger <sroland at vmware.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Am 09.06.2015 um 04:20 schrieb Rob Clark:
>>>>>>>>
>>>>>>>> On Mon, Jun 8, 2015 at 9:40 PM, Roland Scheidegger
>>>>>>>> <sroland at vmware.com> wrote:
>>>>>>>>>
>>>>>>>>> Am 09.06.2015 um 03:20 schrieb Rob Clark:
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 8, 2015 at 8:35 PM, Roland Scheidegger
>>>>>>>>>> <sroland at vmware.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Am 08.06.2015 um 20:15 schrieb Rob Clark:
>>>>>>>>>>>>
>>>>>>>>>>>> From: Rob Clark <robclark at freedesktop.org>
>>>>>>>>>>>>
>>>>>>>>>>>> Freedreno needs sampler type information to deal with int/uint
>>>>>>>>>>>> textures.
>>>>>>>>>>>> To accomplish this, start creating sampler-view declarations, as
>>>>>>>>>>>> suggested here:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html
>>>>>>>>>>>>
>>>>>>>>>>>> create a sampler-view with index matching the sampler, to encode
>>>>>>>>>>>> the
>>>>>>>>>>>> texture type (ie. SINT/UINT/FLOAT). Ie:
>>>>>>>>>>>>
>>>>>>>>>>>> DCL SVIEW[n], 2D, UINT
>>>>>>>>>>>> DCL SAMP[n]
>>>>>>>>>>>> TEX OUT[1], IN[1], SAMP[n]
>>>>>>>>>>>>
>>>>>>>>>>>> For tgsi texture instructions which do not take an explicit SVIEW
>>>>>>>>>>>> argument, the SVIEW index is implied by the SAMP index.
>>>>>>>>>>>
>>>>>>>>>>> I wonder if there should be some doc update somewhere.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> yeah, perhaps.. I wasn't quite sure where, tgsi.rst only talks about
>>>>>>>>>> SVIEW in terms of the SAMPLE and related opc's (ie. the ones which
>>>>>>>>>> do
>>>>>>>>>> take an SVIEW arg), and the way things are with this patch, other
>>>>>>>>>> drivers simply need to ignore the extra SVIEW decl's and not
>>>>>>>>>> randomly
>>>>>>>>>> assert for things to continue working as before..
>>>>>>>>>>
>>>>>>>>>> hypothetically if something in tree actually created SVIEW decl
>>>>>>>>>> currently, and the shader also had TEX* style instructions, it would
>>>>>>>>>> have to take care to not have conflicting SVIEW's. But afaict
>>>>>>>>>> nothing
>>>>>>>>>> in-tree creates sampler view decl's currently.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Rob Clark <robclark at freedesktop.org>
>>>>>>>>>>>> ---
>>>>>>>>>>>> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29
>>>>>>>>>>>> +++++++++++++++++++++++++++++
>>>>>>>>>>>> 1 file changed, 29 insertions(+)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>>>> index 0e60d95..ce8f2ea 100644
>>>>>>>>>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>>>>>>>>>> @@ -239,6 +239,7 @@ public:
>>>>>>>>>>>> st_src_reg sampler; /**< sampler register */
>>>>>>>>>>>> int sampler_array_size; /**< 1-based size of sampler array, 1
>>>>>>>>>>>> if not array */
>>>>>>>>>>>> int tex_target; /**< One of TEXTURE_*_INDEX */
>>>>>>>>>>>> + glsl_base_type tex_type;
>>>>>>>>>>>> GLboolean tex_shadow;
>>>>>>>>>>>>
>>>>>>>>>>>> st_src_reg tex_offsets[MAX_GLSL_TEXTURE_OFFSET];
>>>>>>>>>>>> @@ -345,6 +346,8 @@ public:
>>>>>>>>>>>>
>>>>>>>>>>>> int num_address_regs;
>>>>>>>>>>>> int samplers_used;
>>>>>>>>>>>> + glsl_base_type sampler_types[PIPE_MAX_SAMPLERS];
>>>>>>>>>>>> + int sampler_targets[PIPE_MAX_SAMPLERS]; /**< One of
>>>>>>>>>>>> TGSI_TEXTURE_* */
>>>>>>>>>>>> bool indirect_addr_consts;
>>>>>>>>>>>> int wpos_transform_const;
>>>>>>>>>>>>
>>>>>>>>>>>> @@ -3323,6 +3326,8 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
>>>>>>>>>>>> assert(!"Should not get here.");
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> + inst->tex_type = ir->type->base_type;
>>>>>>>>>>>> +
>>>>>>>>>>>> this->result = result_src;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> @@ -3471,6 +3476,11 @@ count_resources(glsl_to_tgsi_visitor *v,
>>>>>>>>>>>> gl_program *prog)
>>>>>>>>>>>> for (int i = 0; i < inst->sampler_array_size; i++) {
>>>>>>>>>>>> v->samplers_used |= 1 << (inst->sampler.index + i);
>>>>>>>>>>>>
>>>>>>>>>>>> + debug_assert(i < (int)ARRAY_SIZE(v->sampler_types));
>>>>>>>>>>>> + v->sampler_types[i] = inst->tex_type;
>>>>>>>>>>>> + v->sampler_targets[i] =
>>>>>>>>>>>> + st_translate_texture_target(inst->tex_target,
>>>>>>>>>>>> inst->tex_shadow);
>>>>>>>>>>>> +
>>>>>>>>>>>> if (inst->tex_shadow) {
>>>>>>>>>>>> prog->ShadowSamplers |= 1 << (inst->sampler.index
>>>>>>>>>>>> + i);
>>>>>>>>>>>> }
>>>>>>>>>>>> @@ -5529,7 +5539,26 @@ st_translate_program(
>>>>>>>>>>>> /* texture samplers */
>>>>>>>>>>>> for (i = 0; i <
>>>>>>>>>>>> ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits; i++) {
>>>>>>>>>>>> if (program->samplers_used & (1 << i)) {
>>>>>>>>>>>> + unsigned type;
>>>>>>>>>>>> +
>>>>>>>>>>>> t->samplers[i] = ureg_DECL_sampler(ureg, i);
>>>>>>>>>>>> +
>>>>>>>>>>>> + switch (program->sampler_types[i]) {
>>>>>>>>>>>> + case GLSL_TYPE_INT:
>>>>>>>>>>>> + type = TGSI_RETURN_TYPE_SINT;
>>>>>>>>>>>> + break;
>>>>>>>>>>>> + case GLSL_TYPE_UINT:
>>>>>>>>>>>> + type = TGSI_RETURN_TYPE_UINT;
>>>>>>>>>>>> + break;
>>>>>>>>>>>> + case GLSL_TYPE_FLOAT:
>>>>>>>>>>>> + type = TGSI_RETURN_TYPE_FLOAT;
>>>>>>>>>>>> + break;
>>>>>>>>>>>> + default:
>>>>>>>>>>>> + unreachable("not reached");
>>>>>>>>>>>> + }
>>>>>>>>>>>> +
>>>>>>>>>>>> + ureg_DECL_sampler_view( ureg, i,
>>>>>>>>>>>> program->sampler_targets[i],
>>>>>>>>>>>> + type, type, type, type );
>>>>>>>>>>>> }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This indeed seems like a consistent solution. Hopefully drivers
>>>>>>>>>>> don't
>>>>>>>>>>> assert when they see sampler view dcls...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> well, tgsi_to_nir did before I fixed it :-(
>>>>>>>>>>
>>>>>>>>>> (mostly just because of liberal debug_assert() usage)
>>>>>>>>>>
>>>>>>>>>> it's at least not something I'd land right before a release branch
>>>>>>>>>> point. But I guess the fix is easy enough (ie. remove a bogus
>>>>>>>>>> assert), and 10.7 branch point is quite some time from now..
>>>>>>>>>>
>>>>>>>>>>> I am also somewhat worried that this only changes the glsl-to-tgsi
>>>>>>>>>>> path,
>>>>>>>>>>> not other paths. llvmpipe and draw rely on either having sview
>>>>>>>>>>> declarations for all (sample) instructions or for none, and with
>>>>>>>>>>> draw
>>>>>>>>>>> possibly inserting tex opcodes (for aalines etc.) on its own this
>>>>>>>>>>> could
>>>>>>>>>>> probably break (these draw paths aren't used with d3d10 sample
>>>>>>>>>>> opcodes
>>>>>>>>>>> as it's more or less illegal to use tex and sample opcodes in the
>>>>>>>>>>> same
>>>>>>>>>>> shader, but that doesn't really matter here). I think using sview
>>>>>>>>>>> declarations is the right thing to do but it probably should be
>>>>>>>>>>> illegal
>>>>>>>>>>> then to NOT have them in some places.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> fwiw, the way I implemented it in tgsi_to_nir (and what I'd
>>>>>>>>>> recommend
>>>>>>>>>> for other drivers) is that for the "tex style" opcodes, if there is
>>>>>>>>>> not a matching SVIEW decl, assume float. That seemed more sane than
>>>>>>>>>> fixing up *all* the other state trackers..
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yeah but for llvmpipe/draw it's not just the type. The problem is how
>>>>>>>>> they are parsing this stuff - if there's at least one sview decl
>>>>>>>>> they'll
>>>>>>>>> construct the static texture state based on the defined sampler
>>>>>>>>> views,
>>>>>>>>> but if there's none (up to now for gl state tracker) they'll do this
>>>>>>>>> based on the defined samplers, and for some there simply won't be any
>>>>>>>>> such definitions (if draw inserted some stage). llvmpipe can handle
>>>>>>>>> tex
>>>>>>>>> opcodes without sview dcls, it should also be able to handle them if
>>>>>>>>> you
>>>>>>>>> have them (but I wouldn't bet on it without testing as that's
>>>>>>>>> definitely
>>>>>>>>> going to hit some code paths never seen before), but it can't handle
>>>>>>>>> having "some" sview dcls (see for example lp_state_fs.c, line 3050
>>>>>>>>> and
>>>>>>>>> following why it can't work). This is fixable but only with some
>>>>>>>>> awkwardness (since due to d3d10 sample opcodes not having 1:1 mapping
>>>>>>>>> between samplers and sampler views). Hence my proposal of making it
>>>>>>>>> mandatory to have sview decls instead of having them optionally. That
>>>>>>>>> way could also ditch that code which distinguishes having sviews or
>>>>>>>>> not.
>>>>>>>>
>>>>>>>>
>>>>>>>> hmm, I could squash in something like:
>>>>>>>>
>>>>>>>> ----
>>>>>>>> diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>>>> b/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>>>> index b5ce868..2878c49 100644
>>>>>>>> --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>>>> +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
>>>>>>>> @@ -3059,27 +3059,11 @@ make_variant_key(struct llvmpipe_context *lp,
>>>>>>>> }
>>>>>>>> }
>>>>>>>>
>>>>>>>> - /*
>>>>>>>> - * XXX If TGSI_FILE_SAMPLER_VIEW exists assume all texture opcodes
>>>>>>>> - * are dx10-style? Can't really have mixed opcodes, at least not
>>>>>>>> - * if we want to skip the holes here (without rescanning tgsi).
>>>>>>>> - */
>>>>>>>> - if (shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1) {
>>>>>>>> - key->nr_sampler_views =
>>>>>>>> shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1;
>>>>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) {
>>>>>>>> - if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1
>>>>>>>> << i)) {
>>>>>>>> -
>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state,
>>>>>>>> -
>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
>>>>>>>> - }
>>>>>>>> - }
>>>>>>>> - }
>>>>>>>> - else {
>>>>>>>> - key->nr_sampler_views = key->nr_samplers;
>>>>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) {
>>>>>>>> - if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1 <<
>>>>>>>> i)) {
>>>>>>>> -
>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state,
>>>>>>>> -
>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
>>>>>>>> - }
>>>>>>>> + key->nr_sampler_views = key->nr_samplers;
>>>>>>>> + for(i = 0; i < key->nr_sampler_views; ++i) {
>>>>>>>> + if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1 << i)) {
>>>>>>>> +
>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state,
>>>>>>>> +
>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
>>>>>>>> }
>>>>>>>> }
>>>>>>>> }
>>>>>>>
>>>>>>> No that won't work for d3d10 state trackers, where you typically have
>>>>>>> more
>>>>>>> views than samplers (though the opposite is possible as well and you
>>>>>>> can
>>>>>>> have holes in both). I think it really would look nicer if you could
>>>>>>> just rely on sampler views being present.
>>>>>>
>>>>>>
>>>>>> what d3d10 state tracker?
>>>>>>
>>>>>
>>>>> Make that "some non-public code which uses the sample opcodes"...
>>>>
>>>>
>>>> well, not to be rude, but I can't really be expected to fix /
>>>> avoid-breaking code that doesn't exist (in upstream mesa)..
>>>
>>>
>>> Rob,
>>>
>>> Of course. Nobody expects or asked you to do such thing.
>>>
>>> Roland's just explaining why your proposed llvmpipe workaround is not
>>> sustainable. Breaking our closed state tracker is OK and happens all the
>>> time FYI. What we can't have is irreparable breakage. In short, we just
>>> need to workout a solution that doesn't leave us with our hands tied to fix
>>> it.
>>>
>>> And in fact, Roland's suggestion of requiring all TGSI producers including
>>> SVIEW dcl does not require fixing any closed-source state tracker but rather
>>> just fix TGSI generators in Mesa to put the SVIEW.
>>
>> Hmm, well there are a *lot* of things in mesa which generate tgsi (vl,
>> gallium9, etc).. fixing them all, esp. when I don't necessarily know
>> all the other state trackers well enough to know what they *should* be
>> doing, didn't seem reasonable.
>>
>> I was hoping we could just leave SVIEW decl as optional to avoid
>> having to fix up everything. It just means that something which does
>> use SVIEW/SAMPLE* needs to avoid having SVIEW index's which conflict
>> with SAMP decl indexes.
>>
>> Although maybe w/ the draw stuff, if it is potentially inserting
>> SAMP/TEX* into shaders already using SVIEW/SAMPLE*, maybe it needs to
>> take care to insert SAMP decl's with #'s that start above max SVIEW
>> index. I'm not sure, I haven't used the aux/draw stuff yet, so I'm
>> not terribly familiar with it.
>>
>> Not sure if it would help, but I could add an shader cap and for now
>> only insert the SVIEW decl for drivers that want them. Which I think,
>> at the moment, is just tgsi_to_nir (and currently the two tgsi_to_nir
>> users, vc4 and freedreno, use any of the aux/draw stuff).
>>
>>> Anyway I really like your approach, appreciate your perseverance so far, and
>>> want to see it through. So please just bear with us a bit more.
>>>
>>
>> fair enough.. I was mostly just trying to avoid having to fix massive
>> amounts of state trackers, etc, which I am not necessarily terribly
>> familiar with or even have a way to test.
>>
>>>
>>> Roland, would it be possible to swap the
>>> `shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1` condition with
>>> something else? Just temporarily, so that we can decouple things, allowing
>>> to commit this as is, then enforce the SVIEW everywhere on a 2nd round.
>>>
>>> Specially because enforcing this everywhere will require some work,
>>> sepcially on u_blit / u_mipmap_gen helpers, which need yet another key.
>>>
>>>
>>> For example, what about actually using the TGSI_OPCODE_SAMPLE_* & friends
>>> opcode count? In fact, we could go even farther, and compute the opcount
>>> for TGSI_OPCODE_TEX & friends and ensure that they are never non-zero
>>> simultanouesly.
>>>
>>
>> hmm, possibly something I didn't understand properly.. if you never
>> have both TEX* and SAMPLE* at same time, this sounds much easier..
>> Ie. just use # of opcodes instead of file_max. Could even just add a
>> trivial thing to tgsi_scan to count up # of opcodes of each category
>> (ie. TEX* and SAMPLE*)
>>
>
> That doesn't tell you how many dcls (either samplers or sampler views)
> you actually have. Even with GL reusing the same texture unit with a
> different instruction is of course possible (and quite frequently done).
> Though you are right that because both sample and tex opcodes can't be
> used in the same shader, scanning the shader could determine if it's
> either tex or sample and then use the appropriate shader file if we'd
> want to keep unused views out of the shader. That may also be ok as an
> interim measure.
yeah, that was basically my idea.. not so much caring about the count,
as much of which one is non-zero..
BR,
-R
> Roland
>
>
More information about the mesa-dev
mailing list