[Mesa-dev] [PATCH 1/2] gallium: add texture gather support to gallium

Roland Scheidegger sroland at vmware.com
Fri Feb 7 15:57:42 PST 2014


On 07.02.2014 23:25, Dave Airlie wrote:
>>>>
>>> Doh, yes because GL has ARB_texture_gather then has stuff hidden away
>>> in ARB_gpu_shader5 I forgot to add the extra bits which I suppose we should do.
>>>
>>> So I've reposted with the component selection in src1 now.
>>
>> Hmm seems a bit excessive to use an extra reg for that (gather4 but only
>> in d3d11 form uses a src_sel on the sampler reg, but that might not work).
>> I realize this is actually more messy than I thought, since the initial
>> ARB_texture_gather had the ability to query if multi-channel formats are
>> allowed, but had no way to select the channel (somewhat relying on
>> ARB_texture_swizzle to do it, though of course you can't issue multiple
>> gathers with the same texture to get different channels that way).
>> But glsl 4.00 version could select the channel.
>> Is the ARB_texture_gather version actually all that useful or could you
>> merge the two caps? That is, if you have the ability to fetch from
>> multi-channel textures, assume you can also select the channel. The sm4
>> version of gather4 also has the single-channel format restriction - I
>> guess though some hw really can do 4 channels without channel selection.
> 
> Yeah, I think I'll rethink this stuff. It looks like two caps: one for
> MAX_COMPONENTS for ARB_texture_gather, and just one cap for
> TEXTURE_GATHER_SM5 support, which would denote support for all the
> ARB_gpu_shader5 bits.
> 
>> Other than that, what about shadow samplers? Gather4 of course can't do
>> it (because the d3d10-style opcodes have different opcodes for shadow
>> comparisons), but the GL style opcodes are usually the same if shadow
>> samplers or not are used. Maybe you don't want to handle that right now,
>> just saying that if you'd want to use the same opcode you'd be missing a
>> component in case of texture cube arrays... Since this can't be used for
>> fixed function though I'd guess nothing would stop you from using a
>> different opcode for shadow samplers.
> 
> 
> I've gotten shadow samplers to work with the current opcodes, though I
> have to see about cube arrays, where we may be running out of space to
> put everything.
> 
> Also the GPU_shader5 spec has a few more oddities, so you have
> textureGatherOffset which can take a non-constant set of offset values
> to apply to all 4 texels, then you have textureGatherOffsets which
> only takes constants again, but 4 of them, one per texel. Looking at
> radeon hw it appears fglrx decomposes textureGatherOffsets into
> multiple gather instructions at the hw level but using the
> non-constant hw support to do this. So I'm not sure if the gallium
> interface should just support non-constant for all offsets and just
> restrict the GL.
> 
> I've reworked the state tracker code already,
>  http://cgit.freedesktop.org/~airlied/mesa/commit/?h=r600g-texture-gather&id=444bc1c8118d51600a58af8a84088e94d0800b22
> 
> but I suspect I've a bit further down the rabbit hole to go.
> 
> Dave
> 

Hmm yes, it's fairly interesting that there are so many different
possibilities. It looks like the non-constant offsets were borrowed from
d3d11, but 4 different constant offsets seem to be a GL-exclusive feature.
(Though if you have to issue that as 4 individual gathers, there's not
really that much left of "gather"...). Interestingly, if I read the docs
right, some hw (radeonsi) could actually support doing gathers with
ordinary mipmap sampling rather than just using mip 0 :-).
In any case, at least the cap bits could be changed rather easily, so
even your original proposal would be ok by me, and more cap bits could
be added later (or some replaced). Instructions are a bit harder to
change, though. I'm not sure you'd really want one which can encode
everything; maybe the non-constant offset one should have a separate
encoding, as it might get messy otherwise. I dunno though, whatever
works...
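
To make the "4 individual gathers" point concrete, here is a rough
software sketch (not Mesa or driver code) of how a textureGatherOffsets
style fetch could be decomposed. Everything in it is an assumption made
for illustration: the tiny single-channel texture, the helper names
fetch/gather4/gather4_offsets, the clamp-to-edge behaviour, and the use
of integer texel coordinates for the footprint instead of normalized
ones. The component order (i0j1, i1j1, i1j0, i0j0) and the selection of
the i0j0 texel per offset follow the GL definition.

```c
#define TEX_W 4
#define TEX_H 4

/* Hypothetical single-channel 4x4 texture, indexed as tex[j][i]. */
static const int tex[TEX_H][TEX_W] = {
    {  0,  1,  2,  3 },
    { 10, 11, 12, 13 },
    { 20, 21, 22, 23 },
    { 30, 31, 32, 33 },
};

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Fetch one texel; clamp-to-edge is an assumption of this sketch. */
static int fetch(int i, int j)
{
    return tex[clampi(j, 0, TEX_H - 1)][clampi(i, 0, TEX_W - 1)];
}

/* Plain gather: return the 2x2 footprint whose i0j0 texel is (i0, j0),
 * in GL component order x = i0j1, y = i1j1, z = i1j0, w = i0j0. */
static void gather4(int i0, int j0, int out[4])
{
    out[0] = fetch(i0,     j0 + 1);
    out[1] = fetch(i0 + 1, j0 + 1);
    out[2] = fetch(i0 + 1, j0);
    out[3] = fetch(i0,     j0);
}

/* Per-texel offsets: one full gather per offset, keeping only the
 * i0j0 texel (component w) of each. Three of the four texels fetched
 * by every gather are thrown away. */
static void gather4_offsets(int i0, int j0, const int offsets[4][2],
                            int out[4])
{
    for (int k = 0; k < 4; k++) {
        int tmp[4];
        gather4(i0 + offsets[k][0], j0 + offsets[k][1], tmp);
        out[k] = tmp[3];
    }
}
```

With offsets {0,0}, {1,0}, {0,1}, {-1,-1} around texel (1,1), this does
four gathers to produce four texels, which is the cost the decomposed
form pays compared to a single gather.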

Roland

