[Mesa-dev] [PATCH 2/3] gallium: add texture gather support to gallium (v3)
Dave Airlie
airlied at gmail.com
Mon Feb 24 00:33:12 PST 2014
On Wed, Feb 12, 2014 at 9:10 AM, Roland Scheidegger <sroland at vmware.com> wrote:
> On 11.02.2014 22:58, Dave Airlie wrote:
>>>> dst.z = texture_depth(unit, lod)
>>>>
>>>> +.. opcode:: TG4 - Texture Gather (as per ARB_texture_gather)
>>>> + Gathers the four texels to be used in a bi-linear
>>>> + filtering operation and packs them into a single register.
>>>> + Only works with 2D, 2D array, cubemaps, and cubemap arrays.
>>>> + For 2D textures, only the addressing modes of the sampler and
>>>> + the top level of any mip pyramid are used. Set W to zero.
>>>> + It behaves like the TEX instruction, but a filtered
>>>> + sample is not generated. The four samples that contribute
>>>> + to filtering are placed into xyzw in clockwise order,
>>>> + starting with the (u,v) texture coordinate delta at the
>>>> + following locations (-, +), (+, +), (+, -), (-, -), where
>>>> + the magnitude of the deltas is half a texel.
>>>> +
>>>> + PIPE_CAP_TEXTURE_SM5 enhances this instruction to support
>>>> + shadow per-sample depth compares, single component selection,
>>>> + and a non-constant offset. It doesn't support the GL
>>>> + independent offsets used to get i0,j0. This would require
>>>> + another CAP if hw can do it natively. For now we lower that
>>>> + before TGSI.
>>>> +
>>>> +.. math::
>>>> +
>>>> + coord = src0
>>>> +
>>>> + component = src1
>>>> +
>>>> + dst = texture_gather4 (unit, coord, component)
>>>> +
>>>> +(with SM5 - cube array shadow)
>>>> +
>>>> + coord = src0
>>>> +
>>>> + compare = src1
>>>> +
>>>> + dst = texture_gather (unit, coord, compare)
>>>> +
>>> So how does component selection work with the latter version?
>>> I think it would be nice if you wouldn't really need two versions (so if
>>> you don't support comparisons, the src would just be unused).
>>
>> That's docs not being clear enough if you read it like that. The
>> second version is only for cube array shadow compares, which have no
>> components. The first version is the same for non-shadow compares.
> Ah right that works, I forgot you don't need the channel select with
> shadow comparisons (not that I'm a big fan of such "overloaded" sources
> but that's nothing new really).
>
>>
>>> Also, FWIW for llvmpipe you'd probably want a native 4-offset
>>> version, I don't think llvm could eliminate the huge amount of
>>> duplicated code completely if you generate 4 texture lookups. Of course,
>>> someone would need to implement it first (shouldn't be too difficult).
>>
>> Yeah llvmpipe might be in the category for using the extra CAP, I'm
>> really hoping nvidia hw does do this, but the interface is kinda
>> arbitrary and maybe we should consider another opcode,
>>
>> Since we have for SM5 nonconstant ones something like,
>>
>> TG4 TEMP[1], TEMP[1], SAMP[0], TEMP[2].xyz
>> which will sample around temp[1] i0,j0 - i1, j1 at the offset in temp[2]
>>
>> and
>> TG4 TEMP[1], TEMP[1], SAMP[0], TEMP[2].xyz, TEMP[3].xyz, TEMP[4].xyz,
>> TEMP[5].xyz
>> which will sample i0,j0 from TEMP[1] and the respective offsets.
>>
>
> Yes since the offsets are in separate offset structure and the amount of
> offsets is indicated I think it should just work actually if a driver
> wants to implement multiple offsets natively.
So are you okay with this version? I think it covers everything, and we
can add a CAP if/when someone works out hw/llvmpipe for the 4 offset case.
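
For anyone following along, here is a rough C sketch of the gather order
from the docs hunk above and of the 4 offset lowering we do before TGSI.
It is not driver code. The function names (floor_to_int, clamp_index,
gather4_2d, gather4_offsets_lowered), the single-channel row-major float
texture, and the clamp-to-edge addressing are all assumptions made up for
illustration. The lowering is assumed to take the i0,j0 texel (the w
result) from each single-offset gather.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: floor of a float as an int, without libm. */
static int floor_to_int(float x)
{
    int i = (int)x;                 /* truncates toward zero */
    return ((float)i > x) ? i - 1 : i;
}

/* Hypothetical helper: clamp-to-edge addressing for one axis. */
static int clamp_index(int i, int size)
{
    if (i < 0)
        return 0;
    if (i >= size)
        return size - 1;
    return i;
}

/*
 * Hypothetical reference gather on a single-channel 2D texture stored
 * row-major (texel (i, j) lives at texels[j * width + i]).
 * (u, v) are normalized coordinates, (dx, dy) is one texel offset.
 * Results are written in the order given in the docs:
 * (-, +), (+, +), (+, -), (-, -), i.e. x=(i0,j1) y=(i1,j1) z=(i1,j0) w=(i0,j0).
 */
static void gather4_2d(const float *texels, int width, int height,
                       float u, float v, int dx, int dy, float out[4])
{
    assert(texels != NULL && width > 0 && height > 0);

    /* Move to texel space and back off by half a texel. */
    int i0 = floor_to_int(u * (float)width - 0.5f) + dx;
    int j0 = floor_to_int(v * (float)height - 0.5f) + dy;
    int i1 = clamp_index(i0 + 1, width);
    int j1 = clamp_index(j0 + 1, height);
    i0 = clamp_index(i0, width);
    j0 = clamp_index(j0, height);

    out[0] = texels[j1 * width + i0];   /* x: (-, +) */
    out[1] = texels[j1 * width + i1];   /* y: (+, +) */
    out[2] = texels[j0 * width + i1];   /* z: (+, -) */
    out[3] = texels[j0 * width + i0];   /* w: (-, -), the i0,j0 texel */
}

/*
 * Sketch of the 4 offset lowering: one single-offset gather per offset,
 * keeping only the w result (i0,j0) of each. This choice of component
 * is an assumption based on the "i0,j0" wording in the thread.
 */
static void gather4_offsets_lowered(const float *texels, int width, int height,
                                    float u, float v,
                                    const int offsets[4][2], float out[4])
{
    for (int k = 0; k < 4; k++) {
        float tmp[4];
        gather4_2d(texels, width, height, u, v,
                   offsets[k][0], offsets[k][1], tmp);
        out[k] = tmp[3];
    }
}
```

The lowered version runs the whole addressing path four times to produce
four texels, which is the duplicated work Roland is worried about for
llvmpipe and why a native 4 offset path might deserve its own CAP later.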
Dave.