[Mesa-dev] [PATCH 2/3] gallium: add texture gather support to gallium (v3)

Roland Scheidegger sroland at vmware.com
Mon Feb 24 08:17:43 PST 2014


On 24.02.2014 09:33, Dave Airlie wrote:
> On Wed, Feb 12, 2014 at 9:10 AM, Roland Scheidegger <sroland at vmware.com> wrote:
>> On 11.02.2014 22:58, Dave Airlie wrote:
>>>>>    dst.z = texture_depth(unit, lod)
>>>>>
>>>>> +.. opcode:: TG4 - Texture Gather (as per ARB_texture_gather)
>>>>> +               Gathers the four texels to be used in a bi-linear
>>>>> +               filtering operation and packs them into a single register.
>>>>> +               Only works with 2D, 2D array, cubemaps, and cubemap arrays.
>>>>> +               For 2D textures, only the addressing modes of the sampler and
>>>>> +               the top level of any mip pyramid are used. Set W to zero.
>>>>> +               It behaves like the TEX instruction, but a filtered
>>>>> +               sample is not generated. The four samples that contribute
>>>>> +               to filtering are placed into xyzw in clockwise order,
>>>>> +               starting with the (u,v) texture coordinate delta at the
>>>>> +               following locations (-, +), (+, +), (+, -), (-, -), where
>>>>> +               the magnitude of each delta is half a texel.
>>>>> +
>>>>> +               PIPE_CAP_TEXTURE_SM5 enhances this instruction to support
>>>>> +               shadow per-sample depth compares, single component selection,
>>>>> +               and a non-constant offset. It doesn't allow support for the
>>>>> +               GL independent offset to get i0,j0. This would require another
>>>>> +               CAP if hw can do it natively. For now we lower that before
>>>>> +               TGSI.
>>>>> +
>>>>> +.. math::
>>>>> +
>>>>> +   coord = src0
>>>>> +
>>>>> +   component = src1
>>>>> +
>>>>> +   dst = texture_gather4 (unit, coord, component)
>>>>> +
>>>>> +(with SM5 - cube array shadow)
>>>>> +
>>>>> +   coord = src0
>>>>> +
>>>>> +   compare = src1
>>>>> +
>>>>> +   dst = texture_gather (unit, coord, compare)
>>>>> +
>>>> So how does component selection work with the latter version?
>>>> I think it would be nice if you wouldn't really need two versions (so if
>>>> you don't support comparisons, the src would just be unused).
>>>
>>> That's docs not being clear enough if you read it like that. The
>>> second version is only for cube array shadow compares, which have no
>>> components. The first version is the same for non-shadow compares.
>> Ah right that works, I forgot you don't need the channel select with
>> shadow comparisons (not that I'm a big fan of such "overloaded" sources
>> but that's nothing new really).
>>
>>>
>>>> Also, FWIW for llvmpipe you'd probably want a native 4-offset
>>>> version; I don't think llvm could eliminate the huge amount of
>>>> duplicated code completely if you generate 4 texture lookups. Of course,
>>>> someone would need to implement it first (shouldn't be too difficult).
>>>
>>> Yeah llvmpipe might be in the category for using the extra CAP, I'm
>>> really hoping nvidia hw does do this, but the interface is kinda
>>> arbitrary and maybe we should consider another opcode,
>>>
>>> Since we have for SM5 nonconstant ones something like,
>>>
>>> TG4 TEMP[1], TEMP[1], SAMP[0] , TEMP[2].xyz
>>> which will sample around temp[1] i0,j0 - i1, j1 at the offset in temp[2]
>>>
>>> and
>>> TG4 TEMP[1], TEMP[1], SAMP[0], TEMP[2].xyz, TEMP[3].xyz, TEMP[4].xyz,
>>> TEMP[5].xyz
>>> which will sample i0,j0 from TEMP[1] and the respective offsets.
>>>
>>
>> Yes, since the offsets are in a separate offset structure and the number of
>> offsets is indicated, I think it should just work actually if a driver
>> wants to implement multiple offsets natively.
> 
> So are you okay with this version? I think it covers everything, and we can
> add a CAP if/when someone works out hw/llvmpipe for the 4 offset case.
> 
> Dave
> 

Yes, looks good to me.
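
For reference, the gather order from the doc text and the "lower the 4 offset
case to four lookups" idea can be sketched in plain C roughly as below. This is
only an illustration under assumptions, not the patch's code: struct tex2d,
fetch(), gather4() and gather4_offsets() are made-up names, the texture is
single-channel, and clamp-to-edge addressing is hardcoded to keep it short.

```c
/* Hypothetical single-channel 2D texture; not a Mesa/gallium structure. */
struct tex2d {
   int width, height;
   const float *texels;   /* row-major, width * height entries */
};

/* Floor without libm: the cast truncates toward zero, so fix up negatives. */
static int floor_to_int(float f)
{
   int i = (int)f;
   return (f < (float)i) ? i - 1 : i;
}

/* Clamp-to-edge addressing (an assumption made for this sketch only). */
static int clamp_coord(int i, int size)
{
   if (i < 0)
      return 0;
   if (i >= size)
      return size - 1;
   return i;
}

static float fetch(const struct tex2d *t, int x, int y)
{
   return t->texels[clamp_coord(y, t->height) * t->width +
                    clamp_coord(x, t->width)];
}

/* Gather the bilinear footprint around normalized (u, v), shifted by an
 * integer texel offset (ox, oy). out[0..3] correspond to dst.xyzw. */
static void gather4(const struct tex2d *t, float u, float v,
                    int ox, int oy, float out[4])
{
   /* (i0, j0) is the texel half a texel below/left of the sample point. */
   int i0 = floor_to_int(u * (float)t->width - 0.5f) + ox;
   int j0 = floor_to_int(v * (float)t->height - 0.5f) + oy;

   out[0] = fetch(t, i0,     j0 + 1);   /* x: (-, +) */
   out[1] = fetch(t, i0 + 1, j0 + 1);   /* y: (+, +) */
   out[2] = fetch(t, i0 + 1, j0);       /* z: (+, -) */
   out[3] = fetch(t, i0,     j0);       /* w: (-, -), i.e. i0,j0 */
}

/* Lowered 4-offset variant: one full gather per offset, keeping only the
 * i0,j0 texel (the w result) of each. */
static void gather4_offsets(const struct tex2d *t, float u, float v,
                            const int offs[4][2], float out[4])
{
   for (int k = 0; k < 4; k++) {
      float tmp[4];
      gather4(t, u, v, offs[k][0], offs[k][1], tmp);
      out[k] = tmp[3];
   }
}
```

The lowered version fetches 16 texels to keep 4, which is the duplicated work
a native 4 offset path would avoid.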

Roland

