[Mesa-dev] [PATCH 2/3] gallium: add texture gather support to gallium (v3)

Roland Scheidegger sroland at vmware.com
Tue Feb 11 15:10:35 PST 2014


On 11.02.2014 22:58, Dave Airlie wrote:
>>>    dst.z = texture_depth(unit, lod)
>>>
>>> +.. opcode:: TG4 - Texture Gather (as per ARB_texture_gather)
>>> +               Gathers the four texels to be used in a bi-linear
>>> +               filtering operation and packs them into a single register.
>>> +               Only works with 2D, 2D array, cubemaps, and cubemap arrays.
>>> +               For 2D textures, only the addressing modes of the sampler and
>>> +               the top level of any mip pyramid are used. Set W to zero.
>>> +               It behaves like the TEX instruction, but a filtered
>>> +               sample is not generated. The four samples that contribute
>>> +               to filtering are placed into xyzw in clockwise order,
>>> +               starting with the (u,v) texture coordinate delta at the
>>> +               following locations (-, +), (+, +), (+, -), (-, -), where
>>> +               the magnitude of each delta is half a texel.
>>> +
>>> +               PIPE_CAP_TEXTURE_SM5 enhances this instruction to support
>>> +               shadow per-sample depth compares, single component selection,
>>> +               and a non-constant offset. It doesn't support the GL
>>> +               independent offsets to get i0,j0. This would require another
>>> +               CAP if hw can do it natively. For now we lower that before
>>> +               TGSI.
>>> +
>>> +.. math::
>>> +
>>> +   coord = src0
>>> +
>>> +   component = src1
>>> +
>>> +   dst = texture_gather4 (unit, coord, component)
>>> +
>>> +(with SM5 - cube array shadow)
>>> +
>>> +   coord = src0
>>> +
>>> +   compare = src1
>>> +
>>> +   dst = texture_gather (unit, coord, compare)
>>> +
>> So how does component selection work with the latter version?
>> I think it would be nice if you wouldn't really need two versions (so if
>> you don't support comparisons, the src would just be unused).
> 
> That's docs not being clear enough if you read it like that. The
> second version is only for cube array shadow compares, which have no
> components. The first version is the same for non-shadow compares.
Ah right, that works. I forgot you don't need the channel select with
shadow comparisons (not that I'm a big fan of such "overloaded" sources,
but that's nothing new really).

> 
>> Also, FWIW for llvmpipe you'd probably want a native 4-offset
>> version; I don't think llvm could eliminate the huge amount of
>> duplicated code completely if you generate 4 texture lookups. Of course,
>> someone would need to implement it first (shouldn't be too difficult).
> 
> Yeah llvmpipe might be in the category for using the extra CAP, I'm
> really hoping nvidia hw does do this, but the interface is kinda
> arbitrary and maybe we should consider another opcode,
> 
> Since we have for SM5 nonconstant ones something like,
> 
> TG4 TEMP[1], TEMP[1], SAMP[0] , TEMP[2].xyz
> which will sample around temp[1] i0,j0 - i1, j1 at the offset in temp[2]
> 
> and
> TG4 TEMP[1], TEMP[1], SAMP[0], TEMP[2].xyz, TEMP[3].xyz, TEMP[4].xyz,
> TEMP[5].xyz
> which will sample i0,j0 from TEMP[1] and the respective offsets.
> 

Yes, since the offsets are in a separate offset structure and the number of
offsets is indicated, I think it should actually just work if a driver
wants to implement multiple offsets natively.
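
Roughly, the two cases could be told apart by the offset count alone. The
sketch below only illustrates that idea; the helper names are made up,
offsets are assumed to be integer texel offsets, and clamp-to-edge
addressing is assumed. With one offset the whole 2x2 footprint is shifted;
with four offsets each result is the (i0,j0) texel of the footprint at the
respective offset, which is also what lowering to four single-offset
gathers would select.

```python
import math


def footprint(u, v, width, height, offset=(0, 0)):
    """2x2 footprint as texel coords in x, y, z, w order: i0j1, i1j1, i1j0, i0j0."""
    def clamp(i, size):
        # Assumed clamp-to-edge addressing.
        return max(0, min(size - 1, i))

    i0 = math.floor(u * width - 0.5) + offset[0]
    j0 = math.floor(v * height - 0.5) + offset[1]
    raw = [(i0, j0 + 1), (i0 + 1, j0 + 1), (i0 + 1, j0), (i0, j0)]
    return [(clamp(i, width), clamp(j, height)) for i, j in raw]


def gather_texels(u, v, width, height, offsets):
    """Pick the gathered texels based on how many offsets are supplied."""
    if len(offsets) == 1:
        # Single offset: shift the whole footprint.
        return footprint(u, v, width, height, offsets[0])
    if len(offsets) == 4:
        # Four offsets: take i0,j0 (the w slot) of each shifted footprint.
        return [footprint(u, v, width, height, off)[3] for off in offsets]
    raise ValueError("expected 1 or 4 offsets")
```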

Roland

