[Mesa-dev] [PATCH 2/3] gallium: add texture gather support to gallium (v3)
Roland Scheidegger
sroland at vmware.com
Tue Feb 11 15:10:35 PST 2014
On 11.02.2014 22:58, Dave Airlie wrote:
>>> dst.z = texture_depth(unit, lod)
>>>
>>> +.. opcode:: TG4 - Texture Gather (as per ARB_texture_gather)
>>> + Gathers the four texels to be used in a bi-linear
>>> + filtering operation and packs them into a single register.
>>> + Only works with 2D, 2D array, cubemaps, and cubemap arrays.
>>> + For 2D textures, only the addressing modes of the sampler and
>>> + the top level of any mip pyramid are used. Set W to zero.
>>> + It behaves like the TEX instruction, but a filtered
>>> + sample is not generated. The four samples that contribute
>>> + to filtering are placed into xyzw in clockwise order,
>>> + starting with the (u,v) texture coordinate delta at the
>>> + following locations (-, +), (+, +), (+, -), (-, -), where
>>> + the magnitude of the deltas is half a texel.
>>> +
>>> + PIPE_CAP_TEXTURE_SM5 enhances this instruction to support
>>> + shadow per-sample depth compares, single component selection,
>>> + and a non-constant offset. It doesn't allow support for the
>>> + GL independent offset to get i0,j0. This would require another
>>> + CAP if hw can do it natively. For now we lower that before
>>> + TGSI.
>>> +
>>> +.. math::
>>> +
>>> + coord = src0
>>> +
>>> + component = src1
>>> +
>>> + dst = texture_gather4 (unit, coord, component)
>>> +
>>> +(with SM5 - cube array shadow)
>>> +
>>> + coord = src0
>>> +
>>> + compare = src1
>>> +
>>> + dst = texture_gather (unit, coord, compare)
>>> +
>> So how does component selection work with the latter version?
>> I think it would be nice if you wouldn't really need two versions (so if
>> you don't support comparisons, the src would just be unused).
>
> That's docs not being clear enough if you read it like that. The
> second version is only for cube array shadow compares, which have no
> components. The first version is the same for non-shadow compares.
Ah right that works, I forgot you don't need the channel select with
shadow comparisons (not that I'm a big fan of such "overloaded" sources
but that's nothing new really).
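
For reference, the gather order described in the quoted docs can be
sketched in plain C. This is only an illustration, not Mesa code:
gather4_ref, clamp_texel and floor_to_int are hypothetical names, and the
sketch assumes an RGBA float texture, clamp-to-edge addressing and mip
level 0 only.

```c
#include <assert.h>

/* Hypothetical helper: floor a float to int without needing libm. */
static int floor_to_int(float x)
{
    int i = (int)x;
    if ((float)i > x)
        i--;
    return i;
}

/* Hypothetical helper: clamp a texel index to [0, size - 1]. */
static int clamp_texel(int i, int size)
{
    if (i < 0)
        return 0;
    if (i >= size)
        return size - 1;
    return i;
}

/*
 * Hypothetical reference gather for a w x h texture stored row-major,
 * four floats per texel. (u, v) are normalized coordinates.
 * Assumption: clamp-to-edge addressing, top mip level only.
 */
static void gather4_ref(const float *texels, int w, int h,
                        float u, float v, int component, float out[4])
{
    /* Half-texel deltas in the documented order: (-,+), (+,+), (+,-), (-,-). */
    static const float du[4] = { -0.5f,  0.5f,  0.5f, -0.5f };
    static const float dv[4] = {  0.5f,  0.5f, -0.5f, -0.5f };
    int i;

    assert(component >= 0 && component < 4);

    for (i = 0; i < 4; i++) {
        int x = clamp_texel(floor_to_int(u * (float)w + du[i]), w);
        int y = clamp_texel(floor_to_int(v * (float)h + dv[i]), h);
        /* Only the selected channel of each texel is returned. */
        out[i] = texels[(y * w + x) * 4 + component];
    }
}
```

The shadow variant would replace the channel fetch with a per-texel depth
compare, which is why it has no use for a component select.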
>
>> Also, FWIW for llvmpipe you'd probably want a native 4-offset
>> version; I don't think llvm could completely eliminate the huge amount
>> of duplicated code if you generate 4 texture lookups. Of course,
>> someone would need to implement it first (shouldn't be too difficult).
>
> Yeah, llvmpipe might be in the category for using the extra CAP. I'm
> really hoping nvidia hw does do this, but the interface is kinda
> arbitrary and maybe we should consider another opcode.
>
> Since for the SM5 non-constant offsets we have something like
>
> TG4 TEMP[1], TEMP[1], SAMP[0], TEMP[2].xyz
> which will sample i0,j0 through i1,j1 around TEMP[1] at the offset in TEMP[2]
>
> and
> TG4 TEMP[1], TEMP[1], SAMP[0], TEMP[2].xyz, TEMP[3].xyz, TEMP[4].xyz,
> TEMP[5].xyz
> which will sample i0,j0 from TEMP[1] and the respective offsets.
>
Yes, since the offsets are in a separate offset structure and the number
of offsets is indicated, I think it should just work if a driver
wants to implement multiple offsets natively.
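
A rough sketch of what dispatching on the offset count could look like,
in plain C. Nothing here is the real TGSI interface: struct gather_offsets
and gather_texel_coords are made up for illustration, and the 4-offset
case assumes the GL textureGatherOffsets behaviour (each result channel
takes the i0,j0 texel at its own offset).

```c
#include <assert.h>

/* Hypothetical stand-in for a per-instruction offset list. */
struct gather_offsets {
    int count;      /* 0, 1 or 4 in this sketch */
    int ofs[4][2];  /* (x, y) texel offsets */
};

/*
 * Compute the integer texel coordinates feeding each of the four result
 * channels, given the base texel (i0, j0) of the footprint.
 */
static void gather_texel_coords(int i0, int j0,
                                const struct gather_offsets *o,
                                int out[4][2])
{
    /* Footprint corners in result order: i0j1, i1j1, i1j0, i0j0. */
    static const int corner[4][2] = { {0, 1}, {1, 1}, {1, 0}, {0, 0} };
    int k;

    assert(o->count == 0 || o->count == 1 || o->count == 4);

    for (k = 0; k < 4; k++) {
        if (o->count == 4) {
            /* Independent offsets: i0,j0 of the footprint at each offset. */
            out[k][0] = i0 + o->ofs[k][0];
            out[k][1] = j0 + o->ofs[k][1];
        } else {
            /* Zero or one offset: the whole 2x2 footprint is shifted. */
            int sx = o->count ? o->ofs[0][0] : 0;
            int sy = o->count ? o->ofs[0][1] : 0;
            out[k][0] = i0 + corner[k][0] + sx;
            out[k][1] = j0 + corner[k][1] + sy;
        }
    }
}
```

A driver without native support would take the count == 4 path as four
separate lookups, which is the lowering mentioned above.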
Roland