[Mesa-dev] [PATCH 28/53] st/nine: Match REP implementation to LOOP
Ilia Mirkin
imirkin at alum.mit.edu
Wed Jan 7 14:41:55 PST 2015
On Wed, Jan 7, 2015 at 5:33 PM, Axel Davy <axel.davy at ens.fr> wrote:
> Le 07/01/2015 21:13, Ilia Mirkin a écrit :
>
>> On Wed, Jan 7, 2015 at 11:36 AM, Axel Davy <axel.davy at ens.fr> wrote:
>>>
>>> Previous implementation was fine,
>>> just instead of having increasing counter,
>>> have a decreasing counter.
>>>
>>> Signed-off-by: Axel Davy <axel.davy at ens.fr>
>>> ---
>>> src/gallium/state_trackers/nine/nine_shader.c | 41
>>> +++++++++++++++------------
>>> 1 file changed, 23 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/src/gallium/state_trackers/nine/nine_shader.c
>>> b/src/gallium/state_trackers/nine/nine_shader.c
>>> index 21b06ce..88d4c07 100644
>>> --- a/src/gallium/state_trackers/nine/nine_shader.c
>>> +++ b/src/gallium/state_trackers/nine/nine_shader.c
>>> @@ -1562,9 +1562,7 @@ DECL_SPECIAL(REP)
>>> unsigned *label;
>>> struct ureg_src rep = tx_src_param(tx, &tx->insn.src[0]);
>>> struct ureg_dst ctr;
>>> - struct ureg_dst tmp = tx_scratch_scalar(tx);
>>> - struct ureg_src imm =
>>> - tx->native_integers ? ureg_imm1u(ureg, 0) : ureg_imm1f(ureg,
>>> 0.0f);
>>> + struct ureg_dst tmp;
>>>
>>> label = tx_bgnloop(tx);
>>> ctr = tx_get_loopctr(tx, FALSE);
>>> @@ -1572,33 +1570,40 @@ DECL_SPECIAL(REP)
>>> /* NOTE: rep must be constant, so we don't have to save the count
>>> */
>>> assert(rep.File == TGSI_FILE_CONSTANT || rep.File ==
>>> TGSI_FILE_IMMEDIATE);
>>>
>>> - ureg_MOV(ureg, ctr, imm);
>>> + ureg_MOV(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0), rep);
>>> + /* in the case ctr is float, remove 0.5 to avoid precision issues
>>> for comparisons */
>>> + if (!tx->native_integers)
>>> + ureg_ADD(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0),
>>> ureg_src(ctr), ureg_imm1f(ureg, -0.5f));
>>> +
>>> ureg_BGNLOOP(ureg, label);
>>> - if (tx->native_integers)
>>> - {
>>> - ureg_USGE(ureg, tmp, tx_src_scalar(ctr), rep);
>>> - ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
>>> - }
>>> - else
>>> - {
>>> - ureg_SGE(ureg, tmp, tx_src_scalar(ctr), rep);
>>> + tmp = tx_scratch_scalar(tx);
>>> +
>>> + /* stop when crt.x <= 0 */
>>
>> ctr.x
>>
>>> + if (!tx->native_integers) {
>>> + ureg_SLE(ureg, tmp, ureg_scalar(ureg_src(ctr), TGSI_SWIZZLE_X),
>>> ureg_imm1f(ureg, 0.0f));
>>
>> It would be less confusing if this were the same as below, i.e. switch
>> it to SGT. Or switch the other one to ISLT. Having one be greater and
>> one less-than is weird.
>>
>>> ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
>>> + } else {
>>> + ureg_ISGE(ureg, tmp, ureg_imm1i(ureg, 0),
>>> ureg_scalar(ureg_src(ctr), TGSI_SWIZZLE_X));
>>> + ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
>>> }
>>> ureg_BRK(ureg);
>>> tx_endcond(tx);
>>> ureg_ENDIF(ureg);
>>>
>>> - if (tx->native_integers) {
>>> - ureg_UADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1u(ureg, 1));
>>> - } else {
>>> - ureg_ADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1f(ureg, 1.0f));
>>> - }
>>> -
>>> return D3D_OK;
>>> }
>>>
>>> DECL_SPECIAL(ENDREP)
>>> {
>>> + struct ureg_program *ureg = tx->ureg;
>>> + struct ureg_dst ctr = tx_get_loopctr(tx, FALSE);
>>> +
>>> + if (!tx->native_integers) {
>>> + ureg_ADD(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0),
>>> ureg_src(ctr), ureg_imm1f(ureg, -1.0f));
>>> + } else {
>>> + ureg_UADD(ureg, ureg_writemask(ctr, NINED3DSP_WRITEMASK_0),
>>> ureg_src(ctr), ureg_imm1i(ureg, -1.0));
>>
>> -1 -- presumably this is an integer...
>
> Yep, should be -1. Strange compiler didn't warn.
>>
>>
>> But at what point does ctr get converted to an integer? At the
>> beginning, when you subtract 0.5, it appears to still be a float... I
>> think that ADD is wrong and needs to be a UADD? (But then obviously
>> -0.5 doesn't make sense).
>
> ctr is either an integer from the start, or not.
>
> When the card doesn't support integers, ctr is a float. To avoid precision
> issues when doing comparison to 0, I choosed to substract 0.5 to ctr.
>
> I realise now it makes more sense to compare ctr to 0.5 and not substract
> anything at the start.
Oh oops, I misread that add as happening if native ints, not if
!native ints. Having both paths being SGE (or SLT) would definitely
make for easier reading.
-ilia
More information about the mesa-dev
mailing list