[Mesa-dev] [RFC] llvmpipe texture coordinate rounding
Roland Scheidegger
sroland at vmware.com
Fri Feb 14 14:35:55 PST 2014
Am 14.02.2014 19:59, schrieb Jeff Muizelaar:
>
> On Feb 14, 2014, at 1:00 PM, Roland Scheidegger <sroland at vmware.com
> <mailto:sroland at vmware.com>> wrote:
>
>> Am 14.02.2014 18:07, schrieb Jeff Muizelaar:
>>
>> I'll need to take another look and run some tests, though I've got some
>> quick comments:
>>
>>
>> @@ -1031,16 +1082,28 @@ lp_build_sample_image_linear(struct
>> lp_build_sample_context *bld,
>> s = lp_build_mul_imm(&bld->coord_bld, s, 256);
>> if (dims >= 2)
>> t = lp_build_mul_imm(&bld->coord_bld, t, 256);
>> if (dims >= 3)
>> r = lp_build_mul_imm(&bld->coord_bld, r, 256);
>> }
>>
>> /* convert float to int */
>> + half = lp_build_const_vec(bld->gallivm, bld->coord_bld.type, 0.5);
>> + s = lp_build_add(&bld->coord_bld, s, half);
>> + s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
>> + if (dims >= 2) {
>> + t = lp_build_add(&bld->coord_bld, t, half);
>> + t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
>> + }
>> + if (dims >= 3) {
>> + r = lp_build_add(&bld->coord_bld, r, half);
>> + r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
>> + }
>> +
>> s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
>> if (dims >= 2)
>> t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
>> if (dims >= 3)
>> r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
>> This looks quite incorrect you're converting the s/t/r coords twice from
>> float to int.
>
> Yep. I forgot to remove this second hunk.
>
>> /* subtract 0.5 (add -128) */
>> i32_c128 = lp_build_const_int_vec(bld->gallivm, i32.type, -128);
>>
>>
>> Also, the add looks iffy as it won't work correctly if the coords are
>> negative, since the FPToSI is of course trunc, not floor.
>
> I think it will be ok because the REPEAT case avoids negative coord
> before converting to int and the other cases clamp to 0.
I think for clamp_to_edge you're right, but repeat will use these coords
for the POT case (npot indeed won't care at all since it doesn't
actually use these values at all).
>
>> Maybe instead of using add + fptosi should just use lp_build_iround
>> (which is just one sse instruction too on x86 though if you're targeting
>> another arch it will definitely be more code at least unless someone
>> adds an intrinsic for it if the cpu even has one). Might not matter
>> though depending on address mode…
>
> Yeah, that might be a better idea.
>
>>
>> And I might be missing something why do you think the new repeat code is
>> faster? Though that might also depend on arch_rounding being available
>> and such but at first looks it seems slightly more complex to me.
>
> The current code converts integer and fractional parts to integer
> separately. It also does the subtract 0.5 in floating point instead of
> integer arithmetic (-128).
Yeah, you're probably right. With simple instruction counting I end up
with the same number of instructions actually (assuming arch_rounding is
available, I miscounted previously and thought it were two more), but
the ones in your version ought to be a bit cheaper.
I suspect actually nearest filtering suffers from the same inaccuracy,
we actually also do the mul by 256 so we get 8 fractional bits, then
toss away those 8 bits and just use the non-fractional part. That looks
to me like we only need these 8 bits so we get "reasonably correct"
rounding, but we still would chose the wrong sampling point sometimes
(if it's less than 1/256 pixels away from the center between two
texels). I actually noticed that one a while ago but I can't remember
why I didn't do something about it...
Roland
More information about the mesa-dev
mailing list