[Mesa-dev] [RFC] llvmpipe texture coordinate rounding

Fri Feb 14 10:00:19 PST 2014

On 14.02.2014 18:07, Jeff Muizelaar wrote:
> In doing some testing we’ve noticed that trying to draw pixel aligned
> textures does not work very well with linear filtering in llvmpipe.
> 
> Here’s an example of the problem.
> 
> Imagine wanting to paint a 100x100 texture. After being scaled by 100
> the texture coordinates will end up as:
> 0.5, 1.5, 2.5, 3.5, 4.5..
> 
> These are then multiplied by 256 and converted to integers giving us:
> 128, 384, 640, 896, 1152..
> 
> We subtract the 128:
> 0, 256, 512, 768, 1024..
> 
> Then mask to get the fractional component and shift to get the integer
> component:
> 0,0, 1,0, 2,0, 3,0, 4,0...
> 
> However, if for example 3.5 ends up as 3.4999999 we get:
> 895.9999744 -> 895 -> 767 -> 2,255 instead of 3,0
> 
> When we lerp using this value we end up including some of the pixel
> value at 2 instead of just at 3.
> 
> If we add 0.5 before converting to an integer this problem goes away.
> 
> The attached patch does this. It also changes the REPEAT mode code to
> use similar integer conversion code as the non-REPEAT path. The new
> generated code should be more efficient than the old code.
> 
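For reference, the arithmetic quoted above can be sketched in plain C.
This is only an illustration under stated assumptions, not the llvmpipe
code: split_coord and fixed_coord are hypothetical names, the cast
stands in for the fptosi conversion, and the coordinate is assumed to
be non-negative so that the shift and mask behave as in the example.

```c
#include <stdint.h>

/* Hypothetical result type: integer texel index plus 8-bit lerp weight. */
typedef struct {
   int32_t texel;
   int32_t frac;
} fixed_coord;

/*
 * Hypothetical helper mirroring the steps in the quoted example:
 * scale by 256, optionally add 0.5, truncate to int, subtract 128,
 * then split into integer and fractional parts.
 * Assumes coord >= 0.5 so the intermediate value stays non-negative.
 */
static fixed_coord split_coord(float coord, int add_half)
{
   fixed_coord out;
   float scaled = coord * 256.0f;
   int32_t i;

   if (add_half)
      scaled += 0.5f;          /* the fix proposed in the patch */

   i = (int32_t)scaled;        /* truncates toward zero, like fptosi */
   i -= 128;                   /* subtract 0.5 in 24.8 fixed point */

   out.texel = i >> 8;         /* integer component */
   out.frac = i & 0xff;        /* fractional component (lerp weight) */
   return out;
}
```

With these assumptions, split_coord(3.5f, 0) gives 3,0, while
split_coord(3.4999998f, 0) gives 2,255 and split_coord(3.4999998f, 1)
gives 3,0 again. The literal 3.4999998f is used here because it is the
closest single-precision value below 3.5.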


I'll need to take another look and run some tests, though I've got some
quick comments:


@@ -1031,16 +1082,28 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
       s = lp_build_mul_imm(&bld->coord_bld, s, 256);
       if (dims >= 2)
          t = lp_build_mul_imm(&bld->coord_bld, t, 256);
       if (dims >= 3)
          r = lp_build_mul_imm(&bld->coord_bld, r, 256);
    }

    /* convert float to int */
+   half = lp_build_const_vec(bld->gallivm, bld->coord_bld.type, 0.5);
+   s = lp_build_add(&bld->coord_bld, s, half);
+   s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
+   if (dims >= 2) {
+      t = lp_build_add(&bld->coord_bld, t, half);
+      t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
+   }
+   if (dims >= 3) {
+      r = lp_build_add(&bld->coord_bld, r, half);
+      r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
+   }
+
    s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
    if (dims >= 2)
       t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
    if (dims >= 3)
       r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");

This looks quite incorrect: you're converting the s/t/r coords from
float to int twice.


    /* subtract 0.5 (add -128) */
    i32_c128 = lp_build_const_int_vec(bld->gallivm, i32.type, -128);


Also, the add looks iffy: it won't work correctly if the coords are
negative, since FPToSI truncates toward zero rather than flooring.
Maybe instead of add + fptosi you should just use lp_build_iround.
That is a single SSE instruction on x86, though if you're targeting
another arch it will definitely be more code, at least unless someone
adds an intrinsic for it (if the CPU even has one). It might not matter,
though, depending on the address mode...
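To make the negative-coordinate concern concrete, here is a small C
sketch. Both function names are hypothetical, and round_nearest is only
a stand-in computing floor(x + 0.5) by hand; it is not what
lp_build_iround emits (which may round ties differently), it just shows
how add + truncate diverges from a real round-to-nearest below zero.

```c
#include <stdint.h>

/* What add + fptosi computes: add 0.5, then truncate toward zero. */
static int32_t add_half_trunc(float x)
{
   return (int32_t)(x + 0.5f);
}

/*
 * Hypothetical stand-in for a round-to-nearest conversion:
 * floor(x + 0.5), written without libm.
 */
static int32_t round_nearest(float x)
{
   float y = x + 0.5f;
   int32_t i = (int32_t)y;     /* truncation rounds negative y upward */

   if ((float)i > y)
      i--;                     /* step down to get floor(y) */
   return i;
}
```

For a scaled coordinate of 435.2 both functions return 435. For -435.2,
add_half_trunc returns -434 whereas round_nearest returns -435, so the
truncating version is off by one for negative non-integer inputs.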

And I might be missing something, but why do you think the new repeat
code is faster? That might also depend on arch_rounding being available
and such, but at first glance it seems slightly more complex to me.

Roland