[Mesa-stable] [PATCH] i965: fix textureGrad for cubemaps
Tapani Pälli
tapani.palli at intel.com
Thu Sep 17 05:17:56 PDT 2015
On 09/17/2015 02:10 PM, Iago Toral wrote:
> Hi Tapani, Kevin,
>
> awesome work! I was curious about how to fix this, at least when I was
> looking at the specs for this stuff it was not obvious that the Math
> involved for this was so different, I only recall seeing the reference
> that texure coordinates had to be normalized to a [-1, 1] space after
> selecting the face in the cube, but I did not see formulas to implement
> all this like they had for the normal case. It looks like the Math
> involved is quite different.
>
> I added some minor comments below:
>
> On Thu, 2015-09-17 at 08:12 +0300, Tapani Pälli wrote:
>> Fixes regression caused by commit
>> 2b1cdb0eddb73f62e4848d4b64840067f1f70865 in:
>> ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_frag
>>
>> No regressions observed in deqp, CTS or Piglit.
>>
>> Signed-off-by: Tapani Pälli <tapani.palli at intel.com>
>> Signed-off-by: Kevin Rogovin <kevin.rogovin at intel.com>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91114
>> Cc: "11.0 10.7" <mesa-stable at lists.freedesktop.org>
>> ---
>> .../dri/i965/brw_lower_texture_gradients.cpp | 172 ++++++++++++++++++++-
>> 1 file changed, 169 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp b/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp
>> index 7a5f983..f8a31b7 100644
>> --- a/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp
>> @@ -48,6 +48,7 @@ public:
>>
>> private:
>> void emit(ir_variable *, ir_rvalue *);
>> + ir_variable *temp(void *ctx, const glsl_type *type, const char *name);
>> };
>>
>> /**
>> @@ -60,6 +61,17 @@ lower_texture_grad_visitor::emit(ir_variable *var, ir_rvalue *value)
>> base_ir->insert_before(assign(var, value));
>> }
>>
>> +/**
>> + * Emit a temporary variable declaration
>> + */
>> +ir_variable *
>> +lower_texture_grad_visitor::temp(void *ctx, const glsl_type *type, const char *name)
>> +{
>> + ir_variable *var = new(ctx) ir_variable(type, name, ir_var_temporary);
>> + base_ir->insert_before(var);
>> + return var;
>> +}
>> +
>> static const glsl_type *
>> txs_type(const glsl_type *type)
>> {
>> @@ -162,9 +174,163 @@ lower_texture_grad_visitor::visit_leave(ir_texture *ir)
>> */
>> ir->op = ir_txl;
>> if (ir->sampler->type->sampler_dimensionality == GLSL_SAMPLER_DIM_CUBE) {
>> - ir->lod_info.lod = expr(ir_binop_add,
>> - expr(ir_unop_log2, rho),
>> - new(mem_ctx) ir_constant(-1.0f));
>
> It seems that in this case we don't need rho at all, so we should
> probably move the rho computation to the else branch entirely.
agreed, will move rho generation to else branch only
>> + /* Cubemap texture lookups first generate a texture coordinate normalized
>> + to [-1, 1] on the appropiate face. The appropiate face is determined
>> + by which component has largest magnitude and its sign. The texture
>> + coordinate is the quotient of the remaining texture coordinates against
>> + that absolute value of the component of largest magnitude. This division
>> + requires that the computing of the derivative of the texel coordinate
>> + must use the quotient rule. The high level GLSL code is as follows:
>
> Great comment! Where did you get this from? Is this detailed somwhere in
> the spec? In that case maybe we want to add a reference to that as well.
I'll let Kevin reply to this (got it as a patch from him)
>> + Step 1: selection
>> +
>> + vec3 abs_p, Q, dQdx, dQdy;
>> + abs_p = abs(ir->coordinate);
>> + if (abs_p.x >= max(abs_p.y, abs_p.z)) {
>> + Q = ir->coordinate.yzx;
>> + dQdx = ir->lod_info.grad.dPdx.yzx;
>> + dQdy = ir->lod_info.grad.dPdy.yzx;
>> + }
>> + if (abs_p.y >= max(abs_p.x, abs_p.z)) {
>> + Q = ir->coordinate.xzy;
>> + dQdx = ir->lod_info.grad.dPdx.xzy;
>> + dQdy = ir->lod_info.grad.dPdy.xzy;
>> + }
>> + if (abs_p.z >= max(abs_p.x, abs_p.y)) {
>> + Q = ir->coordinate;
>> + dQdx = ir->lod_info.grad.dPdx;
>> + dQdy = ir->lod_info.grad.dPdy;
>> + }
>
> This is a nitpick: you use 'Q, dQdx and dQdy' above, and 'q, dqdx, dqdy'
> below... you probably want to be consistent with the capitalization and
> use Q everywhere, since that is what you use in the actual
> implementation.
yes, I think this happened because some documentation is copy pasted
from glsl code in the bug .., will change all to match.
>> + Step 2: use quotient rule to compute derivative. The normalized to [-1, 1]
>> + texel coordinate is given by Q.xy / (sign(Q.z) * Q.z). We are only concerned
>> + with the magnitudes of the derivatives whose values are not affected by the
>> + sign. We drop the sign from the computation.
>> +
>> + vec2 dx, dy;
>> + float recip;
>> +
>> + recip = 1.0 / Q.z;
>> + dx = recip * ( dqdx.xy - q.xy * (dqdx.z * recip) );
>> + dy = recip * ( dqdy.xy - q.xy * (dqdy.z * recip) );
>> +
>> + Step 3: compute LOD. At this point we have the derivatives of the
>> + texture coordinates normalized to [-1,1]. We take the LOD to be
>> + result = log2( max(sqrt(dot(dx, dx)), sqrt(dy, dy)) * 0.5 * L)
>
> ^ Remove this blank space here
ok
>> + = -1.0 + log2(max(sqrt(dot(dx, dx)), sqrt(dy, dy)) * L)
>> + = -1.0 + log2(sqrt(max(dot(dx, dx), dot(dy,dy))) * L)
>> + = -1.0 + log2(sqrt(l * l * max(dot(dx, dx), dot(dy,dy))))
> ^^^^^^ 'L' instead of 'l'
>> + = -1.0 + 0.5 * log2(L * L * max(dot(dx, dx), dot(dy,dy)))
>> + where L is the dimension of the cubemap. The code is:
>> +
>> + float m, result;
>> + m = max(dot(dx, dx), dot(dy, dy));
>
> Maybe you want to use 'M' instead of 'm' for consistency.
yes
>> + L = textureSize(sampler, 0).x;
>
> Is it always going to be .x (the width) that we need here?
IIRC from discussions with Kevin we could use height too but that will
always match as it is a cubemap, we need the dimension of the lod level
0 here.
>> + result = -1.0 + 0.5 * log2(l * l * m);
> ^^^^^^ 'L' instead of 'l'
ok
>> + */
>> +
>> +/* Helpers to make code more human readable. */
>> +#define EMIT(instr) base_ir->insert_before(instr)
>> +#define THEN(irif, instr) irif->then_instructions.push_tail(instr)
>> +#define CLONE(x) x->clone(mem_ctx, NULL)
>> +
>> + ir_variable *abs_p = temp(mem_ctx, glsl_type::vec3_type, "abs_p");
>> +
>> + EMIT(assign(abs_p, swizzle_for_size(abs(CLONE(ir->coordinate)), 3)));
>> +
>> + ir_variable *Q = temp(mem_ctx, glsl_type::vec3_type, "Q");
>> + ir_variable *dQdx = temp(mem_ctx, glsl_type::vec3_type, "dQdx");
>> + ir_variable *dQdy = temp(mem_ctx, glsl_type::vec3_type, "dQdy");
>> +
>> + /* unmodified dPdx, dPdy values */
>> + ir_rvalue *dPdx = ir->lod_info.grad.dPdx;
>> + ir_rvalue *dPdy = ir->lod_info.grad.dPdy;
>> +
>> + /* 1. compute selector */
>> +
>> + /* if (abs_p.x >= max(abs_p.y, abs_p.z)) ... */
>> + ir_if *branch_x =
>> + new(mem_ctx) ir_if(gequal(swizzle_x(abs_p),
>> + max2(swizzle_y(abs_p), swizzle_z(abs_p))));
>> +
>> + /* q = p.yzx;
>> + * dqdx = dpdx.yzx;
>> + * dqdy = dpdy.yzx;
>> + */
>> + int yzx = MAKE_SWIZZLE4(SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_X, 0);
>
> Isn't that 0 on the 4th component actually the same as SWIZZLE_X? I
> suppose that you just want to make clear that we don't care about that
> component... in that case, could we use something like SWIZZLE_NIL
> instead? If not I guess 0 is good enough.
I took example from existing 3 component swizzles in
builtin_builder::_cross(), they are done this way.
>> + THEN(branch_x, assign(Q, swizzle(CLONE(ir->coordinate), yzx, 3)));
>> + THEN(branch_x, assign(dQdx, swizzle(CLONE(dPdx), yzx, 3)));
>> + THEN(branch_x, assign(dQdy, swizzle(CLONE(dPdy), yzx, 3)));
>> + EMIT(branch_x);
>> +
>> + /* if (abs_p.y >= max(abs_p.x, abs_p.z)) */
>> + ir_if *branch_y =
>> + new(mem_ctx) ir_if(gequal(swizzle_y(abs_p),
>> + max2(swizzle_x(abs_p), swizzle_z(abs_p))));
>> +
>> + /* q = p.xzy;
>> + * dqdx = dpdx.xzy;
>> + * dqdy = dpdy.xzy;
>> + */
>> + int xzy = MAKE_SWIZZLE4(SWIZZLE_X, SWIZZLE_Z, SWIZZLE_Y, 0);
>> + THEN(branch_y, assign(Q, swizzle(CLONE(ir->coordinate), xzy, 3)));
>> + THEN(branch_y, assign(dQdx, swizzle(CLONE(dPdx), xzy, 3)));
>> + THEN(branch_y, assign(dQdy, swizzle(CLONE(dPdy), xzy, 3)));
>> + EMIT(branch_y);
>> +
>> + /* if (abs_p.z >= max(abs_p.x, abs_p.y)) */
>> + ir_if *branch_z =
>> + new(mem_ctx) ir_if(gequal(swizzle_z(abs_p),
>> + max2(swizzle_x(abs_p), swizzle_y(abs_p))));
>> +
>> + /* q = p;
>> + * dqdx = dpdx;
>> + * dqdy = dpdy;
>> + */
>> + THEN(branch_z, assign(Q, swizzle_for_size(CLONE(ir->coordinate), 3)));
>> + THEN(branch_z, assign(dQdx, CLONE(dPdx)));
>> + THEN(branch_z, assign(dQdy, CLONE(dPdy)));
>> + EMIT(branch_z);
>> +
>> + /* 2. quotient rule */
>> + ir_variable *recip = temp(mem_ctx, glsl_type::float_type, "recip");
>> + EMIT(assign(recip, div(new(mem_ctx) ir_constant(1.0f), swizzle_z(Q))));
>> +
>> + ir_variable *dx = temp(mem_ctx, glsl_type::vec(2), "dx");
>> + ir_variable *dy = temp(mem_ctx, glsl_type::vec(2), "dy");
>
> I think we usually use glsl_type::vec2_type instead of the
> glsl_type::vec(components) version?
ok, will change
>> + /* dx = recip * ( dqdx.xy - q.xy * (dqdx.z * recip) );
>> + * dy = recip * ( dqdy.xy - q.xy * (dqdy.z * recip) );
>> + */
>> + EMIT(assign(dx, mul(recip, sub(swizzle_xy(dQdx),
>> + mul(swizzle_xy(Q), mul(swizzle_z(dQdx),
>> + recip))))));
>> + EMIT(assign(dy, mul(recip, sub(swizzle_xy(dQdy),
>> + mul(swizzle_xy(Q), mul(swizzle_z(dQdy),
>> + recip))))));
>
> What if, instead of q.xy * (dqdx.z * recip), we do (q.xy * recip) *
> dqdx.z? In that case the (q.xy * recip) can be emitted only once and its
> result reused for both EMITs.
Yes, this should work, will change. Thanks for the comments, I'll send
v2 soon.
> Iago
>
>> + /* m = max(dot(dx, dx), dot(dy, dy)); */
>> + ir_variable *M = temp(mem_ctx, glsl_type::float_type, "m");
>> + EMIT(assign(M, max2(dot(dx, dx), dot(dy, dy))));
>> +
>> + /* size has textureSize() of LOD 0 */
>> + ir_variable *L = temp(mem_ctx, glsl_type::float_type, "L");
>> + EMIT(assign(L, swizzle_x(size)));
>> +
>> + ir_variable *result = temp(mem_ctx, glsl_type::float_type, "result");
>> +
>> + /* result = -1.0 + 0.5 * log2(L * L * m); */
>> + EMIT(assign(result,
>> + add(new(mem_ctx)ir_constant(-1.0f),
>> + mul(new(mem_ctx)ir_constant(0.5f),
>> + expr(ir_unop_log2, mul(mul(L, L), M))))));
>> +
>> + /* 3. final assignment of parameters to textureLod call */
>> + ir->lod_info.lod = new (mem_ctx) ir_dereference_variable(result);
>> +
>> +#undef THEN
>> +#undef EMIT
>> +
>> } else {
>> ir->lod_info.lod = expr(ir_unop_log2, rho);
>> }
>
>
More information about the mesa-stable
mailing list