[Mesa-dev] [PATCH 04/10] i965/vs: Replace brw_vs_emit.c with dumping code into the vec4_visitor.
Eric Anholt
eric at anholt.net
Fri Oct 26 15:56:50 PDT 2012
Kenneth Graunke <kenneth at whitecape.org> writes:
> Rather than having two separate backends, just create a small layer that
> translates the subset of Mesa IR used for ARB_vertex_program and fixed
> function programs to the Vec4 IR. This allows us to use the same
> optimization passes, code generator, register allocator as for GLSL.
> + if (shader) {
> + move_grf_array_access_to_scratch();
> + move_uniform_array_access_to_pull_constants();
> + } else {
> + /* The ARB_vertex_program frontend emits pull constant loads directly
> + * rather than using reladdr, so we don't need to walk through all the
> + * instructions looking for things to move. There isn't anything.
> + *
> + * We do still need to split things to vec4 size.
> + */
> + split_uniform_registers();
This will blow up if you have a more-than-push-constants-sized pile of
non-constant but constant-indexed params I think.
> + * Implementation of the compiler for ARB_vertex_program, NV_vertex_program,
> + * and NV_vertex_program1_1 shaders on top of the GLSL compiler backend.
> + */
No more NV_vertex_program. Woo!
> + case OPCODE_EXP: {
> + dst_reg t = dst;
> + if (vpi->DstReg.WriteMask & WRITEMASK_X) {
> + src_reg floor = src_reg(this, glsl_type::float_type);
> + emit(RNDD(dst_reg(floor), src[0]));
> + t.writemask = WRITEMASK_X;
> + #warning "EXP code regression: use SHL, not EXP2 for .x"
The #warnings would need to change before landing.
> + emit_math(SHADER_OPCODE_EXP2, t, floor);
> + }
> + case OPCODE_LIT: {
> + dst_reg result = dst;
> + /* From the ARB_vertex_program spec:
> + *
> + * tmp = VectorLoad(op0);
> + * if (tmp.x < 0) tmp.x = 0;
> + * if (tmp.y < 0) tmp.y = 0;
> + * if (tmp.w < -(128.0-epsilon)) tmp.w = -(128.0-epsilon);
> + * else if (tmp.w > 128-epsilon) tmp.w = 128-epsilon;
> + * result.x = 1.0;
> + * result.y = tmp.x;
> + * result.z = (tmp.x > 0) ? RoughApproxPower(tmp.y, tmp.w) : 0.0;
> + * result.w = 1.0;
> + *
> + * Note that we don't do the clamping to +/- 128. We didn't in
> + * brw_vs_emit.c either.
> + */
> + if (vpi->DstReg.WriteMask & WRITEMASK_XW) {
> + result.writemask = WRITEMASK_XW;
> + emit(MOV(result, src_reg(1.0f)));
> + }
> + if (vpi->DstReg.WriteMask & WRITEMASK_YZ) {
> + result.writemask = WRITEMASK_YZ;
> + emit(MOV(result, src_reg(0.0f)));
> +
> + src_reg tmp_x = src[0];
> + tmp_x.swizzle = BRW_SWIZZLE_XXXX;
This should be a reswizzle of tmp_x.swizzle I think, and similarly below.
> + case OPCODE_MAD: {
> + #warning "code regression: need MAC"
> + src_reg temp = src_reg(this, glsl_type::vec4_type);
> + emit(MUL(dst_reg(temp), src[0], src[1]));
> + emit(ADD(dst, temp, src[2]));
> + break;
> + }
I really want to do mad/mac as a post-visit peephole process in both vs
and fs I think. With that and just another step or two, we'd be able to
dump tree grafting.
> + // probably not true, but I need to update this code when it breaks
Inconsistent // comments with the rest of our code
> + if (src.Swizzle != SWIZZLE_NOOP || src.Negate) {
> + unsigned short zeros_mask = 0;
> + unsigned short ones_mask = 0;
> + unsigned short src_mask = 0;
> + unsigned short src_swiz[4];
> +
> + for (int i = 0; i < 4; i++) {
> + /* The ZERO, ONE, and Negate options are only used for OPCODE_SWZ,
> + * but it's simplest to handle it here.
> + */
> + int s = GET_SWZ(src.Swizzle, i);
> + switch (s) {
> + case SWIZZLE_X:
> + case SWIZZLE_Y:
> + case SWIZZLE_Z:
> + case SWIZZLE_W:
> + src_mask |= 1 << i;
> + src_swiz[i] = s;
> + break;
> + case SWIZZLE_ZERO:
> + zeros_mask |= 1 << i;
> + break;
> + case SWIZZLE_ONE:
> + ones_mask |= 1 << i;
> + break;
> + }
> + }
> +
> + result.swizzle =
> + BRW_SWIZZLE4(src_swiz[0], src_swiz[1], src_swiz[2], src_swiz[3]);
Use of uninitialized src_swiz[] values on the stack in the ZERO/ONE
case. If they happened to be >3, I think things would go badly as the
fields overflow.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20121026/aedfc034/attachment.pgp>
More information about the mesa-dev
mailing list