[Mesa-dev] [i915g] i915_fpc_optimize_useless_mov is invalid in the general case
Michael Karcher
freedesktop-bugzilla at mkarcher.dialup.fu-berlin.de
Mon Dec 5 14:00:14 PST 2011
Hello developers,
while trying some sample programs on the i915 Gallium driver, I found that
teapot renders in black and white with the hardware pixel shader backend,
but renders correctly (yellow textured base) with software rendering. It
turned out that i915_fpc_optimize_useless_mov is the culprit, as it breaks
the pixel shader created by the classic render pipeline:
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], COLOR, LINEAR
DCL IN[1], FOG, PERSPECTIVE
DCL IN[2], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL CONST[1..3]
DCL TEMP[0..3]
IMM FLT32 { 1.0000, 0.0000, 0.0000, 0.0000}
0: TXP TEMP[0], IN[2].xyyw, SAMP[0], 2D
1: MUL TEMP[0].xyz, TEMP[0], IN[0]
2: MOV TEMP[1].xyz, TEMP[0].xyzx
3: MOV TEMP[1].w, IN[0].wwww
4: MOV TEMP[1].w, TEMP[1]
5: MUL TEMP[2].x, IN[1].xxxx, CONST[1].wwww
6: MUL TEMP[2].x, TEMP[2].xxxx, TEMP[2].xxxx
7: EX2 TEMP[2].x, -TEMP[2].xxxx
8: MOV_SAT TEMP[2].x, TEMP[2].xxxx
9: ADD TEMP[3].x, IMM[0].xxxx, -TEMP[2].xxxx
10: MUL TEMP[3].xyz, CONST[2].xyzz, TEMP[3].xxxx
11: MAD TEMP[1].xyz, TEMP[0].xyzz, TEMP[2].xxxx, TEMP[3].xyzz
12: MOV OUT[0], TEMP[1]
13: END
gets translated without the optimization into
0: BEGIN
1: DCL S[0]
2: DCL T_TEX0
3: DCL T_DIFFUSE
4: DCL T_FOG_W
5: R[0] = TEXLDP S[0],T_TEX0
6: R[0].xyz = MUL R[0], T_DIFFUSE
7: R[1].xyz = MOV R[0].xyzx
8: R[1].w = MOV T_DIFFUSE.wwww
9: R[1].w = MOV R[1]
10: R[2].x = MUL T_FOG_W.wwww, CONST[1].wwww
11: R[2].x = MUL R[2].xxxx, R[2].xxxx
12: R[2].x = EXP R[2].-x-x-x-x
13: R[2].x = SATURATE MOV R[2].xxxx
14: R[3].x = ADD CONST[0].xxxx, R[2].-x-x-x-x
15: R[3].xyz = MUL CONST[2].xyzz, R[3].xxxx
16: R[1].xyz = MAD R[0].xyzz, R[2].xxxx, R[3].xyzz
17: oC = MOV R[1]
18: END
This translation is correct. The optimization mentioned above then kicks
in, correctly removing line 9, but it also replaces lines 6 and 7 with
6: R[1].xyz = MUL R[0], T_DIFFUSE
This does yield the correct result in R[1].xyz, but it no longer updates
R[0].xyz, which is a problem because R[0].xyzz is read in line 16. In this
special case, we could get away with renaming R[0].xyzz to R[1].xyzz in
line 16, but in the general case there is no guarantee that R[1].xyz still
contains in line 16 what it did in line 7.
Any suggestions? Just remove this optimization? Improve the optimizer to
have it check that the eliminated temporary is not used in any further
lines?
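To make the last option concrete, here is a minimal sketch of such a check. It is written in Python rather than the driver's C, and it uses a simplified instruction model: the Instr class and the helper names temp_read_later and can_fold_mov are made up for illustration and are not part of Mesa. It is deliberately conservative, treating every channel named in a source swizzle as read.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instr:
    """Hypothetical, simplified instruction: one destination, several sources."""
    op: str
    dst: str                      # destination register, e.g. "R0"
    mask: str                     # write mask, e.g. "xyz"
    srcs: List[Tuple[str, str]]   # (register, swizzle) pairs that are read


def temp_read_later(prog, start, reg, channels):
    """True if any of `channels` of `reg` is read from index `start` onward
    before it has been overwritten. Conservative: every channel named in a
    source swizzle counts as read, even if the write mask would ignore it."""
    live = set(channels)
    for ins in prog[start:]:
        for src_reg, swizzle in ins.srcs:
            if src_reg == reg and live & set(swizzle):
                return True
        if ins.dst == reg:
            live -= set(ins.mask)   # these channels now hold a new value
            if not live:
                return False
    return False                    # temporaries are dead at END


def can_fold_mov(prog, i):
    """prog[i] writes a temporary and prog[i + 1] is a MOV copying it.
    Folding the MOV into prog[i] is only safe if the temporary is not
    read again afterwards."""
    producer, mov = prog[i], prog[i + 1]
    if mov.op != "MOV" or mov.srcs[0][0] != producer.dst:
        return False
    if mov.mask != producer.mask:
        return False
    return not temp_read_later(prog, i + 2, producer.dst, producer.mask)


# Abbreviated version of the listing above: lines 6, 7, 16 and 17 only.
post_example = [
    Instr("MUL", "R0", "xyz", [("R0", "xyzw"), ("T_DIFFUSE", "xyzw")]),
    Instr("MOV", "R1", "xyz", [("R0", "xyzx")]),
    Instr("MAD", "R1", "xyz", [("R0", "xyzz"), ("R2", "xxxx"), ("R3", "xyzz")]),
    Instr("MOV", "oC", "xyzw", [("R1", "xyzw")]),
]

# Same program without the later read of R0: here the fold would be safe.
dead_temp_example = [post_example[0], post_example[1], post_example[3]]

print(can_fold_mov(post_example, 0))       # False: R0.xyz is read by the MAD
print(can_fold_mov(dead_temp_example, 0))  # True: R0 is never read again
```

With a check of this kind, the fold of lines 6 and 7 would be rejected for the shader above, while the removal of line 9 (a MOV of a register onto itself) would be unaffected.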
Regards,
Michael Karcher