[Bug 76862] clamp with bounds inside [0, 1] generates slow code
bugzilla-daemon at freedesktop.org
Tue Jun 3 18:45:11 PDT 2014
https://bugs.freedesktop.org/show_bug.cgi?id=76862
--- Comment #2 from Matt Turner <mattst88 at gmail.com> ---
Ian's description has bugs, so let me try again:
GLSL's clamp(A, B, C) clamps A to a lower bound of B and an upper bound of C.
We implement this in the compiler with min() and max() operations: min(max(a,
b), c).
i965 assembly for clamp(A, B, C) would look like
(select from A, B the argument that is greater than or equal; i.e., max)
sel.ge tmp, A, B
(select from tmp, C the argument that is less than; i.e., min)
sel.l dst, tmp, C
Saturate is a special case of clamp, specifically when the bounds are 0.0 to
1.0 (for floating point types). Probably all GPUs can perform saturate for free
-- it's a destination modifier in i965 assembly.
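
As a rough illustration of the two definitions above, here is a minimal Python
sketch. The function names clamp and saturate mirror the GLSL/i965 terms but
are otherwise just stand-ins, and NaN handling is ignored.

```python
def clamp(a, lo, hi):
    # Same composition the compiler uses: min(max(a, lo), hi).
    return min(max(a, lo), hi)


def saturate(a):
    # Saturate is clamp with the bounds fixed to 0.0 and 1.0.
    return clamp(a, 0.0, 1.0)


samples = [-2.0, 0.0, 0.25, 1.0, 3.5]
saturated = [saturate(x) for x in samples]
print(saturated)
```

Values already inside [0.0, 1.0] pass through unchanged; everything else is
pinned to the nearer bound.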
The i965 backend's try_emit_saturate() function recognizes min(max(a, 0.0),
1.0) as a saturate operation, and sets the saturate modifier (or emits a MOV
instruction with saturate).
i965 assembly for clamp(A, 0.0, 1.0) (after try_emit_saturate()) would turn
into
mov.sat dst, A
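
The recognition step can be sketched roughly as follows. This is not the
actual try_emit_saturate() code from Mesa; it is a toy Python model in which
expressions are tuples such as ("min", ("max", "A", 0.0), 1.0), and the names
match_saturate and lower_expr are hypothetical. It only handles the one
operand order shown above.

```python
def match_saturate(expr):
    """Return the operand a if expr looks like min(max(a, 0.0), 1.0), else None."""
    if not (isinstance(expr, tuple) and len(expr) == 3):
        return None
    op, inner, upper = expr
    if op != "min" or upper != 1.0:
        return None
    if not (isinstance(inner, tuple) and len(inner) == 3):
        return None
    inner_op, operand, lower = inner
    if inner_op != "max" or lower != 0.0:
        return None
    return operand


def lower_expr(expr, dst="dst"):
    """Emit assembly-like strings for a min(max(a, b), c) expression tree."""
    operand = match_saturate(expr)
    if operand is not None:
        # Bounds are exactly 0.0 and 1.0: a single saturating MOV suffices.
        return [f"mov.sat {dst}, {operand}"]
    if (isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "min"
            and isinstance(expr[1], tuple) and len(expr[1]) == 3
            and expr[1][0] == "max"):
        _, (_, a, b), c = expr
        # General clamp: one max (sel.ge) followed by one min (sel.l).
        return [f"sel.ge tmp, {a}, {b}", f"sel.l {dst}, tmp, {c}"]
    raise ValueError("only min(max(a, b), c) is handled in this sketch")


saturate_code = lower_expr(("min", ("max", "A", 0.0), 1.0))
generic_code = lower_expr(("min", ("max", "A", "B"), "C"))
```

With these inputs the first call yields the single mov.sat form and the second
falls back to the two-instruction sel.ge / sel.l sequence.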
The proposed optimization idea here is that for immediate arguments that
satisfy the condition in comment #0, we can emit a single min or max
instruction with a saturate modifier instead of separate min and max
instructions.
So, clamp(A, 0, 0.5) would be
(select the lesser of A and 0.5, saturate the result, and store in dst)
sel.l.sat dst, A, 0.5
and similarly clamp(A, 0.1, 1) -> sel.ge.sat dst, A, 0.1.
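
A quick way to convince yourself that the two rewrites above are equivalent to
the full clamp is to compare them on a few values. The Python sketch below
does that. The helper names are made up for illustration, and it assumes (the
condition from comment #0 is not quoted here) that one bound is exactly 0.0 or
1.0 while the other lies within [0.0, 1.0], as in the two examples. NaN inputs
are not considered.

```python
def saturate(x):
    # Models the .sat destination modifier: clamp the result to [0.0, 1.0].
    return min(max(x, 0.0), 1.0)


def clamp(a, lo, hi):
    # Reference two-instruction form: min(max(a, lo), hi).
    return min(max(a, lo), hi)


def sel_l_sat(a, hi):
    # Models "sel.l.sat dst, a, hi": stands in for clamp(a, 0.0, hi)
    # when 0.0 <= hi <= 1.0 (assumed condition).
    return saturate(min(a, hi))


def sel_ge_sat(a, lo):
    # Models "sel.ge.sat dst, a, lo": stands in for clamp(a, lo, 1.0)
    # when 0.0 <= lo <= 1.0 (assumed condition).
    return saturate(max(a, lo))


samples = [-1.0, 0.0, 0.05, 0.1, 0.3, 0.5, 0.75, 1.0, 2.0]
upper_ok = all(clamp(x, 0.0, 0.5) == sel_l_sat(x, 0.5) for x in samples)
lower_ok = all(clamp(x, 0.1, 1.0) == sel_ge_sat(x, 0.1) for x in samples)
print(upper_ok, lower_ok)
```

In both cases the saturate modifier supplies the bound that equals 0.0 or 1.0,
so only the other bound needs an explicit min or max.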