[Patch v2 1/4] Replace i2f() in r600_blit.c with an optimized version.
Steven Fuerst
svfuerst at gmail.com
Sat Aug 11 10:30:19 PDT 2012
We use __fls() to find the most significant bit. Using that, the
loop can be avoided. A second trick is to use the behaviour of the
rotate instructions to expand the range of the unsigned int to float
conversion to the full 32 bits in a branchless way.
The routine is now exact up to 2^24. Above that, we truncate which
is equivalent to rounding towards zero.
Signed-off-by: Steven Fuerst <svfuerst at gmail.com>
---
drivers/gpu/drm/radeon/r600_blit.c | 50 ++++++++++++++++++++----------------
1 file changed, 28 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/radeon/r600_blit.c b/drivers/gpu/drm/radeon/r600_blit.c
index 3c031a4..326a8da 100644
--- a/drivers/gpu/drm/radeon/r600_blit.c
+++ b/drivers/gpu/drm/radeon/r600_blit.c
@@ -489,29 +489,35 @@ set_default_state(drm_radeon_private_t *dev_priv)
ADVANCE_RING();
}
-static uint32_t i2f(uint32_t input)
+/* 23 bits of float fractional data */
+#define I2F_FRAC_BITS 23
+#define I2F_MASK ((1 << I2F_FRAC_BITS) - 1)
+
+/*
+ * Converts unsigned integer into 32-bit IEEE floating point representation.
+ * Will be exact from 0 to 2^24. Above that, we round towards zero
+ * as the fractional bits will not fit in a float. (It would be better to
+ * round towards even as the fpu does, but that is slower.)
+ */
+static uint32_t i2f(uint32_t x)
{
- u32 result, i, exponent, fraction;
-
- if ((input & 0x3fff) == 0)
- result = 0; /* 0 is a special case */
- else {
- exponent = 140; /* exponent biased by 127; */
- fraction = (input & 0x3fff) << 10; /* cheat and only
- handle numbers below 2^^15 */
- for (i = 0; i < 14; i++) {
- if (fraction & 0x800000)
- break;
- else {
- fraction = fraction << 1; /* keep
- shifting left until top bit = 1 */
- exponent = exponent - 1;
- }
- }
- result = exponent << 23 | (fraction & 0x7fffff); /* mask
- off top bit; assumed 1 */
- }
- return result;
+ uint32_t msb, exponent, fraction;
+
+ /* Zero is special */
+ if (!x) return 0;
+
+ /* Get location of the most significant bit */
+ msb = __fls(x);
+
+ /*
+ * Use a rotate instead of a shift because that works both leftwards
+ * and rightwards due to the mod(32) behaviour. This means we don't
+ * need to check to see if we are above 2^24 or not.
+ */
+ fraction = ror32(x, (msb - I2F_FRAC_BITS) & 0x1f) & I2F_MASK;
+ exponent = (127 + msb) << I2F_FRAC_BITS;
+
+ return fraction + exponent;
}
--
1.7.10.4
More information about the dri-devel
mailing list