<div>Looking through the kernel radeon drm source, it looks like the i2f() functions in r600_blit.c and r600_blit_kms.c can be optimized a bit.</div><div><br></div><div>The following extends the range to all unsigned 32-bit integers and avoids the slow loop by using the bsr instruction via __fls(). The conversion is exact for every integer up to 2^24. Above that, a float's 24-bit significand cannot hold every integer, so rounding is inevitable; this routine rounds towards zero (truncation).</div>
<div><br></div><div>/* 23 bits of float fractional data */</div><div>#define I2F_FRAC_BITS<span class="Apple-tab-span" style="white-space:pre"> </span>23</div><div>#define I2F_MASK ((1 << I2F_FRAC_BITS) - 1)</div><div>
<br></div><div>/*</div><div> * Converts an unsigned integer into 32-bit IEEE floating point representation.</div><div> * Will be exact from 0 to 2^24. Above that, we round towards zero</div><div> * as the fractional bits will not fit in a float. (It would be better to</div>
<div> * round towards even as the fpu does, but that is slower.)</div><div> * This routine depends on the mod(32) behaviour of the rotate instructions</div><div> * on x86.</div><div> */</div><div>uint32_t i2f(uint32_t x)</div>
<div>{</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>uint32_t msb, exponent, fraction;</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>/* Zero is special */</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>if (!x) return 0;</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>/* Get location of the most significant bit */</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>msb = __fls(x);</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>/*</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> * Use a rotate instead of a shift because that works both leftwards</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> * and rightwards due to the mod(32) behaviour. This means we don't</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> * need to check to see if we are above 2^24 or not.</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> */</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>fraction = ror32(x, (msb - I2F_FRAC_BITS) & 0x1f) & I2F_MASK;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>exponent = (127 + msb) << I2F_FRAC_BITS;</div>
<div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>return fraction + exponent;</div><div>}</div><div><br></div><div>Steven Fuerst</div>
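<div><br></div><div>For anyone who wants to check the routine outside the kernel, here is a minimal user-space sketch. The names ror32_portable, fls_index, i2f_sketch, float_bits and i2f_selfcheck are hypothetical stand-ins, not kernel API. It assumes GCC or Clang (for __builtin_clz in place of __fls()) and an IEEE 754 single-precision float, and it masks the rotate count explicitly rather than relying on the x86 rotate instructions.</div>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 23 bits of float fractional data */
#define I2F_FRAC_BITS	23
#define I2F_MASK	((1u << I2F_FRAC_BITS) - 1)

/* Stand-in for the kernel's ror32(); the count is reduced mod 32 here. */
static uint32_t ror32_portable(uint32_t word, unsigned int shift)
{
	shift &= 31;
	return (word >> shift) | (word << ((32 - shift) & 31));
}

/* Stand-in for __fls(): index of the highest set bit. x must be nonzero. */
static unsigned int fls_index(uint32_t x)
{
	return 31u - (unsigned int)__builtin_clz(x);
}

/* Same steps as i2f() in the post, using the stand-ins above. */
static uint32_t i2f_sketch(uint32_t x)
{
	uint32_t msb, exponent, fraction;

	/* Zero is special */
	if (!x)
		return 0;

	msb = fls_index(x);

	/* msb - 23 wraps for msb < 23; mod 32 that is a left rotate. */
	fraction = ror32_portable(x, msb - I2F_FRAC_BITS) & I2F_MASK;
	exponent = (127 + msb) << I2F_FRAC_BITS;

	return fraction + exponent;
}

/* Raw bit pattern of a float, for comparison with i2f_sketch(). */
static uint32_t float_bits(float f)
{
	uint32_t u;

	memcpy(&u, &f, sizeof(u));
	return u;
}

static void i2f_selfcheck(void)
{
	uint32_t x;

	/* Exact range: must match the compiler's conversion bit for bit. */
	for (x = 0; x <= (1u << 24); x++)
		assert(i2f_sketch(x) == float_bits((float)x));

	/* Above 2^24: truncation, where round-to-nearest-even would go up. */
	assert(i2f_sketch(16777219u) == float_bits(16777218.0f));
	assert(i2f_sketch(0xFFFFFFFFu) == 0x4F7FFFFFu);
}
```

<div>Calling i2f_selfcheck() from a small main() compares every value from 0 to 2^24 against the compiler's own integer-to-float conversion, then checks two values above 2^24 where truncation and round-to-nearest-even give different results.</div>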