[Mesa-dev] [PATCH v2 4/4] i965: Properly handle integer types in opt_vector_float().

Matt Turner mattst88 at gmail.com
Wed Apr 20 19:12:24 UTC 2016


On Tue, Apr 19, 2016 at 12:44 PM, Matt Turner <mattst88 at gmail.com> wrote:
> On Mon, Apr 18, 2016 at 11:52 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
>> Previously, opt_vector_float() always interpreted MOV sources as
>> floating point, and always created a MOV with a F-type destination.
>>
>> This meant that we could mess up sequences of integer loads, such as:
>>
>>    mov vgrf6.0.x:D, 0D
>>    mov vgrf6.0.y:D, 1D
>>    mov vgrf6.0.z:D, 2D
>>    mov vgrf6.0.w:D, 3D
>>
>> Here, integer 0/1/2/3 become approximately 0.0f, so we generated:
>>
>>    mov vgrf6.0:F, [0F, 0F, 0F, 0F]
>>
>> which is clearly wrong.  We can properly handle this by converting
>> integer values to float (rather than bitcasting), and emitting a type
>> converting MOV:
>>
>>    mov vgrf6.0:D, [0F, 1F, 2F, 3F]
>>
>> To do this, we first see if the integer values (converted to float)
>> are representable.  If so, we use a D-type MOV.  If not, we then try
>> the floating point values and an F-type MOV.  Zero is treated as not
>> imposing a type restriction.  This is important because 0D would imply a D-type
>> MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D,
>> where we want to use an F-type MOV.
>>
>> Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend.  This
>> recently became visible due to changes in opt_vector_float() which
>> made it optimize more cases, but it was a pre-existing bug.
>>
>> v2: Handle the type of zero better.
>>
>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>
> This patch has these shader-db stats on HSW:
>
> total instructions in shared programs: 7084195 -> 7082191 (-0.03%)
> instructions in affected programs: 246027 -> 244023 (-0.81%)
> helped: 1937
>
> total cycles in shared programs: 65669642 -> 65651968 (-0.03%)
> cycles in affected programs: 531064 -> 513390 (-3.33%)
> helped: 1177

Looks like a lot of vertex shaders benefit because they compare
against ivec4(0, 1, 2, 3) which we now load as a VF in one
instruction. Neat!
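
For anyone following along, here is a rough sketch of the selection logic the
commit message describes. It is not the code from the patch: the names
float_to_vf_sketch(), classify_imm() and the mov_type enum are made up for
illustration, and it assumes the usual 8-bit VF layout (1 sign bit, 3 exponent
bits with a bias of 3, 4 mantissa bits).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum mov_type { TYPE_ANY, TYPE_D, TYPE_F, TYPE_NONE };

/* Pack f into an 8-bit restricted float, or return -1 if it does not fit.
 * Assumed layout: 1 sign bit, 3 exponent bits (bias 3), 4 mantissa bits. */
static int float_to_vf_sketch(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    /* +0.0f and -0.0f are special cased: 0x00 and 0x80. */
    if (f == 0.0f)
        return (int)(bits >> 24);

    uint32_t sign = bits >> 31;
    int32_t exponent = (int32_t)((bits >> 23) & 0xff) - (127 - 3);
    uint32_t mantissa = (bits >> 19) & 0xf;

    /* The low 19 mantissa bits must be zero and the exponent must fit. */
    if ((bits & 0x7ffff) != 0 || exponent < 0 || exponent > 7)
        return -1;

    /* 0.125 would share its encoding with 0.0, so reject it. */
    if (exponent == 0 && mantissa == 0)
        return -1;

    return (int)((sign << 7) | ((uint32_t)exponent << 4) | mantissa);
}

/* Decide which MOV type a 32-bit immediate asks for:
 *   zero              -> no restriction (TYPE_ANY)
 *   (float)int fits   -> D-type MOV with a VF source
 *   bitcast fits      -> F-type MOV with a VF source
 *   otherwise         -> not representable as VF */
static enum mov_type classify_imm(uint32_t imm_bits, int *vf_out)
{
    if (imm_bits == 0) {
        *vf_out = 0;
        return TYPE_ANY;
    }

    /* Convert the integer value to float, rather than bitcasting it. */
    int vf = float_to_vf_sketch((float)(int32_t)imm_bits);
    if (vf != -1) {
        *vf_out = vf;
        return TYPE_D;
    }

    /* Fall back to reinterpreting the same bits as a float. */
    float as_float;
    memcpy(&as_float, &imm_bits, sizeof as_float);
    vf = float_to_vf_sketch(as_float);
    if (vf != -1) {
        *vf_out = vf;
        return TYPE_F;
    }

    return TYPE_NONE;
}
```

With this sketch, 1D/2D/3D classify as TYPE_D, 0x3f800000D classifies as
TYPE_F, and 0D stays TYPE_ANY, so a "MOV 0D, MOV 0x3f800000D" sequence can
still become an F-type MOV.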

The series is

Reviewed-by: Matt Turner <mattst88 at gmail.com>

