[Mesa-dev] llvmpipe broken on Skylake Pentium (LP_NATIVE_VECTOR_WIDTH=128)

Ilia Mirkin imirkin at alum.mit.edu
Mon Oct 12 12:40:26 PDT 2015


On Mon, Oct 12, 2015 at 3:27 PM, Adam Jackson <ajax at redhat.com> wrote:
> I'm having some difficulty getting llvmpipe working on a Skylake
> Pentium, which has the charming property of not having AVX support at
> all (Skylake Cores have AVX2, and Xeons have AVX512, but Pentium seems

Sounds an awful lot like what I have in a Core i7-920 ILK:

fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3
cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi
flexpriority ept vpid

I have no trouble with llvmpipe on it. You could try setting
GALLIUM_DUMP_CPU=true on a debug build; that should tell you which
features gallium thinks your cpu has.
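
For comparing against whatever gallium reports, a rough sketch of checking
the kernel's view of the SIMD flags is below. The helper names
(simd_support, read_cpuinfo_flags) are made up for illustration and are not
part of Mesa; the sketch assumes a Linux-style /proc/cpuinfo "flags" line,
and SAMPLE is just the i7-920 flag list pasted above.

```python
# Hypothetical helper, not part of Mesa: report which SIMD extensions
# appear in a Linux /proc/cpuinfo "flags" line.
SIMD_FLAGS = ("sse2", "ssse3", "sse4_1", "sse4_2", "avx", "avx2")

# Flag list from the Core i7-920 quoted in this mail.
SAMPLE = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
    "pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx "
    "rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology "
    "nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 "
    "cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi "
    "flexpriority ept vpid"
)


def simd_support(flags_line):
    """Map each flag in SIMD_FLAGS to True/False for the given flags text."""
    present = set(flags_line.split())
    return {name: name in present for name in SIMD_FLAGS}


def read_cpuinfo_flags(path="/proc/cpuinfo"):
    """Return the first 'flags' value from cpuinfo, or '' if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line.split(":", 1)[1]
    except OSError:
        pass
    return ""


# Demonstrate on the pasted flag list rather than the local machine.
for name, ok in simd_support(SAMPLE).items():
    print(f"{name:8s} {'yes' if ok else 'no'}")
```

On the flag list above this shows sse4_1 present and avx absent, i.e. the
same SSE4.1-without-AVX situation as the Pentium. To check the local box
instead, pass read_cpuinfo_flags() to simd_support().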

  -ilia

> to be the new way of spelling Celeron).  Currently I'm trying this with
> llvm 3.6.2 and Mesa 10.6.5, but llvm 3.7 doesn't seem to be any better.
>
> The error I'm getting is:
>
> $ DISPLAY=:2 LIBGL_DRIVERS_PATH=`pwd`/lib64/gallium LP_NATIVE_VECTOR_WIDTH=128 /usr/lib64/mesa/gloss
> LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse41.pblendvb
>
> (Setting LP_NATIVE_VECTOR_WIDTH like that seems to be effective at
> triggering this on Skylake Core, but in the name of paranoia I'm
> emulating a Pentium by patching kvm to mask off the AVX bits of
> cpuflags: https://ajax.fedorapeople.org/qemu-pseudoskl.patch )
>
> That does indeed seem to be the pblendvb intrinsic from
> lp_build_select(), from lp_build_depth_stencil_test(), and at that
> point I get (against Xvfb, giving me a z32 depth format):
>
> (gdb) p bld->type
> $1 = {floating = 0, fixed = 0, sign = 0, norm = 0, width = 32, length = 4}
>
> There are several other paths through lp_build_select() that look like
> they could work, but don't.  If I turn on the if (0)'d vector select
> path, I get something like:
>
> LLVM ERROR: Cannot select: 0xc23df0: v4i32 = X86ISD::SMAX 0xc1caf0, 0xc25020 [ORD=103] [ID=189]
>   0xc1caf0: v4i32 = X86ISD::VSRL 0xc3a810, 0xc269e0 [ORD=102] [ID=179]
>     0xc3a810: v4i32 = bitcast 0xc096b0 [ORD=95] [ID=150]
>       0xc096b0: v2i64,ch = X86ISD::VZEXT_LOAD 0xc28160, 0xc1f750<LD8[%sunkaddr145](align=4)> [ORD=95] [ID=140]
>         0xc1f750: i64 = add 0xc22560, 0xc21220 [ORD=83] [ID=102]
>           0xc22560: i64,ch = CopyFromReg 0xbca1c0, 0xc21880 [ORD=82] [ID=76]
>             0xc21880: i64 = Register %vreg26 [ID=28]
>           0xc21220: i64 = Constant<232> [ID=29]
>     0xc269e0: v4i32 = X86ISD::VZEXT_MOVL 0xc09c00 [ORD=102] [ID=169]
>       0xc09c00: v4i32 = scalar_to_vector 0xc24be0 [ORD=102] [ID=160]
>         0xc24be0: i32 = truncate 0xc22de0 [ORD=99] [ID=151]
>           0xc22de0: i64,ch = load 0xc28160, 0xc22450, 0xc0a9f0<LD4[%sunkaddr154], sext from i32> [ORD=104] [ID=141]
>             0xc22450: i64 = add 0xc22560, 0xc09f30 [ORD=97] [ID=100]
>               0xc22560: i64,ch = CopyFromReg 0xbca1c0, 0xc21880 [ORD=82] [ID=76]
>                 0xc21880: i64 = Register %vreg26 [ID=28]
>               0xc09f30: i64 = Constant<244> [ID=31]
>             0xc0a9f0: i64 = undef [ID=4]
>   0xc25020: v4i32 = bitcast 0xc06ec0 [ORD=9] [ID=119]
>     0xc06ec0: v2i64,ch = load 0xbca1c0, 0xc05fc0, 0xc0a9f0<LD16[ConstantPool]> [ORD=9] [ID=104]
>       0xc05fc0: i64 = X86ISD::Wrapper 0xc21550 [ID=78]
>         0xc21550: i64 = TargetConstantPool<<4 x i32> <i32 1, i32 1, i32 1, i32 1>> 0 [ID=47]
>       0xc0a9f0: i64 = undef [ID=4]
> In function: fs57_variant0_partial
>
> I get the same result for either the BuildTrunc or BuildICmp paths
> through the if (0) at the top, and I also get the same result if I just
> fall through to lp_build_select_bitwise().
>
> This doesn't seem to be the only breakage.  lp_test_format dies with:
>
> LLVM ERROR: Cannot select: 0x231e090: v4i32 = X86ISD::UMIN 0x2346b70, 0x231bf80 [ORD=5] [ID=30]
>   0x2346b70: v4i32 = bitcast 0x2346840 [ORD=3] [ID=29]
>     0x2346840: v2i64 = scalar_to_vector 0x2346c80 [ORD=3] [ID=27]
>       0x2346c80: i64,ch = load 0x236abd0, 0x231b3d0, 0x231ba30<LD8[%4](align=4)> [ORD=3] [ID=24]
>         0x231b3d0: i64,ch = CopyFromReg 0x236abd0, 0x231b2c0 [ORD=1] [ID=20]
>           0x231b2c0: i64 = Register %vreg1 [ID=2]
>         0x231ba30: i64 = undef [ID=4]
>   0x231bf80: v4i32 = bitcast 0x231be70 [ORD=5] [ID=28]
>     0x231be70: v2i64,ch = load 0x236abd0, 0x231d920, 0x231ba30<LD16[ConstantPool]> [ORD=5] [ID=25]
>       0x231d920: i64 = X86ISD::Wrapper 0x231e3c0 [ID=22]
>         0x231e3c0: i64 = TargetConstantPool<<4 x i32> <i32 1, i32 1, i32 1, i32 1>> 0 [ID=14]
>       0x231ba30: i64 = undef [ID=4]
> In function: fetch_r32g32_uscaled_unorm8
>
> lp_test_arit dies with:
>
> LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse41.round.ps
>
> lp_test_conv dies with:
>
> LLVM ERROR: Cannot select: 0xd79a20: v4i32 = X86ISD::SMIN 0xd796f0, 0xd794d0 [ORD=8] [ID=25]
>   0xd796f0: v4i32 = X86ISD::SMAX 0xd79f70, 0xd696b0 [ORD=6] [ID=23]
>     0xd79f70: v4i32 = bitcast 0xd698d0 [ORD=4] [ID=21]
>       0xd698d0: v2i64,ch = load 0xd43920, 0xd69380, 0xd69050<LD16[%3]> [ORD=4] [ID=17]
>         0xd69380: i64 = add 0xd68c10, 0xd69270 [ORD=3] [ID=13]
>           0xd68c10: i64,ch = CopyFromReg 0xd43920, 0xd68b00 [ORD=1] [ID=8]
>             0xd68b00: i64 = Register %vreg0 [ID=1]
>           0xd69270: i64 = Constant<16> [ID=4]
>         0xd69050: i64 = undef [ID=3]
>     0xd696b0: v4i32 = bitcast 0xd695a0 [ORD=5] [ID=18]
>       0xd695a0: v2i64,ch = load 0xd43920, 0xd793c0, 0xd69050<LD16[ConstantPool]> [ORD=5] [ID=14]
>         0xd793c0: i64 = X86ISD::Wrapper 0xd7a2a0 [ID=10]
>           0xd7a2a0: i64 = TargetConstantPool<<4 x i32> <i32 -32768, i32 -32768, i32 -32768, i32 -32768>> 0 [ID=6]
>         0xd69050: i64 = undef [ID=3]
>   0xd794d0: v4i32 = bitcast 0xd795e0 [ORD=7] [ID=19]
>     0xd795e0: v2i64,ch = load 0xd43920, 0xd79910, 0xd69050<LD16[ConstantPool]> [ORD=7] [ID=15]
>       0xd79910: i64 = X86ISD::Wrapper 0xd7a190 [ID=11]
>         0xd7a190: i64 = TargetConstantPool<<4 x i32> <i32 32767, i32 32767, i32 32767, i32 32767>> 0 [ID=7]
>       0xd69050: i64 = undef [ID=3]
> In function: test
>
> All of the above lp_test_* failures can be triggered by setting
> LP_NATIVE_VECTOR_WIDTH when running make check so I don't think my kvm
> patch is to blame.
>
> I'm a little out of my depth trying to track this down, any ideas?
>
> - ajax

