[Libva] vainfo crashes with "Program received signal SIGFPE, Arithmetic exception." Need help to troubleshoot.
dar8757 at eml.cc
Tue Feb 26 07:46:24 PST 2013
Hi Gwenole,
> How different is box #1 from box #2, i.e. chipset?
Similar, but not identical, hardware. Same OS and software
versions.
@ box1
hwinfo --cpu
01: None 00.0: 10103 CPU
[Created at cpu.374]
Hardware Class: cpu
Arch: X86-64
Vendor: "AuthenticAMD"
Model: 16.4.2 "AMD Phenom(tm) II X4 945 Processor"
Features:
fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ht,syscall,nx,mmxext,fxsr_opt,pdpe1gb,rdtscp,lm,3dnowext,3dnow,constant_tsc,rep_good,nopl,nonstop_tsc,extd_apicid,pni,monitor,cx16,popcnt,lahf_lm,cmp_legacy,svm,extapic,cr8_legacy,abm,sse4a,misalignsse,3dnowprefetch,osvw,ibs,skinit,wdt,hw_pstate,npt,lbrv,svm_lock,nrip_save
Clock: 1800 MHz
BogoMips: 6020.94
Cache: 512 kb
Units/Processor: 4
Config Status: cfg=no, avail=yes, need=no,
active=unknown
hwinfo --gfxcard
32: PCI 100.0: 0300 VGA compatible controller (VGA)
[Created at pci.319]
SysFS ID:
/devices/pci0000:00/0000:00:02.0/0000:01:00.0
SysFS BusID: 0000:01:00.0
Hardware Class: graphics card
Model: "nVidia G94 [GeForce 9600 GT]"
Vendor: pci 0x10de "nVidia Corporation"
Device: pci 0x0622 "G94 [GeForce 9600 GT]"
SubVendor: pci 0x1043 "ASUSTeK Computer Inc."
SubDevice: pci 0x82fc
Revision: 0xa1
Driver: "nvidia"
Driver Modules: "nvidia"
Memory Range: 0xfa000000-0xfaffffff
(rw,non-prefetchable)
Memory Range: 0xd0000000-0xdfffffff
(ro,non-prefetchable)
Memory Range: 0xf8000000-0xf9ffffff
(rw,non-prefetchable)
I/O Ports: 0xac00-0xac7f (rw)
Memory Range: 0xfbb80000-0xfbbfffff
(ro,non-prefetchable,disabled)
IRQ: 18 (1547193 events)
I/O Ports: 0x3c0-0x3df (rw)
Module Alias:
"pci:v000010DEd00000622sv00001043sd000082FCbc03sc00i00"
Driver Info #0:
Driver Status: nouveau is not active
Driver Activation Cmd: "modprobe nouveau"
Driver Info #1:
Driver Status: nvidia is active
Driver Activation Cmd: "modprobe nvidia"
Config Status: cfg=no, avail=yes, need=no,
active=unknown
Attached to: #10 (PCI bridge)
lsmod | grep nvidia
nvidia 9354600 48
nvidia-settings -v
nvidia-settings: version 310.32
(buildmeister at swio-display-x86-rhel47-01) Mon Jan 14
15:51:37 PST 2013
@ box2
hwinfo --cpu
01: None 00.0: 10103 CPU
[Created at cpu.374]
Unique ID: rdCR.j8NaKXDZtZ6
Hardware Class: cpu
Arch: X86-64
Vendor: "AuthenticAMD"
Model: 15.107.1 "AMD Athlon(tm) 64 X2 Dual Core
Processor 4000+"
Features:
fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ht,syscall,nx,mmxext,fxsr_opt,rdtscp,lm,3dnowext,3dnow,rep_good,nopl,extd_apicid,pni,cx16,lahf_lm,cmp_legacy,svm,extapic,cr8_legacy,3dnowprefetch,lbrv
Clock: 2100 MHz
BogoMips: 4218.62
Cache: 512 kb
Units/Processor: 2
Config Status: cfg=no, avail=yes, need=no,
active=unknown
hwinfo --gfxcard
33: PCI 300.0: 0300 VGA compatible controller (VGA)
[Created at pci.319]
SysFS ID:
/devices/pci0000:00/0000:00:04.0/0000:03:00.0
SysFS BusID: 0000:03:00.0
Hardware Class: graphics card
Model: "nVidia G98 [GeForce 8400 GS]"
Vendor: pci 0x10de "nVidia Corporation"
Device: pci 0x06e4 "G98 [GeForce 8400 GS]"
Revision: 0xa1
Driver: "nvidia"
Driver Modules: "nvidia"
Memory Range: 0xfd000000-0xfdffffff
(rw,non-prefetchable)
Memory Range: 0xd0000000-0xdfffffff
(ro,non-prefetchable)
Memory Range: 0xfa000000-0xfbffffff
(rw,non-prefetchable)
I/O Ports: 0xdc00-0xdc7f (rw)
Memory Range: 0xfeae0000-0xfeafffff
(ro,non-prefetchable,disabled)
IRQ: 19 (7441 events)
I/O Ports: 0x3c0-0x3df (rw)
Module Alias:
"pci:v000010DEd000006E4sv00000000sd00000000bc03sc00i00"
Driver Info #0:
Driver Status: nouveau is not active
Driver Activation Cmd: "modprobe nouveau"
Driver Info #1:
Driver Status: nvidia is active
Driver Activation Cmd: "modprobe nvidia"
Config Status: cfg=new, avail=yes, need=no,
active=unknown
Attached to: #17 (PCI bridge)
lsmod | grep nvidia
nvidia 9354600 38
nvidia-settings -v
nvidia-settings: version 310.32
(buildmeister at swio-display-x86-rhel47-01) Mon Jan 14
15:51:37 PST 2013
> Could you please try to figure out how __vaDriverInit() failed on box #2?
...
> - XOpenDisplay() failed -- do you have a correct DISPLAY or are you
> issuing vainfo right from box #2 terminal ?
Hm. Perhaps an "Aha!" moment ... ??
I've been issuing both boxes' `vainfo` commands in terminals opened on
box #1.
In that case,
@ box #1
vainfo
libva info: VA-API version 0.33.0
libva info: va_getDriverName() returns 0
libva info: Trying to open
/usr/local/lib64/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_0_33
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.33 (libva 1.1.1.pre1)
vainfo: Driver version: Splitted-Desktop Systems VDPAU
backend for VA-API - 0.7.5.pre1
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointVLD
VAProfileH264High : VAEntrypointVLD
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
@ box #2
> Do you at least see the "vainfo: Driver version:" message?
NOT if I exec `vainfo` in the box #2 terminal window on box #1; i.e., I
see:
vainfo
libva info: VA-API version 0.33.0
libva info: va_getDriverName() returns 0
libva info: Trying to open
/usr/local/lib64/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_0_33
Floating point exception
BUT,
> - XOpenDisplay() failed -- do you have a correct DISPLAY or are you
> issuing vainfo right from box #2 terminal ?
if I open a terminal *ON* box #2, and exec `vainfo` _there_, it works!
vainfo
libva info: VA-API version 0.33.0
libva info: va_getDriverName() returns 0
libva info: Trying to open
/usr/local/lib64/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_0_33
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.33 (libva 1.1.1.pre1)
vainfo: Driver version: Splitted-Desktop Systems VDPAU
backend for VA-API - 0.7.5.pre1
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointVLD
VAProfileH264High : VAEntrypointVLD
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
I _believe_ I have a correct DISPLAY. At least, while ON box #1, in a
terminal window opened to a box #2 shell, I can:
(1) verify
echo $DISPLAY
localhost:10.0
(2) launch a graphical/ncurses app, e.g. `mc`
(3) launch a remote X app, e.g., `xterm`
TBH, I don't know if that's what `vainfo` is expecting ...
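For what it's worth, a DISPLAY of localhost:10.0 is what ssh X11 forwarding typically sets, which would mean X requests from the box #2 shell go to box #1's X server rather than to box #2's own. Whether that is what trips up the VDPAU backend is only a guess on my part. A minimal sketch for telling the two cases apart follows; `classify_display` is a hypothetical helper, not part of libva or vainfo, and the "forwarded" label is a heuristic based on the host part alone.

```shell
#!/bin/sh
# Hypothetical helper (not part of libva or vainfo): classify an X11
# DISPLAY string by its host part. The "forwarded" label is a heuristic:
# a loopback host such as localhost:10.0 is what ssh X11 forwarding
# usually sets, but it is not proof of forwarding.
classify_display() {
    case "$1" in
        "")                       echo "unset" ;;
        :*|unix:*)                echo "local" ;;
        localhost:*|127.0.0.1:*)  echo "forwarded" ;;
        *)                        echo "remote" ;;
    esac
}

# Classify the DISPLAY of the current shell.
classify_display "${DISPLAY:-}"
```

Run in the box #2 shell opened from box #1, this would report the localhost:10.0 value as "forwarded". In a terminal on box #2 itself, DISPLAY is usually something like :0, which it reports as "local".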
So, before digging around further for
> There are multiple bugs:
> ...
> There are multiple causes to your initialization failures in your box #2:
> ...
since `vainfo` seems to work when exec'd ON the physical box #2, is it
PEBKAC & "problem solved"? Or are there still issues here?
Cheers.
P.S. Out of curiosity, I removed my self-built vainfo installs and
their prerequisites, and installed `vainfo` from the distro's packages
on both machines. THAT build fails, exactly as reported here, in ALL
cases: on both boxes, whether run from a remote shell or a local
terminal.