[Liboil] alignment problems with sse2
M.Drochner at fz-juelich.de
Thu Aug 31 15:28:51 PDT 2006
ds at schleef.org said:
> This is a known problem, see http://bugs.debian.org/cgi-bin/
Thanks, this at least gives me an idea what to expect
from gcc and what not. I don't think it is a good idea
to expect a stack alignment which is stricter than what
the official ABI requires. As it appears, this won't change
anytime soon, so we have to live with it.
So the wrapper enforcing 16-byte alignment is basically OK
as I see it. Not nice, but reasonable in that situation.
> The best solution at this point (not yet implemented) is to fix
> all the liboil functions to not use local variables for SSE.
I wouldn't call that a fix but a workaround.
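For illustration only: one way to stop depending on whatever stack alignment the compiler provides is to over-allocate a scratch buffer and round its address up to a 16-byte boundary by hand. This is a minimal sketch, not liboil code; the helper name align_up16 is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of liboil): round a pointer up to the
 * next 16-byte boundary. The caller must have reserved at least 15
 * spare bytes behind the area it actually needs.
 */
static void *align_up16(void *p)
{
    uintptr_t a = (uintptr_t)p;

    /* Add 15, then clear the low four bits. */
    return (void *)((a + 15u) & ~(uintptr_t)15u);
}
```

A function that needs 64 aligned bytes would declare "unsigned char raw[64 + 15]" and use align_up16(raw) as its SSE scratch area, whatever the incoming stack pointer looks like.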
In the process, I've stumbled over some related facets of the
problem which, while not of immediate help, you might find
interesting:
-On a misaligned SSE2 memory access, the x86 CPU raises a
general protection fault (GPF).
-It is not easy to find out the address of the (misaligned)
data reference in the trap handler. The CPU sets neither CR2
nor any other register to that address. The only way I can
imagine is to decode the instruction at %pc, look at the
registers involved, etc.
-Linux apparently issues a SIGSEGV on such GPFs. (NetBSD
does so too, but I might be motivated to change this as soon
as I'm sure there are no bad effects on other programs.)
-With SA_SIGINFO, a SIGSEGV handler is expected to get the
faulting address as "si_addr". I don't know Linux internals
well enough to tell whether they bother to do that correctly.
NetBSD as of now doesn't. I'm aware that liboil doesn't use
SA_SIGINFO, but issuing different signals depending on
whether siginfo is requested or not would be nonsense.
-So I'd say it makes more sense to issue a SIGILL at that point.
For SIGILL, "si_addr" is expected to contain the address of the
faulting instruction, which is easily obtained.
(The Linux "sigaction" manpage is a bit fishy here, but other
UNIXes are quite clear.)
-liboil happens to catch SIGILL in the initial test functions,
so if the OS issued SIGILL on those alignment problems, the
functions would be sorted out early and the application would
just work. (I've tested this on a modified NetBSD.)
It is of course not guaranteed that stack alignment is the same
during the initial test and in final deployment, but for me it
happens to be the case -- or it is just the test which exhibits
the problems, I don't know yet.
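For comparison, here is a minimal sketch of what SA_SIGINFO delivers for an ordinary page fault, where the kernel does know the faulting data address. It assumes a POSIX system with anonymous mmap (e.g. Linux with glibc); the names on_fault, read_faults and map_noaccess_page are invented for this example. For the SSE2 alignment GPF described above, "si_addr" cannot be expected to be this useful.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* glibc: expose sigaction and MAP_ANONYMOUS */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

static sigjmp_buf fault_env;
static void *volatile fault_addr;          /* si_addr seen by the handler */
static volatile sig_atomic_t fault_signo;  /* signal number seen by the handler */

/* SA_SIGINFO-style handler: record what the kernel reported, then bail out. */
static void on_fault(int signo, siginfo_t *info, void *ctx)
{
    (void)ctx;
    fault_signo = signo;
    fault_addr = info->si_addr;
    siglongjmp(fault_env, 1);
}

/* Map one inaccessible page; any access to it raises SIGSEGV. */
static void *map_noaccess_page(void)
{
    void *p = mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Read one byte at addr. Returns 1 if the read raised SIGSEGV, else 0. */
static int read_faults(const volatile unsigned char *addr)
{
    struct sigaction sa, old;
    int faulted;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old);

    if (sigsetjmp(fault_env, 1) == 0) {
        unsigned char c = *addr;   /* may fault */
        (void)c;
        faulted = 0;
    } else {
        faulted = 1;               /* arrived here from on_fault() */
    }

    sigaction(SIGSEGV, &old, NULL); /* restore the previous handler */
    return faulted;
}
```

Calling read_faults() on an address inside the page returned by map_noaccess_page() leaves the signal number in fault_signo and the reported address in fault_addr.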
This suggests catching SIGSEGV as well during the initial
tests. Not a clean thing, but helpful.
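A minimal sketch of such a test wrapper, catching both SIGILL and SIGSEGV and recovering with sigsetjmp/siglongjmp. This is not liboil's actual code; probe_function and the demo_* functions are hypothetical names, and the demo functions merely raise the signals to stand in for a faulting implementation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction and sigsetjmp */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* Jump back into probe_function(), passing the signal number along. */
static void probe_handler(int signo)
{
    siglongjmp(probe_env, signo);
}

/*
 * Hypothetical helper: run fn() with SIGILL and SIGSEGV caught.
 * Returns 0 if fn() completed, otherwise the signal that stopped it.
 */
static int probe_function(void (*fn)(void))
{
    struct sigaction sa, old_ill, old_segv;
    int result;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, &old_ill);
    sigaction(SIGSEGV, &sa, &old_segv);

    /* Saving the signal mask (second argument 1) lets siglongjmp
     * unblock the signal that was being handled. */
    switch (sigsetjmp(probe_env, 1)) {
    case 0:
        fn();
        result = 0;
        break;
    case SIGILL:
        result = SIGILL;
        break;
    default:
        result = SIGSEGV;
        break;
    }

    /* Restore whatever handlers the application had installed. */
    sigaction(SIGILL, &old_ill, NULL);
    sigaction(SIGSEGV, &old_segv, NULL);
    return result;
}

/* Stand-ins for candidate implementations. */
static void demo_ok(void) { }
static void demo_raises_sigill(void) { raise(SIGILL); }
static void demo_raises_sigsegv(void) { raise(SIGSEGV); }
```

An implementation for which probe_function() returns nonzero would simply be excluded from the set of usable functions, whichever of the two signals the OS chose to deliver.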