Bug in module loading system on x86-64?
Kendall Bennett
KendallB at scitechsoft.com
Fri Oct 1 11:46:32 PDT 2004
Hi All,
We accidentally found what appears to be a strange bug in the module
loading system for the X server that might still be present in the latest
versions (we were building with R6.7.0 at the time). Normally this
problem would never surface in practice, but someone who knows more about
the module loading system may want to look into the problem.
Anyway, here is what happens. We had a minor bug in a piece of code in
our X module such that when we ported it from 32-bit to 64-bit on our
AMD64 machine, our code incorrectly read a value from disk as 64-bit when
it was supposed to be 32-bit. Then it tried to malloc a *huge* amount of
memory (20G or more I think). This naturally failed because the machine
only has 512M of physical memory in it and the swap file is only about
768M in size. Our code happily accepted the failure condition and just
failed to access our monitor database (a non-fatal condition), but then
the X server crashed badly when trying to load specific modules. What
happened is that prior to the huge malloc() call, all modules that were
being loaded (in fact all xalloc() calls) allocated memory very low in
the address space. Once the huge malloc() call was attempted (but
failed), glibc started allocating all new blocks way up in the virtual
memory space for some reason (way above the 4G boundary).
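The width mix-up described above is the kind of bug that fixed-width types avoid. Here is a minimal sketch, assuming a hypothetical on-disk format whose record count is a 32-bit little-endian field. The names read_u32_le, alloc_records and MAX_RECORDS are illustrative only and are not taken from our module:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sanity limit for a record count read from disk. */
#define MAX_RECORDS 65536u

/* Decode a 32-bit little-endian field from a raw byte buffer. The result
 * is a uint32_t on both 32-bit and 64-bit builds, unlike `long`, which is
 * 64 bits wide on AMD64 Linux. */
static uint32_t read_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Allocate `count` zeroed records of `size` bytes each. Returns NULL when
 * the count is implausible or when the allocation itself fails. */
static void *alloc_records(uint32_t count, size_t size)
{
    if (count == 0 || count > MAX_RECORDS)
        return NULL;
    return calloc(count, size);
}

/* Usage: a count of 16 followed by four unrelated 0xff bytes. Reading all
 * eight bytes as one 64-bit little-endian value would instead give
 * 0xffffffff00000010, which is the kind of huge size described above. */
static void demo_read(void)
{
    const unsigned char raw[8] = {0x10, 0x00, 0x00, 0x00,
                                  0xff, 0xff, 0xff, 0xff};
    uint32_t count = read_u32_le(raw);
    assert(count == 16u);

    void *records = alloc_records(count, 64);
    assert(records != NULL);
    free(records);
}
```

Decoding byte by byte also keeps the result independent of the host's integer sizes, and the limit check rejects a corrupt count before it ever reaches the allocator.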
Now normally this should not be a problem, but it caused major issues
with the module loader, specifically when resolving symbols. When we
tried to load the DDC module, or later the framebuffer module once we had
disabled the DDC module, none of the symbols were resolved correctly, and
as soon as we tried to call any of the functions within the module the X
server crashed (the pointers were basically bogus).
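We have not confirmed why the pointers come out bogus. One possibility, and it is only an assumption, is that an address gets squeezed through a 32-bit quantity somewhere during symbol resolution. The sketch below is not loader code. It only shows what such a truncation would do to an address above the 4G boundary, and the function names and example addresses are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of an address, as would happen if a 64-bit
 * address were stored in a 32-bit field (an assumed failure mode, not a
 * confirmed cause). */
static uint64_t truncate_to_32(uint64_t addr)
{
    return (uint64_t)(uint32_t)addr;
}

/* Nonzero when the address is unchanged by a round trip through 32 bits. */
static int survives_truncation(uint64_t addr)
{
    return truncate_to_32(addr) == addr;
}

/* Usage: a low address is unaffected, a high one loses its upper bits. */
static void demo_truncation(void)
{
    assert(survives_truncation(0x0804a000ULL));
    assert(!survives_truncation(0x2aaaab000010ULL));
    assert(truncate_to_32(0x2aaaab000010ULL) == 0xab000010ULL);
}
```

If something like this is happening, it would explain why everything works while all allocations stay low in the address space and breaks only once blocks start landing above 4G.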
Once we fixed the huge malloc problem in our code, the X server started
working as per normal. Normally the X server is never going to allocate
that much memory to cause this condition, so this is not really a serious
issue for production code. But it is still interesting and someone may
want to look into it. We would look into it but we don't have time at the
moment, so I figured I should at least mention it on this list (maybe it
can get added to Bugzilla?).
I am sure you could reproduce the problem with any existing driver module
just by attempting to allocate a huge amount of memory prior to loading
any modules, and then the next module symbol you try to call will crash.
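As a starting point for anyone who wants to try this, here is a rough standalone probe, assuming a 64-bit build. It attempts one oversized allocation and then reports where the next small block lands. The function names are hypothetical and this is not X server code. Whether the big request fails, and where later blocks end up, depends on the glibc version and on the kernel's overcommit settings, so the result may differ from what we saw:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define FOUR_GIB 0x100000000ULL

/* Nonzero when the numeric address lies at or above the 4G boundary. */
static int addr_is_above_4g(uint64_t addr)
{
    return addr >= FOUR_GIB;
}

/* Try one oversized allocation, release it if it happened to succeed,
 * then make a small allocation and report where it landed.
 * Returns 1 if the small block is above 4G, 0 if below, and -1 if the
 * small allocation itself failed. */
static int probe_after_huge_malloc(size_t huge_bytes)
{
    void *huge = malloc(huge_bytes);
    int huge_ok = (huge != NULL);
    free(huge);                     /* free(NULL) is a harmless no-op */

    void *small_block = malloc(64);
    if (small_block == NULL)
        return -1;

    int high = addr_is_above_4g((uint64_t)(uintptr_t)small_block);
    printf("huge malloc %s; small block at %p (%s 4G)\n",
           huge_ok ? "succeeded" : "failed",
           small_block,
           high ? "above" : "below");
    free(small_block);
    return high;
}

/* Self-check for the boundary helper. */
static void demo_boundary(void)
{
    assert(!addr_is_above_4g(0xffffffffULL));
    assert(addr_is_above_4g(FOUR_GIB));
}
```

Calling probe_after_huge_malloc with something like (size_t)20 << 30 from inside a module, before any other module is loaded, should show whether the allocator has moved above the 4G boundary on a given machine.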
Regards,
---
Kendall Bennett
Chief Executive Officer
SciTech Software, Inc.
Phone: (530) 894 8400
http://www.scitechsoft.com
~ SciTech SNAP - The future of device driver technology! ~