Bug in module loading system on x86-64?

Mon Oct 4 08:08:56 PDT 2004

I forgot some of the details, but here are some facts I remember:
- The code segment on AMD64 resides in low 32bit memory while huge data 
  chunks are allocated in another address range using mmap(). 
- The memory model used in x86_64 code doesn't allow certain addressing
  schemes like addressing code outside of the 32bit memory range.
- The loader loads the modules into data segments it allocates.
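
To illustrate the range restriction in the second point, here is a minimal
sketch of the kind of check a loader could make before patching a 32-bit
reference. It is not taken from the loader source; the function names
fits_rel32() and fits_abs32() are hypothetical, and the assumption is that
the module code uses 32-bit PC-relative and 32-bit absolute references, as
the default small code model on x86-64 does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: can a 32-bit PC-relative reference whose 4-byte
 * displacement field lives at address `site` reach `target`?
 * The CPU adds the signed displacement to the address of the next
 * instruction, which for a trailing displacement is site + 4. */
static int fits_rel32(uint64_t site, uint64_t target)
{
    int64_t disp;

    assert(site <= UINT64_MAX - 4);          /* site + 4 must not wrap */
    disp = (int64_t)(target - (site + 4));   /* signed distance */
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

/* Hypothetical helper: can `addr` be stored in a zero-extended 32-bit
 * absolute field? Only addresses below 4 GiB qualify. */
static int fits_abs32(uint64_t addr)
{
    return addr <= UINT32_MAX;
}
```

For instance, fits_rel32(0x400000, 0x2000000000) returns 0, because the
target lies far more than 2 GiB away from the reference site.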

The data segment may be allocated anywhere in the 64bit memory space,
so references to a module loaded there may end up out of range and fail.
We can circumvent this by using mmap() for allocation together with the
MAP_32BIT flag (see loader.c or elfloader.c). 
For some reason mmap() seems to take this flag only as a hint and allocates
the memory wherever it can if there isn't any memory available in the
low 32bit address range.
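
Given that behaviour, one defensive approach is to verify the address that
mmap() returns instead of trusting the flag. The following is a minimal
sketch, not the code from loader.c or elfloader.c; the helper name
alloc_low() is hypothetical, and it assumes Linux on x86-64, where
MAP_32BIT is defined in <sys/mman.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes MAP_32BIT and MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MAP_32BIT
#define MAP_32BIT 0             /* flag unavailable: rely on the check below */
#endif

/* Hypothetical helper: map `len` bytes for module code and data, and
 * return NULL unless the whole mapping lies below the 4 GiB boundary. */
static void *alloc_low(size_t len)
{
    void *p;

    assert(len > 0);            /* mmap() rejects zero-length mappings */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Do not trust the flag: reject anything reaching past 4 GiB. */
    if ((uint64_t)(uintptr_t)p + len > (1ULL << 32)) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

A caller would treat a NULL return as "module cannot be loaded" and report
an error, rather than resolving symbols against an address the code cannot
reach.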

Egbert.

Kendall Bennett writes:
 > Hi All,
 > 
 > We accidentally found what appears to be a strange bug in the module 
 > loading system for the X server that might still be present in the latest 
 > versions (we were building with R6.7.0 at the time). Normally this 
 > problem would never surface in practice, but someone who knows more about 
 > the module loading system may want to look into the problem.
 > 
 > Anyway, here is what happens. We had a minor bug in a piece of code in 
 > our X module such that when we ported it from 32-bit to 64-bit on our 
 > AMD64 machine, our code incorrectly read a value from disk as 64-bit when 
 > it was supposed to be 32-bit. Then it tried to malloc a *huge* amount of 
 > memory (20G or more I think). This naturally failed because the machine 
 > only has 512M of physical memory in it and the swap file is only about 
 > 768M in size. Our code happily accepted the failure condition and just 
 > failed to access our monitor database (a non-fatal condition) but then 
 > the X server crashed badly when trying to load specific modules. What 
 > happened is that prior to the huge malloc() call, all modules that were 
 > being loaded (in fact all xalloc() calls) allocated memory very low in 
 > the address space. Once the huge malloc() call was attempted (but 
 > failed), glibc started allocating all new blocks way up in the virtual 
 > memory space for some reason (way above the 4G boundary).
 > 
 > Now normally this should not be a problem, but it caused major issues 
 > with the module loader, specifically when resolving symbols. When we 
 > tried to load either the DDC module or later the framebuffer module when 
 > we disabled the DDC module, none of the symbols were resolved correctly 
 > and as soon as we tried to call any of the functions within the module the X 
 > server crashed (the pointers were basically bogus).
 > 
 > Once we fixed the huge malloc problem in our code, the X server started 
 > working as per normal. Normally the X server is never going to allocate 
 > that much memory to cause this condition, so this is not really a serious 
 > issue for production code. But it is still interesting and someone may 
 > want to look into it. We would look into it but we don't have time at the 
 > moment so I figured I should at least mention it on this list (maybe it 
 > can get added to Bugzilla)?
 > 
 > I am sure you could reproduce the problem with any existing driver module 
 > just by attempting to allocate a huge amount of memory prior to loading 
 > any modules, and then the next module symbol you try to call will crash.
 > 
 > Regards,
 > 
 > ---
 > Kendall Bennett
 > Chief Executive Officer
 > SciTech Software, Inc.
 > Phone: (530) 894 8400
 > http://www.scitechsoft.com
 > 
 > ~ SciTech SNAP - The future of device driver technology! ~