MTRR setting failure

Mike A. Harris mharris at www.linux.org.uk
Mon Mar 21 02:46:14 PST 2005


Thomas Hellström wrote:
> Hi!
> 
> It seems that the Linux MTRR setting code, both in Xorg and in 2.6 and
> later kernels, is not doing its job properly:
> 
> The background is that some BIOSes do not set up MTRRs correctly, but 
> just set up write-combining for a small part of the framebuffer memory. 
> When the X server tries to set its MTRR this fails because there are 
> overlapping regions of different sizes. I've seen this in most VIA 
> BIOSes and also in newer Dell workstations with ATI Radeons.
> 
> In the via/unichrome driver we have an ugly workaround which first maps 
> the framebuffer as MMIO (uncached) to get rid of any offending MTRR 
> region, then unmaps it and remaps it as write-combining.
> 
> On 2.6 series kernels, removing a WC region sometimes does not work the 
> first time it is tried. You have to give the command twice to make it 
> work. I don't know why. I never saw this in the 2.4 series.
> 
> This makes the code even more ugly, having to map framebuffer as MMIO 
> twice.
> 
> I suggest the following:
> 
> 1. Change the linux X framebuffer mapping code to remove any offending 
> MTRR regions, possibly saving them for a clean exit.
> 2. Have the code double check if the removal really went through. If 
> not, do it again, repeat a limited number of times.
> 3. Maybe implement the newer ioctl removal instead of fprintf(mtrrfile, 
> "disable=%d\n", region); ?
> 
> Failure to properly set up MTRRs can be a serious performance degrader, 
> particularly when using Xv, and the user will rarely find out the cause 
> of the slowness.
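
As a rough illustration of suggestions 1 and 2 quoted above, here is a
minimal sketch of "remove the offending regions, verify each removal,
retry a limited number of times, and remember what was removed". All
names in it (mtrr_ops, remove_region_checked, clear_conflicting_regions,
fake_mtrr) are hypothetical and are not taken from the X server or the
via driver. The backend is abstracted behind two callbacks. On Linux
they would presumably wrap the MTRRIOC_GET_ENTRY and MTRRIOC_KILL_ENTRY
ioctls on /proc/mtrr from <asm/mtrr.h>, but that part is left out so
the logic can be run anywhere. The fake backend only simulates the
"first removal does not take" behaviour described for 2.6 kernels.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_NREGS 8

/* One MTRR register; size == 0 means the register is unused. */
struct mtrr_region { uint64_t base, size; };

/* Hypothetical backend interface (assumption, not an existing API). */
struct mtrr_ops {
    bool (*get)(void *ctx, unsigned reg, struct mtrr_region *out);
    bool (*del)(void *ctx, unsigned reg);
    void *ctx;
};

/* True if the two half-open ranges intersect (no overflow handling). */
static bool ranges_overlap(uint64_t a_base, uint64_t a_size,
                           uint64_t b_base, uint64_t b_size)
{
    if (a_size == 0 || b_size == 0)
        return false;
    return a_base < b_base + b_size && b_base < a_base + a_size;
}

/* Delete one register and re-read it after every attempt.
 * Returns the number of attempts used, or -1 if it is still set. */
static int remove_region_checked(const struct mtrr_ops *ops,
                                 unsigned reg, int max_tries)
{
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        struct mtrr_region now;
        (void)ops->del(ops->ctx, reg);  /* do not trust the return value */
        if (ops->get(ops->ctx, reg, &now) && now.size == 0)
            return attempt;
    }
    return -1;
}

/* Remove every register that overlaps the framebuffer without matching
 * it exactly, saving the old contents so they can be restored on exit.
 * Returns the number of saved regions, or -1 on failure. */
static int clear_conflicting_regions(const struct mtrr_ops *ops,
                                     unsigned nregs,
                                     uint64_t fb_base, uint64_t fb_size,
                                     struct mtrr_region *saved,
                                     size_t saved_cap, int max_tries)
{
    size_t n = 0;
    for (unsigned r = 0; r < nregs; r++) {
        struct mtrr_region cur;
        if (!ops->get(ops->ctx, r, &cur))
            continue;
        if (!ranges_overlap(cur.base, cur.size, fb_base, fb_size))
            continue;
        if (cur.base == fb_base && cur.size == fb_size)
            continue;                   /* already covers exactly the fb */
        if (n == saved_cap)
            return -1;
        if (remove_region_checked(ops, r, max_tries) < 0)
            return -1;
        saved[n++] = cur;
    }
    return (int)n;
}

/* In-memory stand-in for the kernel, for demonstration only. */
struct fake_mtrr {
    struct mtrr_region regs[FAKE_NREGS];
    int dels_to_ignore;  /* this many deletes "succeed" but do nothing */
};

static bool fake_get(void *ctx, unsigned reg, struct mtrr_region *out)
{
    struct fake_mtrr *f = ctx;
    if (reg >= FAKE_NREGS)
        return false;
    *out = f->regs[reg];
    return true;
}

static bool fake_del(void *ctx, unsigned reg)
{
    struct fake_mtrr *f = ctx;
    if (reg >= FAKE_NREGS)
        return false;
    if (f->dels_to_ignore > 0) {
        f->dels_to_ignore--;
        return true;
    }
    f->regs[reg].base = 0;
    f->regs[reg].size = 0;
    return true;
}
```

A caller would fill a struct mtrr_ops with its real callbacks, call
clear_conflicting_regions() with the framebuffer base and size before
adding its own write-combining region, and keep the saved[] array
around to put the old regions back on a clean exit. Treating every
overlapping, non-identical region as offending is a simplification. A
real implementation would probably also look at the region type.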

More and more people are reporting bugs about MTRRs either not getting 
set up at all, or not getting set up correctly.  Others point out the 
MTRR warnings in their X logs when reporting some completely unrelated 
bug, grasping at straws as they try to find something to explain the 
various problems they experience.

It's sad because MTRR is nothing but a nasty, ugly x86 hack.  A hack that 
was only really necessary for a year or two worth of hardware, until 
later Intel CPUs (and AMD, etc.) came out with PAT support in hardware, 
which more or less makes MTRRs completely unnecessary, as long as the OS 
kernel supports the PAT features of the processor(s).

Some Intel CPUs have hardware bugs in the PAT feature, which is 
unfortunate.  Regardless of that, Microsoft seems to have implemented 
PAT in Windows without much difficulty for many years now, and it more 
or less just works.

What is really needed is for the Linux kernel (and other OS kernels) to 
acquire PAT support, as the hardware feature has existed since the 
Pentium Pro and later, and all Athlon CPUs.  Not sure about AMD K6/K5 or 
other brands/models.  The X server should then try to use the interfaces 
the kernel would provide to set cacheability at page-level granularity, 
and only try to use MTRRs if the system does not support PAT.

This would solve numerous problems all at once.
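
The "PAT first, MTRR only as a fallback" policy described above might
be sketched as follows. This is only an illustration under stated
assumptions. It takes the CPU capability from the "flags" line of
/proc/cpuinfo, where Linux on x86 lists "pat" and "mtrr" as separate
words. The kernel_has_pat_api argument is hypothetical, because no such
kernel interface exists yet. The names cpu_flags_contain, wc_method and
choose_wc_method are made up for this example.

```c
#include <stdbool.h>
#include <string.h>

/* True if `flag` appears as a whole word in a /proc/cpuinfo flags line. */
static bool cpu_flags_contain(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;
    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t'
                      || p[-1] == ':';
        bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\t'
                    || p[len] == '\n';
        if (starts && ends)
            return true;
        p++;  /* keep scanning after a partial match such as "patx" */
    }
    return false;
}

enum wc_method { WC_VIA_PAT, WC_VIA_MTRR, WC_NONE };

/* Prefer page-level PAT when both CPU and kernel support it,
 * fall back to MTRRs otherwise. */
static enum wc_method choose_wc_method(const char *flags,
                                       bool kernel_has_pat_api)
{
    if (kernel_has_pat_api && cpu_flags_contain(flags, "pat"))
        return WC_VIA_PAT;
    if (cpu_flags_contain(flags, "mtrr"))
        return WC_VIA_MTRR;
    return WC_NONE;
}
```

With today's kernels kernel_has_pat_api would always be false, so the
function would pick MTRRs whenever the CPU has them, which is the
current behaviour.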

Jeff Hartmann wrote PAT patches for the Linux kernel a few years ago,
and submitted them to the linux-kernel mailing list.  This first
implementation was inadequate, and there was discussion about how to
move forward, but nothing happened for many months.

A year or so later, Nvidia (Andy Ritger?  I don't remember
specifically...) posted PAT patches to lkml also, which were more
complete and reasonable, but still not quite "there".  After this, there
have been a few more occasions where PAT discussion occurred on lkml,
but to date, it seems nobody has ever completed a working PAT
implementation that is ready for primetime and submitted it to Linus.

Due to the increasing number of MTRR related problem reports, it would
be a good project for someone to take on, to finish off PAT support
and get it accepted in the Linux kernel officially.

Anyone out there interested?



More information about the xorg mailing list