[Intel-gfx] [RFC] algorithm for handling bad cachelines

Daniel Vetter daniel at ffwll.ch
Tue Mar 27 16:50:39 CEST 2012


On Tue, Mar 27, 2012 at 07:19:43AM -0700, Ben Widawsky wrote:
> I wanted to run this by folks before I start doing any actual work.
> 
> This is primarily for GPGPU, or perhaps *really* accurate rendering
> requirements.
> 
> IVB+ has an interrupt to tell us when a cacheline seems to be going bad.
> There is also a mechanism to remap the bad cachelines. The
> implementation details aren't quite clear to me yet, but I'd like to
> enable this feature for userspace.
> 
> Here is my current plan, but it involves filesystem access, so it's
> probably going to get a lot of flames.
> 
> 1. Handle cache line going bad interrupt.
> <After n number of these interrupts to the same line,>
> 2. send a uevent
> 2.5 reset the GPU (docs tell us to)
> <On module load>
> 3. Read a module parameter with a path in the filesystem
> of the list of bad lines. It's not clear to me yet exactly what I need
> to store, but it should be a relatively simple list.

... a path in the filesystem is a no-go for a kernel interface. So the bad
cachelines need to go into the module parameter itself. Or we add a sysfs
interface and reset the gpu (because, if my understanding is right, we
can't disable cachelines once the gpu has used them).

> 4. Parse list on driver load, and handle as necessary.
> 5. goto 1.
> 
> Probably the biggest unanswered question is exactly when in the HW
> loading do we have to finish remapping. If it can happen at any time
> while the card is running, I don't need the filesystem stuff, but I
> believe I need to remap the lines quite early in the device bootstrap.

I believe so, too ;-)
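
For steps 1 and 2 of the plan quoted above (count interrupts per line,
act after n of them), the bookkeeping could look roughly like the sketch
below. This is plain userspace C, not driver code. NUM_LINES,
BAD_THRESHOLD, struct line_errors and note_line_error() are all names
made up for illustration, and the values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_LINES     128 /* assumption: arbitrary number of tracked lines */
#define BAD_THRESHOLD 3   /* assumption: the "n" from the plan */

/* Hypothetical per-device state: one error counter per cacheline. */
struct line_errors {
	unsigned int count[NUM_LINES];
};

/*
 * Record one "line going bad" interrupt for the given line.
 * Returns true exactly once per line, on the interrupt that reaches
 * BAD_THRESHOLD. That is the point where a real driver would send the
 * uevent and reset the GPU. Later interrupts for the same line are ignored.
 */
static bool note_line_error(struct line_errors *le, size_t line)
{
	assert(le && line < NUM_LINES);

	if (le->count[line] >= BAD_THRESHOLD)
		return false; /* already reported */

	return ++le->count[line] == BAD_THRESHOLD;
}
```

Reporting each line only once keeps a flaky line from triggering a reset
storm, but whether that is the right policy is an open question.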

> The only alternative I have is a huge comma separated string for a
> module parameter, but I kind of like reading the file better.

Well, you can't read a file from the kernel because we might init the
driver without any userspace present (when the driver is built-in).
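
Parsing the comma-separated string is not much work either. A minimal
sketch in plain userspace C, assuming the list is simply cacheline
indices such as "3,17,0x2a" (what actually needs to be stored is not
clear yet, as noted above). parse_bad_lines() is a hypothetical helper,
not an existing kernel function.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a comma-separated list of cacheline indices
 * (decimal, octal or hex) into out[].
 * Returns the number of entries parsed, or -1 if the string is malformed
 * or holds more than max entries. A single trailing comma is tolerated.
 */
static int parse_bad_lines(const char *s, unsigned long *out, int max)
{
	int n = 0;

	assert(s && out && max > 0);

	while (*s) {
		char *end;

		/* Each entry must start with a digit (no signs, no blanks). */
		if (n == max || !isdigit((unsigned char)*s))
			return -1;

		errno = 0;
		out[n] = strtoul(s, &end, 0);
		if (end == s || errno)
			return -1;
		n++;

		if (*end == ',')
			s = end + 1; /* skip the separator */
		else if (*end == '\0')
			s = end;     /* done */
		else
			return -1;   /* junk after the number */
	}

	return n;
}
```

An empty string yields zero entries, which would correspond to "no bad
lines known yet".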

> Any feedback is highly appreciated. I couldn't really find much
> precedent for doing this in other drivers, so pointers to similar
> things would also be highly welcome.
-Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


