[Intel-gfx] [RFC] algorithm for handling bad cachelines
daniel at ffwll.ch
Tue Mar 27 16:50:39 CEST 2012
On Tue, Mar 27, 2012 at 07:19:43AM -0700, Ben Widawsky wrote:
> I wanted to run this by folks before I start doing any actual work.
> This is primarily for GPGPU, or perhaps *really* accurate rendering.
> IVB+ has an interrupt to tell us when a cacheline seems to be going bad.
> There is also a mechanism to remap the bad cachelines. The
> implementation details aren't quite clear to me yet, but I'd like to
> enable this feature for userspace.
> Here is my current plan, but it involves filesystem access, so it's
> probably going to get a lot of flames.
> 1. Handle cache line going bad interrupt.
> <After n number of these interrupts to the same line,>
> 2. send a uevent
> 2.5 reset the GPU (docs tell us to)
> <On module load>
> 3. Read a module parameter with a path in the filesystem
> of the list of bad lines. It's not clear to me yet exactly what I need
> to store, but it should be a relatively simple list.
... a path in the filesystem is a no-go for a kernel interface. So the bad
cachelines need to go into the module parameter itself. Or we add a sysfs
interface and reset the gpu (because if my understanding is right, we can't
disable cachelines once the gpu has used them).
> 4. Parse list on driver load, and handle as necessary.
> 5. goto 1.
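Steps 1 and 2 of the quoted plan (count interrupts per line, act after n of
them) could be sketched roughly as below. This is a userspace illustration
only, not driver code: NUM_LINES, BAD_THRESHOLD, struct line_tracker and
line_error() are made-up names, and both constants are placeholders, since
the thread gives neither n nor the number of lines.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_LINES     1024  /* hypothetical number of tracked cachelines */
#define BAD_THRESHOLD 3     /* hypothetical "n" from the quoted plan */

/* Per-line count of "cacheline going bad" events seen so far. */
struct line_tracker {
	unsigned int count[NUM_LINES];
};

/*
 * Record one error event for `line`.
 *
 * Returns true exactly once per line, at the moment its count reaches
 * BAD_THRESHOLD. That is the point where the plan would send the uevent
 * and reset the GPU. Out-of-range lines and lines that were already
 * reported return false.
 */
bool line_error(struct line_tracker *t, size_t line)
{
	if (line >= NUM_LINES)
		return false;
	if (t->count[line] >= BAD_THRESHOLD)
		return false;	/* already reported, do not count further */
	return ++t->count[line] == BAD_THRESHOLD;
}
```

With a zero-initialized tracker, the third call for the same line returns
true and every later call for that line returns false, so the uevent would
fire once per line rather than on every interrupt.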
> Probably the biggest unanswered question is exactly when in the HW
> loading do we have to finish remapping. If it can happen at any time
> while the card is running, I don't need the filesystem stuff, but I
> believe I need to remap the lines quite early in the device bootstrap.
I believe so, too ;-)
> The only alternative I have is a huge comma separated string for a
> module parameter, but I kind of like reading the file better.
Well, you can't read a file from the kernel because we might init the
driver without any userspace present (when the driver is built-in).
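To make the comma-separated-string option concrete, here is a minimal sketch
of parsing such a list. It is plain userspace C, and it assumes each entry is
a decimal cacheline index, which the thread leaves open ("It's not clear to me
yet exactly what I need to store"). parse_bad_lines() is a hypothetical name.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a comma-separated list of unsigned decimal integers, for example
 * "12,407,1033", into out[].
 *
 * Returns the number of entries parsed (0 for an empty string), or -1 if
 * the input is malformed, a value overflows, or more than `max` entries
 * are present.
 */
int parse_bad_lines(const char *s, unsigned long *out, size_t max)
{
	size_t n = 0;

	if (*s == '\0')
		return 0;	/* empty list: nothing to remap */

	for (;;) {
		char *end;
		unsigned long v;

		if (n == max)
			return -1;	/* more entries than the caller can hold */
		/* strtoul() accepts leading spaces and signs; reject them here. */
		if (*s < '0' || *s > '9')
			return -1;
		errno = 0;
		v = strtoul(s, &end, 10);
		if (errno == ERANGE)
			return -1;
		out[n++] = v;
		if (*end == '\0')
			return (int)n;
		if (*end != ',')
			return -1;
		s = end + 1;	/* skip the comma, parse the next entry */
	}
}
```

For instance, parse_bad_lines("12,407,1033", buf, 4) stores three entries and
returns 3, while "1,,2" and a trailing comma are rejected with -1. In the
kernel itself, module_param_array() already splits comma-separated values, so
a real patch would probably lean on that instead of hand-rolled parsing.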
> Any feedback is highly appreciated. I couldn't really find much
> precedent for doing this in other drivers, so pointers to similar
> things would also be highly welcome.
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48