[Intel-gfx] [PATCH 15/16] intel_l3_parity: Support error injection

Daniel Vetter daniel at ffwll.ch
Fri Sep 13 18:14:38 CEST 2013


On Fri, Sep 13, 2013 at 5:54 PM, Ben Widawsky <ben at bwidawsk.net> wrote:
> On Fri, Sep 13, 2013 at 11:12:11AM +0200, Daniel Vetter wrote:
>> On Thu, Sep 12, 2013 at 10:28:41PM -0700, Ben Widawsky wrote:
>> > Haswell added the ability to inject errors which is extremely useful for
>> > testing. Add two arguments to the tool to inject, and uninject.
>> >
>> > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
>>
>> Do we run any risk that a concurrent write/read to the same register range
>> could hang the machine due to the same-cacheline w/a we need? Just want to
>> make sure that when we integrate this into a testcase there are no surprises
>> like with intel_gpu_top ...
>> -Daniel
>
> The race against the kernel is ever present on all tests/tools. Are we
> running parallel igt yet? If so, I can make the read/write functions
> threadsafe.
>
> On this note in particular I suppose we can make a debugfs entry like
> the forcewake one to allow user space to do register accesses.
>
> Interestingly, this also reminds me of another caveat I meant to put in
> the commit message and forgot... the error injection register is also
> per context, which makes it a pain to clear (and a pain when writing the
> test case). I'm even beginning to think maybe a debugfs for this
> register is the way to go.
>
> As a side note, the injection feature is entirely debug only - but
> agreed, random hangs in the test suite are not good.

Hm, this will be tricky. If nothing else writes this range (in
particular, not our interrupt handler), we could use a secure
batchbuffer and emit the MI_LRI (MI_LOAD_REGISTER_IMM) from the
userspace batch. Then we could submit some workload using hw contexts
that exercises the L3 cache (I guess without something in there the
injected error won't be noticed), and once the error is detected we
could simply kill the context, which restores the original state.
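
To make the MI_LRI idea concrete, here is a rough sketch of the batch
contents, not a tested implementation. The command encodings follow the
kernel's MI_INSTR() convention; the helper name emit_lri_batch() is
hypothetical, and the register offset and value are left to the caller
because the actual injection register is not spelled out in this thread.
The batch would still have to be submitted with I915_EXEC_SECURE on a hw
context, which this sketch does not cover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Command encodings, following the i915 MI_INSTR() convention. */
#define MI_INSTR(opcode, flags)  (((uint32_t)(opcode) << 23) | (uint32_t)(flags))
#define MI_BATCH_BUFFER_END      MI_INSTR(0x0a, 0)
#define MI_LOAD_REGISTER_IMM(n)  MI_INSTR(0x22, 2 * (n) - 1)

/*
 * Hypothetical helper: fill `batch` with a single-register MI_LRI that
 * writes `value` to the MMIO register at offset `reg`, then terminate
 * the batch. Returns the number of dwords written (4, so the batch
 * length stays a multiple of 8 bytes).
 */
size_t emit_lri_batch(uint32_t *batch, size_t max_dwords,
                      uint32_t reg, uint32_t value)
{
    size_t n = 0;

    assert(batch != NULL);
    assert(max_dwords >= 4);   /* header + offset + value + end */
    assert((reg & 3) == 0);    /* register offsets are dword aligned */

    batch[n++] = MI_LOAD_REGISTER_IMM(1); /* one (offset, value) pair */
    batch[n++] = reg;
    batch[n++] = value;
    batch[n++] = MI_BATCH_BUFFER_END;

    return n;
}
```

A test would copy these dwords into a buffer object and execute it on the
context that later runs the L3 workload. Destroying that context
afterwards should drop the per-context register state along with it.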
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


