[Intel-gfx] [PATCH 15/21] drm/i915/gtt: Fill scratch page
Mika Kuoppala
mika.kuoppala at linux.intel.com
Thu Jun 11 09:37:04 PDT 2015
Tomas Elf <tomas.elf at intel.com> writes:
> On 22/05/2015 18:05, Mika Kuoppala wrote:
>> During review of dynamic page tables series, I was able
>> to hit a lite restore bug with execlists. I assume that
>> due to an incorrect pd, the batch ran out of legit address space
>> and into the scratch page area. The ACTHD was increasing
>> due to scratch being all zeroes (MI_NOOPs). And as gen8
>> address space is quite large, the hangcheck happily waited
>> for a long long time, keeping the process effectively stuck.
>>
>> According to Chris Wilson, any modern gpu will grind to a halt
>> if it encounters commands of all ones. This seemed to do the
>> trick, and a hang was declared promptly when the gpu wandered into
>> the scratch land.
>>
>> v2: Use 0xffff00ff pattern (Chris)
>
> Just for my own benefit:
>
> 1. Is there any particular reason for this pattern rather than 0xffffffff?
>
> 2. Someone please correct me if I'm wrong here but at least based on my
> own experiences with gen9 submitting batch buffers filled with bad
> instructions (0xffffffff) to the GPU does not hang it. I'm guessing that
> is because there's allegedly a hardware security parser that MI_NOOPs
> out invalid instructions during execution. If that's the case here then
> I guess we might have to come up with something else for gen9+ if we
> want to induce engine hangs once the execution reaches the scratch page?
>
If that is the case with gen9, then we need more duct tape. For
example, we could always increase busyness in hangcheck (a little), so
that a hang is eventually declared even though no loops are detected.
But with this patch and gen < 9, the execution grinds to a halt and
I get a hang in a 5 second window.
-Mika
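The "always increase busyness" idea in the reply above can be sketched
roughly as follows. This is a standalone illustration, not the i915
hangcheck code: the struct, the function name, and the score values are
all hypothetical. The point is only that if every sample where ACTHD
moves without a request completing adds a small penalty, a batch
wandering through NOOPs eventually crosses the hang threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values, chosen for illustration only. */
#define SKETCH_SCORE_BUSY   1  /* ACTHD moved, but no request completed */
#define SKETCH_SCORE_HUNG  20  /* ACTHD did not move at all */
#define SKETCH_SCORE_LIMIT 31  /* declare a hang at or above this score */

/* Hypothetical per-engine state kept between samples. */
struct sketch_hangcheck {
	uint64_t last_acthd;
	uint32_t last_seqno;
	int score;
};

/*
 * Take one periodic sample of an engine.
 * Returns true once the accumulated score says the engine is hung.
 */
static bool sketch_hangcheck_sample(struct sketch_hangcheck *hc,
				    uint64_t acthd, uint32_t seqno)
{
	assert(hc);

	if (seqno != hc->last_seqno)
		hc->score = 0;                   /* real progress: reset */
	else if (acthd != hc->last_acthd)
		hc->score += SKETCH_SCORE_BUSY;  /* busy, but suspicious */
	else
		hc->score += SKETCH_SCORE_HUNG;  /* completely stuck */

	hc->last_acthd = acthd;
	hc->last_seqno = seqno;

	return hc->score >= SKETCH_SCORE_LIMIT;
}
```

With these illustrative values, an engine whose ACTHD keeps advancing
while its seqno never changes is declared hung on the 31st sample,
rather than being left to walk the whole address space.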
> On the other hand, on gen9+ page faulting is supposedly not broken
> anymore so maybe we don't need the scratch page to begin with there so
> maybe it's all moot at that point? Again, if I'm making no sense here
> feel free to set things straight, I'm very curious about how all of this
> is supposed to work.
>
> Thanks,
> Tomas
>
>>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index 43fa543..a2a0c88 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -2168,6 +2168,8 @@ void i915_global_gtt_cleanup(struct drm_device *dev)
>> vm->cleanup(vm);
>> }
>>
>> +#define SCRATCH_PAGE_MAGIC 0xffff00ffffff00ffULL
>> +
>> static int alloc_scratch_page(struct i915_address_space *vm)
>> {
>> struct i915_page_scratch *sp;
>> @@ -2185,6 +2187,7 @@ static int alloc_scratch_page(struct i915_address_space *vm)
>> return ret;
>> }
>>
>> + fill_px(vm->dev, sp, SCRATCH_PAGE_MAGIC);
>> set_pages_uc(px_page(sp), 1);
>>
>> vm->scratch_page = sp;
>>
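As a rough userspace illustration of the fill in the hunk above, the
sketch below assumes that fill_px() writes the given 64-bit value into
every qword of the page (the quoted diff does not show its body). The
helper names and the 4096-byte page size are assumptions made for this
example. It shows why the 64-bit SCRATCH_PAGE_MAGIC matches the "v2"
note: each 32-bit dword of the page reads back as 0xffff00ff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   4096u                 /* assumed page size */
#define SCRATCH_PAGE_MAGIC 0xffff00ffffff00ffULL /* value from the patch */

/* Hypothetical stand-in for fill_px(): write val into every qword. */
static void sketch_fill_page(uint64_t *page, uint64_t val)
{
	size_t i;

	assert(page);
	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(uint64_t); i++)
		page[i] = val;
}

/* Read the idx-th 32-bit dword of the page. */
static uint32_t sketch_read_dword(const uint64_t *page, size_t idx)
{
	uint32_t dw;

	assert(page);
	memcpy(&dw, (const unsigned char *)page + idx * sizeof(dw), sizeof(dw));
	return dw;
}
```

Because both 32-bit halves of the magic value are identical, the result
does not depend on byte order: every dword in the filled page is
0xffff00ff instead of 0 (MI_NOOP).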