<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<br>
<br>
<div class="moz-cite-prefix">On 19.10.2022 11:14, Matthew Auld
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:ca42bc29-ef8c-cb36-a8f7-897c7baee0ca@intel.com">On
19/10/2022 10:12, Matthew Auld wrote: <br>
<blockquote type="cite">On 19/10/2022 08:12, Andrzej Hajda wrote:
<br>
<blockquote type="cite">Instruction prefetch mechanism requires
that 512 bytes after the last <br>
command should be readable by EU. Otherwise DMAR errors and
engine <br>
hangs can happen. <br>
<br>
Closes: <a class="moz-txt-link-freetext"
href="https://gitlab.freedesktop.org/drm/intel/-/issues/5278">https://gitlab.freedesktop.org/drm/intel/-/issues/5278</a>
<br>
Signed-off-by: Andrzej Hajda <a class="moz-txt-link-rfc2396E"
href="mailto:andrzej.hajda@intel.com">&lt;andrzej.hajda@intel.com&gt;</a>
<br>
</blockquote>
<br>
Is there a Bspec ref for this? I would have assumed that EU was
more about kernels/shaders, than simple MI commands? Also should
we be hitting dmar errors for ppGTT if this were some kind of
overfetch? AFAICT we always point entries back to scratch,
unlike with say the GGTT where we might have stale entries, and
unbinding should flush the tlb? <br>
</blockquote>
<br>
s/unbinding/put_pages/ <br>
</blockquote>
<br>
The Bspec reference is here [1]. Given the distinction you draw between
simple MI commands and kernels/shaders, I am no longer sure it applies to
this case, so let me present the findings that led me to this conclusion:<br>
<br>
My findings (on RaptorLake):<br>
1. The DMAR errors always report the physical address of a recently removed
bb created by igt_emit_store_dw, at least in my tests.<br>
2. intel_iommu enqueues a TLB flush during put_pages of this bb, but the
actual flush happens later, triggered by a timer.<br>
3. Together with the DMAR errors, GuC reports a CAT error on the
context/engine executing this batch (with IPEHR=MI_BATCH_BUFFER_END).<br>
4. The errors happen only on vcs/vecs (???).<br>
5. The errors happen only when the tested huge page has size SZ_2M -
SZ_64K or SZ_2M - SZ_4K. In both cases the calculated size of the bb (8 KB)
leaves just a few dwords after the last cmd; in the other cases there is
much more padding.<br>
6. Enlarging the bb works (as in this patch).<br>
7. Flushing the IOMMU TLB for the phys address of the bb just before calling
dma_unmap_sg (in i915_gem_gtt_finish_pages) helps as well :)<br>
8. There is already a workaround present in
i915_gem_gtt_finish_pages:<br>
<blockquote type="cite">
<pre>	/* XXX This does not prevent more requests being submitted! */
	if (unlikely(ggtt-&gt;do_idle_maps))
		/* Wait a bit, in the hope it avoids the hang */
		usleep_range(100, 250);</pre>
</blockquote>
It is only implemented for Gen5 and is slow, but it also works
(probably because the TLB gets flushed in the meantime).<br>
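<br>
To make points 5 and 6 concrete, here is a rough sketch of the size
calculation in igt_emit_store_dw before and after the patch. The constant
and helper names (PAGE_SIZE_BYTES, PREFETCH_BYTES, bb_size_old and so on)
are made up for illustration and are not driver identifiers. The layout of
four dwords per store plus one MI_BATCH_BUFFER_END is my reading of the
(4 * count + 1) formula, so treat it as an assumption:<br>
<pre>
```c
/*
 * Sketch only: mirrors the size formula from the patch, outside the kernel.
 * All names below are hypothetical, not i915 identifiers.
 */
#define PAGE_SIZE_BYTES 4096u
#define PREFETCH_BYTES  512u   /* bytes that should stay readable after the last cmd */
#define DWORD_BYTES     4u

/* Round x up to the next multiple of align. */
static unsigned int round_up_to(unsigned int x, unsigned int align)
{
	return ((x + align - 1) / align) * align;
}

/* Bytes occupied by commands (assumed: 4 dwords per store, 1 for the batch end). */
static unsigned int cmd_bytes(unsigned int count)
{
	return (4 * count + 1) * DWORD_BYTES;
}

/* Object size before the patch: commands rounded up to a page. */
static unsigned int bb_size_old(unsigned int count)
{
	return round_up_to(cmd_bytes(count), PAGE_SIZE_BYTES);
}

/* Object size with the patch: 512 extra bytes are reserved before rounding. */
static unsigned int bb_size_patched(unsigned int count)
{
	return round_up_to(cmd_bytes(count) + PREFETCH_BYTES, PAGE_SIZE_BYTES);
}

/* Bytes left in the object after the last command. */
static unsigned int tail_bytes(unsigned int size, unsigned int count)
{
	return size - cmd_bytes(count);
}
```
</pre>
As an example, a count of 511 (a value I picked only because it lands next
to a page boundary) gives 8180 bytes of commands. The old formula yields an
8192-byte object with 12 bytes (3 dwords) after the last cmd, while the
patched formula yields 12288 bytes, so the tail is always at least 512
bytes.<br>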
<br>
[1]: <a class="moz-txt-link-freetext" href="https://gfxspecs.intel.com/Predator/Home/Index/47286">https://gfxspecs.intel.com/Predator/Home/Index/47286</a><br>
<br>
Regards<br>
Andrzej<br>
<br>
<blockquote type="cite"
cite="mid:ca42bc29-ef8c-cb36-a8f7-897c7baee0ca@intel.com"> <br>
<blockquote type="cite"> <br>
<blockquote type="cite">--- <br>
drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c | 2 +- <br>
1 file changed, 1 insertion(+), 1 deletion(-) <br>
<br>
diff --git
a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c <br>
index 3c55e77b0f1b00..fe999a02f8e10a 100644 <br>
--- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c <br>
+++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c <br>
@@ -50,7 +50,7 @@ igt_emit_store_dw(struct i915_vma *vma, <br>
u32 *cmd; <br>
int err; <br>
- size = (4 * count + 1) * sizeof(u32); <br>
+ size = (4 * count + 1) * sizeof(u32) + 512; <br>
size = round_up(size, PAGE_SIZE); <br>
obj =
i915_gem_object_create_internal(vma->vm->i915, size); <br>
if (IS_ERR(obj)) <br>
</blockquote>
</blockquote>
</blockquote>
<br>
</body>
</html>