<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>BTW, this also seems to be what breaks suspend/resume.<br>
</p>
<p><br>
</p>
<p>Andrey<br>
</p>
<br>
<div class="moz-cite-prefix">On 09/21/2018 01:56 PM, Andrey
Grodzovsky wrote:<br>
</div>
<blockquote type="cite"
cite="mid:681ddd4e-6bd2-db28-4286-2cc577d0f00a@amd.com">
<p>No worries, I will just revert locally until then to clear the
extra errors during my investigation of current GPU reset status
and issues.</p>
<p><br>
</p>
<p>Andrey<br>
</p>
<br>
<div class="moz-cite-prefix">On 09/21/2018 01:53 PM, Christian
König wrote:<br>
</div>
<blockquote type="cite"
cite="mid:04944e7b-044b-4b16-3d2f-e760eedcee9a@gmail.com">
<div class="moz-cite-prefix">I unfortunately don't have a
Polaris to test this myself.<br>
<br>
But please give me until Monday so that I can at least try
one more thing to fix it.<br>
<br>
Christian.<br>
<br>
On 21.09.2018 at 19:11, Andrey Grodzovsky wrote:<br>
</div>
<blockquote type="cite"
cite="mid:c81338de-5fc7-3be3-961a-bba0eba05351@amd.com">
<p>Ping...</p>
<p><br>
</p>
<p>Andrey<br>
</p>
<br>
<div class="moz-cite-prefix">On 09/20/2018 04:35 PM, Andrey
Grodzovsky wrote:<br>
</div>
<blockquote type="cite"
cite="mid:4afeb01c-37e9-ca76-8055-5dd15fca98d3@amd.com">
<p>What's the status of this error and the suggested patch
to fix it? It impacts GPU reset on Polaris11.</p>
<p>Do we want to investigate why the original patch breaks
it, or just disable the KIQ IB test with the proposed patch?</p>
<p><br>
</p>
<p>P.S. Suspend/resume also stopped working on the latest branch.
I will bisect it later today or tomorrow.</p>
<p><br>
</p>
<p>Andrey<br>
</p>
<br>
<div class="moz-cite-prefix">On 09/18/2018 11:00 AM,
Christian König wrote:<br>
</div>
<blockquote type="cite"
cite="mid:edd44be9-2ef3-3c39-3342-5d3b4bbfa40a@amd.com">
<div class="moz-cite-prefix">Tom,<br>
<br>
can you check whether the following makes it work again?<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c<br>
index b6160de70d12..d65f5ba92fc5 100644<br>
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c<br>
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c<br>
@@ -937,6 +937,10 @@ static int
gfx_v8_0_ring_test_ib(struct amdgpu_ring *ring, long
timeout)<br>
return r;<br>
}<br>
<br>
+static int gfx_v8_0_kiq_ring_test_ib(struct amdgpu_ring
*ring, long timeout)<br>
+{<br>
+ return 0;<br>
+}<br>
<br>
static void gfx_v8_0_free_microcode(struct
amdgpu_device *adev)<br>
{<br>
@@ -7174,7 +7178,7 @@ static const struct
amdgpu_ring_funcs gfx_v8_0_ring_funcs_kiq = {<br>
.emit_ib = gfx_v8_0_ring_emit_ib_compute,<br>
.emit_fence = gfx_v8_0_ring_emit_fence_kiq,<br>
.test_ring = gfx_v8_0_ring_test_ring,<br>
- .test_ib = gfx_v8_0_ring_test_ib,<br>
+ .test_ib = gfx_v8_0_kiq_ring_test_ib,<br>
.insert_nop = amdgpu_ring_insert_nop,<br>
.pad_ib = amdgpu_ring_generic_pad_ib,<br>
.emit_rreg = gfx_v8_0_ring_emit_rreg,<br>
<br>
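A minimal user-space sketch of the idea behind the patch above, assuming a much simplified model: the ring type, the callback table and every function name here are hypothetical stand-ins, not the real amdgpu structures. It only illustrates how pointing the KIQ table at a stub makes the IB test report success while the other rings keep the real test.<br>
<br>
```c
#include <errno.h>

struct ring; /* simplified stand-in for struct amdgpu_ring (assumption) */

/* Per-ring callback table, loosely modelled on amdgpu_ring_funcs. */
struct ring_funcs {
    int (*test_ib)(struct ring *ring, long timeout);
};

struct ring {
    const char *name;
    const struct ring_funcs *funcs;
};

/* Models an IB test whose completion interrupt never arrives. */
static int generic_test_ib(struct ring *ring, long timeout)
{
    (void)ring;
    (void)timeout;
    return -ETIMEDOUT; /* -110 on Linux, the value seen in the reported log */
}

/* KIQ-specific stub: reports success without running any test. */
static int kiq_test_ib(struct ring *ring, long timeout)
{
    (void)ring;
    (void)timeout;
    return 0;
}

/* Hypothetical tables: only the KIQ one uses the stub. */
static const struct ring_funcs compute_funcs = { .test_ib = generic_test_ib };
static const struct ring_funcs kiq_funcs = { .test_ib = kiq_test_ib };

/* Dispatch through the table, as a caller of the ring callbacks would. */
static int run_ib_test(struct ring *ring, long timeout)
{
    return ring->funcs->test_ib(ring, timeout);
}
```
<br>
In this model, calling run_ib_test() on a ring that uses kiq_funcs always returns 0, so the stub hides the failure for that ring only and does not explain why the test times out.<br>
<br>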
<br>
Thanks,<br>
Christian.<br>
<br>
On 18.09.2018 at 16:41, Christian König wrote:<br>
</div>
<blockquote type="cite"
cite="mid:4a250398-d2ac-1650-739d-e4a6598f1c48@gmail.com">
<div class="moz-cite-prefix">CRTC and GFX interrupts
seem to be working perfectly fine.<br>
<br>
It looks like the problem here is that only EOP interrupts from
the compute queue are not handled correctly.<br>
<br>
Most likely a bug somewhere in gfx_v8_0_eop_irq().<br>
<br>
Christian.<br>
<br>
On 18.09.2018 at 16:36, Deucher, Alexander wrote:<br>
</div>
<blockquote type="cite"
cite="mid:BN6PR12MB1809B0E02DDA1E8AACFFD1DAF71D0@BN6PR12MB1809.namprd12.prod.outlook.com">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
<div id="divtagdefaultwrapper"
style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;"
dir="ltr">
<p style="margin-top:0;margin-bottom:0">FWIW, a
number of consumer Raven boards have bad IVRS
tables (Windows doesn't use interrupt remapping, so
the tables are sometimes wrong and probably not
validated). There are a number of workarounds to
manually override the IVRS tables to make
interrupts work. I think specifying pci=noacpi is
also a possible workaround.</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">Alex<br>
</p>
</div>
<hr style="display:inline-block;width:98%"
tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font
style="font-size:11pt" face="Calibri, sans-serif"
color="#000000"><b>From:</b> amd-gfx <a
class="moz-txt-link-rfc2396E"
href="mailto:amd-gfx-bounces@lists.freedesktop.org"
moz-do-not-send="true"><amd-gfx-bounces@lists.freedesktop.org></a>
on behalf of Christian König <a
class="moz-txt-link-rfc2396E"
href="mailto:christian.koenig@amd.com"
moz-do-not-send="true"><christian.koenig@amd.com></a><br>
<b>Sent:</b> Tuesday, September 18, 2018 10:31:16
AM<br>
<b>To:</b> StDenis, Tom; amd-gfx mailing list;
Zhou, David(ChunMing)<br>
<b>Subject:</b> Re: Regression on gfx8 with ring
init</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span
style="font-size:11pt;">
<div class="PlainText">Well, it looks like interrupt
processing is working perfectly fine.<br>
<br>
But looking at the error message once more I
see that this actually <br>
affects ring number 9 and not the GFX ring.<br>
<br>
Can you fix amdgpu_ib_ring_tests() to print
ring->name instead of the <br>
number?<br>
<br>
That must be one of the compute rings.<br>
<br>
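A rough sketch of what printing the name instead of the number could look like, written as a stand-alone user-space helper. The struct, the helper name format_ib_failure() and the example ring name are assumptions made for illustration; this is not the actual amdgpu_ib_ring_tests() code.<br>
<br>
```c
#include <stdio.h>

/* Simplified stand-in for struct amdgpu_ring (assumption). */
struct ring {
    unsigned int idx;   /* numeric index, e.g. 9 in the reported log */
    const char *name;   /* human-readable ring name */
};

/*
 * Format the failure message with the ring name rather than its index.
 * Returns the number of characters snprintf() would have written.
 */
static int format_ib_failure(char *buf, size_t len,
                             const struct ring *ring, int err)
{
    return snprintf(buf, len, "failed testing IB on %s (%d).",
                    ring->name, err);
}
```
<br>
With a ring named, say, "comp_1.0.0" (a made-up name for this example), the message identifies the affected ring directly instead of requiring the reader to map index 9 back to a ring.<br>
<br>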
Thanks,<br>
Christian.<br>
<br>
On 18.09.2018 at 16:20, Tom St Denis wrote:<br>
> On 2018-09-18 10:13 a.m., Christian König
wrote:<br>
>> Mhm, there is no failed IB test
in there anymore, is there?<br>
><br>
> oh sorry I thought you wanted to test
HEAD~ ... Attached is a log from <br>
> the tip of drm-next<br>
><br>
> Tom<br>
><br>
>><br>
>> Christian.<br>
>><br>
>> On 18.09.2018 at 16:09, Tom St
Denis wrote:<br>
>>> Disabling IOMMU in the BIOS
resulted in a correct boot up...<br>
>>><br>
>>> Here's the log.<br>
>>><br>
>>> Tom<br>
>>><br>
>>> On 2018-09-18 9:58 a.m., Tom St
Denis wrote:<br>
>>>> Odd, I couldn't even boot my
system with the dGPU as primary after <br>
>>>> rebuilding the kernel. It
got hung up in the IOMMU driver (loads <br>
>>>> of AMD-Vi IOMMU errors), which
I wasn't able to capture because it <br>
>>>> panicked before loading the
network stack.<br>
>>>><br>
>>>> Bizarre.<br>
>>>><br>
>>>> I'll keep trying.<br>
>>>><br>
>>>> Tom<br>
>>>><br>
>>>> On 2018-09-18 9:35 a.m.,
Christian König wrote:<br>
>>>>> On 18.09.2018 at 15:32,
Tom St Denis wrote:<br>
>>>>>> On 2018-09-18 9:30
a.m., Christian König wrote:<br>
>>>>>>> Great, not sure
if that is good or bad news.<br>
>>>>>>><br>
>>>>>>> Anyway, going to
revert the change for now. Does anybody <br>
>>>>>>> volunteer to
figure out why interrupts sometimes don't
work <br>
>>>>>>> correctly on
Raven?<br>
>>>>>><br>
>>>>>> What does "doesn't
work correctly" mean? My workstation is a Raven1 <br>
>>>>>> (Ryzen 2400G) and
other than the TTM bulk move issue has been <br>
>>>>>> perfectly stable
(through suspend/resumes too I might add).<br>
>>>>>><br>
>>>>>> Anything I could test
with my devel raven?<br>
>>>>><br>
>>>>> The problem seems to be
that on some boards IH handling doesn't <br>
>>>>> work as it should.<br>
>>>>><br>
>>>>> Can you try to disable
the onboard graphics and try again?<br>
>>>>><br>
>>>>> If that still doesn't
work there is a DRM_DEBUG in <br>
>>>>> amdgpu_ih_process(), make
that a DRM_ERROR and send me the <br>
>>>>> resulting dmesg of
loading amdgpu (but don't start any UMD).<br>
>>>>><br>
>>>>> Thanks,<br>
>>>>> Christian.<br>
>>>>><br>
>>>>>><br>
>>>>>><br>
>>>>>> Tom<br>
>>>>>><br>
>>>>>>><br>
>>>>>>> Christian.<br>
>>>>>>><br>
>>>>>>> On 18.09.2018 at
15:27, Tom St Denis wrote:<br>
>>>>>>>> This commit:<br>
>>>>>>>><br>
>>>>>>>> [root@raven
linux]# git bisect good<br>
>>>>>>>>
9b0df0937a852d299fbe42a5939c9a8a4cc83c55 is
the first bad commit<br>
>>>>>>>> commit
9b0df0937a852d299fbe42a5939c9a8a4cc83c55<br>
>>>>>>>> Author:
Christian König <a
class="moz-txt-link-rfc2396E"
href="mailto:christian.koenig@amd.com"
moz-do-not-send="true"><christian.koenig@amd.com></a><br>
>>>>>>>> Date: Tue
Sep 18 10:38:09 2018 +0200<br>
>>>>>>>><br>
>>>>>>>>
drm/amdgpu: remove fence fallback<br>
>>>>>>>><br>
>>>>>>>> DC
doesn't seem to have a fallback path either.<br>
>>>>>>>><br>
>>>>>>>> So when
interrupts doesn't work any more we are pretty
much <br>
>>>>>>>> busted no<br>
>>>>>>>> matter
what.<br>
>>>>>>>><br>
>>>>>>>>
Signed-off-by: Christian König <a
class="moz-txt-link-rfc2396E"
href="mailto:christian.koenig@amd.com"
moz-do-not-send="true"><christian.koenig@amd.com></a><br>
>>>>>>>>
Reviewed-by: Chunming Zhou <a
class="moz-txt-link-rfc2396E"
href="mailto:david1.zhou@amd.com"
moz-do-not-send="true"><david1.zhou@amd.com></a><br>
>>>>>>>><br>
>>>>>>>> Results in
this:<br>
>>>>>>>><br>
>>>>>>>> [
24.334025] [drm] Initialized amdgpu 3.27.0
20150101 for <br>
>>>>>>>> 0000:07:00.0
on minor 1<br>
>>>>>>>> [
24.335674] modprobe (3895) used greatest stack
depth: 12600 <br>
>>>>>>>> bytes left<br>
>>>>>>>> [
26.272358] [drm:gfx_v8_0_ring_test_ib
[amdgpu]] *ERROR* <br>
>>>>>>>> amdgpu: IB
test timed out.<br>
>>>>>>>> [
26.272460] [drm:amdgpu_ib_ring_tests [amdgpu]]
*ERROR* <br>
>>>>>>>> amdgpu:
failed testing IB on ring 9 (-110).<br>
>>>>>>>> [
26.407885] [drm:process_one_work] *ERROR* ib
ring test <br>
>>>>>>>> failed
(-110).<br>
>>>>>>>> [
28.506708] fuse init (API version 7.27)<br>
>>>>>>>><br>
>>>>>>>> On init with
my polaris/raven1 system.<br>
>>>>>>>><br>
>>>>>>>> Cheers,<br>
>>>>>>>> Tom<br>
>>>>>>>>
_______________________________________________<br>
>>>>>>>> amd-gfx
mailing list<br>
>>>>>>>> <a
class="moz-txt-link-abbreviated"
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
>>>>>>>> <a
href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><br>
>>>>>>><br>
>>>>>><br>
>>>>><br>
>>>><br>
>>><br>
>><br>
><br>
<br>
_______________________________________________<br>
amd-gfx mailing list<br>
<a class="moz-txt-link-abbreviated"
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
<a
href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><br>
</div>
</span></font></div>
<br>
</blockquote>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>