<html data-lt-installed="true">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body spellcheck="false" data-gramm="false">
<p>Hi Andrey,</p>
<p>I don't have any XGMI machines here; maybe you can reach out
to Shaoyun for help.<br>
</p>
<div class="moz-cite-prefix">On 2022/1/29 12:57 AM, Grodzovsky,
Andrey wrote:<br>
</div>
<blockquote type="cite"
cite="mid:DM5PR12MB19474AEFB824C4C97DCD7AABEA229@DM5PR12MB1947.namprd12.prod.outlook.com">
<div>Just a gentle ping.</div>
<div><br>
</div>
<div>Andrey</div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
face="Calibri, sans-serif" color="#000000"><b>From:</b>
Grodzovsky, Andrey<br>
<b>Sent:</b> 26 January 2022 10:52<br>
<b>To:</b> Christian König
<a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com">&lt;ckoenig.leichtzumerken@gmail.com&gt;</a>; Koenig, Christian
<a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com">&lt;Christian.Koenig@amd.com&gt;</a>; Lazar, Lijo
<a class="moz-txt-link-rfc2396E" href="mailto:Lijo.Lazar@amd.com">&lt;Lijo.Lazar@amd.com&gt;</a>; <a class="moz-txt-link-abbreviated" href="mailto:dri-devel@lists.freedesktop.org">dri-devel@lists.freedesktop.org</a>
<a class="moz-txt-link-rfc2396E" href="mailto:dri-devel@lists.freedesktop.org">&lt;dri-devel@lists.freedesktop.org&gt;</a>;
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org">&lt;amd-gfx@lists.freedesktop.org&gt;</a>; Chen, JingWen
<a class="moz-txt-link-rfc2396E" href="mailto:JingWen.Chen2@amd.com">&lt;JingWen.Chen2@amd.com&gt;</a><br>
<b>Cc:</b> Chen, Horace <a class="moz-txt-link-rfc2396E" href="mailto:Horace.Chen@amd.com">&lt;Horace.Chen@amd.com&gt;</a>; Liu, Monk
<a class="moz-txt-link-rfc2396E" href="mailto:Monk.Liu@amd.com">&lt;Monk.Liu@amd.com&gt;</a><br>
<b>Subject:</b> Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR
gpu recovery with TDRs</font>
<div> </div>
</div>
<div>
<p>JingWen - could you maybe give those patches a try on an SRIOV
XGMI system? If you see issues, maybe you could let me connect
and debug. My SRIOV XGMI system, which Shaoyun kindly arranged
for me, is not loading the driver with my drm-misc-next branch,
even without my patches.<br>
</p>
<p>Andrey<br>
</p>
<div class="x_moz-cite-prefix">On 2022-01-17 14:21, Andrey
Grodzovsky wrote:<br>
</div>
<blockquote type="cite">
<p><br>
</p>
<div class="x_moz-cite-prefix">On 2022-01-17 2:17 p.m.,
Christian König wrote:<br>
</div>
<blockquote type="cite">On 17.01.22 at 20:14, Andrey
Grodzovsky wrote:<br>
<blockquote type="cite">
<p>Ping on the question</p>
</blockquote>
<br>
Oh, my! That was already more than a week ago and is
completely swapped out of my head again.<br>
<br>
<blockquote type="cite">
<p>Andrey<br>
</p>
<div class="x_moz-cite-prefix">On 2022-01-05 1:11 p.m.,
Andrey Grodzovsky wrote:<br>
</div>
<blockquote type="cite">
<blockquote type="cite" style="color:#007cff">
<blockquote type="cite" style="color:#007cff">Also,
what about having the reset_active or in_reset flag
in the reset_domain itself?
<br>
</blockquote>
<br>
Offhand, that sounds like a good idea. <br>
</blockquote>
<br>
<br>
What then about the adev->reset_sem semaphore?
Should we also move this to reset_domain? Both of the
moves have functional
<br>
implications only for the XGMI case, because there will be
contention over accessing those single-instance
variables from multiple devices,
<br>
whereas now each device has its own copy. <br>
</blockquote>
</blockquote>
<br>
Since this is an rw semaphore, that should be unproblematic, I
think. The only downside could be that the cache line of the lock
then ping-pongs between the CPU cores.<br>
<br>
<blockquote type="cite">
<blockquote type="cite"><br>
What benefit does the centralization into reset_domain give?
Is it, for example, to prevent one device in a hive
from trying to access, through MMIO, another one's
<br>
VRAM (shared FB memory) while the other one goes through
reset? <br>
</blockquote>
</blockquote>
<br>
I think that this is the killer argument for a centralized
lock, yes.<br>
</blockquote>
<p><br>
</p>
<p>No problem, I will add a patch centralizing both (the flag and
the semaphore) into the reset domain and resend.</p>
<p>Andrey</p>
<p><br>
</p>
<blockquote type="cite"><br>
Christian.<br>
<br>
<blockquote type="cite">
<blockquote type="cite"><br>
Andrey </blockquote>
</blockquote>
<br>
</blockquote>
</blockquote>
</div>
</blockquote>
</body>
</html>