<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 2023-06-01 17:17, Felix Kuehling
wrote:<br>
</div>
<blockquote type="cite" cite="mid:04b12a87-6672-3241-456e-f6947f09de27@amd.com">On
2023-06-01 16:47, James Zhu wrote:
<br>
<blockquote type="cite">Don't sleep when event age unmatch, and
update last_event_age.
<br>
It is only for KFD_EVENT_TYPE_SIGNAL which is checked by user
space.
<br>
<br>
Signed-off-by: James Zhu <a class="moz-txt-link-rfc2396E" href="mailto:James.Zhu@amd.com"><James.Zhu@amd.com></a>
<br>
---
<br>
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 13 +++++++++++++
<br>
1 file changed, 13 insertions(+)
<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
<br>
index b27a79c5f826..23e154811471 100644
<br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
<br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
<br>
@@ -964,6 +964,19 @@ int kfd_wait_on_events(struct kfd_process
*p,
<br>
event_data.event_id);
<br>
if (ret)
<br>
goto out_unlock;
<br>
+
<br>
+ /* last_event_age = 0 reserved for backward compatible
*/
<br>
+ if (event_data.last_event_age &&
<br>
+ event_waiters[i].event->event_age !=
event_data.last_event_age) {
<br>
+ event_data.last_event_age =
event_waiters[i].event->event_age;
<br>
+ WRITE_ONCE(event_waiters[i].activated, true);
<br>
</blockquote>
<br>
I think this is probably not necessary if you're returning before
the first call to test_event_condition.
<br>
</blockquote>
<p>[JZ] Currently, the returning is with test_event_condition. Since
some conditions needs return with all events signaled.</p>
<p>so all waiters need check and update if any event interrupts are
missing here(unmatched event age). <br>
</p>
<blockquote type="cite" cite="mid:04b12a87-6672-3241-456e-f6947f09de27@amd.com">
<br>
<br>
<blockquote type="cite">+
<br>
+ if (copy_to_user(&events[i], &event_data,
<br>
+ sizeof(struct kfd_event_data))) {
<br>
+ ret = -EFAULT;
<br>
+ goto out_unlock;
<br>
+ }
<br>
+ }
<br>
</blockquote>
<br>
When waiting on multiple events, it would be more efficient to
catch all events with outdated age in a single system call,
instead of returning after finding the first one. Then return
after the loop if it found one or more outdated events.</blockquote>
[JZ] Yes, the code is working in this way, when all events' waiters
are activated, the following test_event_condition ==
KFD_IOC_WAIT_RESULT_COMPLETE, then return to user mode without
sleep.<br>
<blockquote type="cite" cite="mid:04b12a87-6672-3241-456e-f6947f09de27@amd.com">
<br>
<br>
Also EFAULT is the wrong error code. It means "bad address". I
think we could return 0 here because there is really no error.
It's just a normal condition to allow user mode to update its
event information before going to sleep. If you want a non-0
return code, I'd recommend -EDEADLK, because sleeping without
outdated event information can lead to deadlocks. We'd also need
to translate that into a meaningful status code in the Thunk to
let ROCr know what's going on. I don't see a suitable status code
in the current Thunk API for this.
<br>
</blockquote>
[JZ] Basically, this failure is for copy_to_user. It will lead to
busy wait instead of deadlock.<br>
<blockquote type="cite" cite="mid:04b12a87-6672-3241-456e-f6947f09de27@amd.com"><b>
</b><br>
Regards,
<br>
Felix
<br>
<br>
<br>
<blockquote type="cite"> }
<br>
/* Check condition once. */
<br>
</blockquote>
</blockquote>
</body>
</html>