<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 2022-01-14 5:37 p.m., Felix Kuehling
      wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:47b948f0-c004-b3d1-d5f9-0cbf1db1f21e@amd.com">
      <pre class="moz-quote-pre" wrap="">Am 2022-01-14 um 3:38 p.m. schrieb Philip Yang:
</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">After GPU page fault is recovered, output timestamp when fault is
received, duration to recover the fault, if migration or only
GPU page table is updated, fault address, read or write fault.

Signed-off-by: Philip Yang <a class="moz-txt-link-rfc2396E" href="mailto:Philip.Yang@amd.com"><Philip.Yang@amd.com></a>
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 41 +++++++++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h |  3 ++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c        | 12 ++++--
 3 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index 5818ea8ad4ce..6ed3d85348d6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -265,6 +265,47 @@ void kfd_smi_event_update_vmfault(struct kfd_dev *dev, uint16_t pasid)
        add_event_to_kfifo(task_info.pid, dev, KFD_SMI_EVENT_VMFAULT, fifo_in, len);
 }
 
+static bool kfd_smi_event_pid_duration(struct kfd_dev *dev, uint16_t pasid,
+                                      pid_t *pid, uint64_t ts,
+                                      uint64_t *duration)
+{
+       struct amdgpu_task_info task_info = {0};
+
+       if (list_empty(&dev->smi_clients))
+               return false;
+
+       amdgpu_vm_get_task_info(dev->adev, pasid, &task_info);
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
The caller (svm_range_restore_pages) already knows the kfd_process. It
should be able to provide the pid directly from p->lead_thread.pid. No
need to look that up from the pasid and vm task info.</pre>
    </blockquote>
    yes, I can use p->lead_thread.pid for other events as well.<br>
    <blockquote type="cite" cite="mid:47b948f0-c004-b3d1-d5f9-0cbf1db1f21e@amd.com">
      <pre class="moz-quote-pre" wrap="">


</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">+  if (!task_info.pid) {
+               pr_debug("task is gone\n");
+               return false;
+       }
+       if (pid)
+               *pid = task_info.pid;
+       if (duration)
+               *duration = ktime_get_ns() - ts;
+       return true;
+}
+
+void kfd_smi_event_page_fault(struct kfd_dev *dev, uint16_t pasid,
+                             unsigned long address, bool migration,
+                             bool write_fault, uint64_t ts)
+{
+       char fifo_in[128];
+       uint64_t duration;
+       pid_t pid;
+       int len;
+
+       if (!kfd_smi_event_pid_duration(dev, pasid, &pid, ts, &duration))
+               return;
+
+       len = snprintf(fifo_in, sizeof(fifo_in), "%d ts=%lld duration=%lld"
+                      " pid=%d pfn=0x%lx write=%d migration=%d\n",
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
Please don't break the string over several lines. I believe that would
also trigger a checkpatch warning.</pre>
    </blockquote>
    It does trigger checkpatch warning, although I saw other places
    using multiple lines format, but this is not good to grep string by
    line, I will short it and use one line format.<br>
    <blockquote type="cite" cite="mid:47b948f0-c004-b3d1-d5f9-0cbf1db1f21e@amd.com">
      <pre class="moz-quote-pre" wrap="">


</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">+                 KFD_SMI_EVENT_PAGE_FAULT, ts, duration, pid, address,
+                      write_fault, migration);
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
The existing events use %x for all numbers, including event number and
pid. Maybe better to stick with that convention for consistency. At
least for the event number.
</pre>
    </blockquote>
    Will output event number as hex.<br>
    <blockquote type="cite" cite="mid:47b948f0-c004-b3d1-d5f9-0cbf1db1f21e@amd.com">
      <pre class="moz-quote-pre" wrap="">
The existing events seems to favor a very compact layout. I could think
of ways to make this event more compact as well (still using decimal for
some things for readability):

"%x %d(%d)-%d @%x %c%c", KFD_SMI_EVENT_PAGE_FAULT, ts, duration, pid,
address, write ? 'W' : 'w', migration ? 'M' : 'm'

</pre>
    </blockquote>
    good idea to short the format and message.<br>
    <blockquote type="cite" cite="mid:47b948f0-c004-b3d1-d5f9-0cbf1db1f21e@amd.com">
      <pre class="moz-quote-pre" wrap="">
</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">+
+       add_event_to_kfifo(pid, dev, KFD_SMI_EVENT_PAGE_FAULT, fifo_in, len);
+}
+
 int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 {
        struct kfd_smi_client *client;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
index bffd0c32b060..fa3a8fdad69f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
@@ -28,5 +28,8 @@ void kfd_smi_event_update_vmfault(struct kfd_dev *dev, uint16_t pasid);
 void kfd_smi_event_update_thermal_throttling(struct kfd_dev *dev,
                                             uint64_t throttle_bitmask);
 void kfd_smi_event_update_gpu_reset(struct kfd_dev *dev, bool post_reset);
+void kfd_smi_event_page_fault(struct kfd_dev *dev, uint16_t pasid,
+                             unsigned long address, bool migration,
+                             bool write_fault, uint64_t ts);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 37b3191615b6..b81667162dc1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -32,6 +32,7 @@
 #include "kfd_priv.h"
 #include "kfd_svm.h"
 #include "kfd_migrate.h"
+#include "kfd_smi_events.h"
 
 #ifdef dev_fmt
 #undef dev_fmt
@@ -2657,11 +2658,12 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
        struct svm_range_list *svms;
        struct svm_range *prange;
        struct kfd_process *p;
-       uint64_t timestamp;
+       uint64_t timestamp = ktime_get_ns();
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
kfd_ioctl_get_clock_counters uses ktime_get_boottime_ns for the
system_clock_counter. We should use the same here (and in the duration
calculation) to make it possible to correlate timestamps from different
KFD APIs.
</pre>
    </blockquote>
    ok<br>
    <blockquote type="cite" cite="mid:47b948f0-c004-b3d1-d5f9-0cbf1db1f21e@amd.com">
      <pre class="moz-quote-pre" wrap="">

</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">   int32_t best_loc;
        int32_t gpuidx = MAX_GPU_INSTANCE;
        bool write_locked = false;
        struct vm_area_struct *vma;
+       bool migration = false;
        int r = 0;
 
        if (!KFD_IS_SVM_API_SUPPORTED(adev->kfd.dev)) {
@@ -2737,9 +2739,9 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
                goto out_unlock_range;
        }
 
-       timestamp = ktime_to_us(ktime_get()) - prange->validate_timestamp;
        /* skip duplicate vm fault on different pages of same range */
-       if (timestamp < AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING) {
+       if ((ktime_to_us(timestamp) -  prange->validate_timestamp) <
+           AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING) {
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
ktime_to_us takes a ktime_t. But timestamp is just a time in
nanoseconds. I think the correct conversion is just div_u64(timestamp,
1000).</pre>
    </blockquote>
    <p>prange->validate_timestamp will change to
      ktime_get_boottime_ns as well.</p>
    <p>Thanks,</p>
    <p>Philip<br>
    </p>
    <blockquote type="cite" cite="mid:47b948f0-c004-b3d1-d5f9-0cbf1db1f21e@amd.com">
      <pre class="moz-quote-pre" wrap="">

Regards,
  Felix


</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">           pr_debug("svms 0x%p [0x%lx %lx] already restored\n",
                         svms, prange->start, prange->last);
                r = 0;
@@ -2776,6 +2778,7 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
                 prange->actual_loc);
 
        if (prange->actual_loc != best_loc) {
+               migration = true;
                if (best_loc) {
                        r = svm_migrate_to_vram(prange, best_loc, mm);
                        if (r) {
@@ -2804,6 +2807,9 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
                pr_debug("failed %d to map svms 0x%p [0x%lx 0x%lx] to gpus\n",
                         r, svms, prange->start, prange->last);
 
+       kfd_smi_event_page_fault(adev->kfd.dev, p->pasid, addr, migration,
+                                write_fault, timestamp);
+
 out_unlock_range:
        mutex_unlock(&prange->migrate_mutex);
 out_unlock_svms:
</pre>
      </blockquote>
    </blockquote>
  </body>
</html>