<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 1/19/21 8:45 AM, Daniel Vetter
      wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:YAbilAl8g0d9s7vz@phenom.ffwll.local">
      <pre class="moz-quote-pre" wrap="">On Tue, Jan 19, 2021 at 09:48:03AM +0100, Christian König wrote:
</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">Am 18.01.21 um 22:01 schrieb Andrey Grodzovsky:
</pre>
        <blockquote type="cite">
          <pre class="moz-quote-pre" wrap="">Handle all DMA IOMMU gropup related dependencies before the
group is removed.

Signed-off-by: Andrey Grodzovsky <a class="moz-txt-link-rfc2396E" href="mailto:andrey.grodzovsky@amd.com"><andrey.grodzovsky@amd.com></a>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  5 ++++
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 46 ++++++++++++++++++++++++++++++
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
  6 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 478a7d8..2953420 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -51,6 +51,7 @@
  #include <linux/dma-fence.h>
  #include <linux/pci.h>
  #include <linux/aer.h>
+#include <linux/notifier.h>
  #include <drm/ttm/ttm_bo_api.h>
  #include <drm/ttm/ttm_bo_driver.h>
@@ -1041,6 +1042,10 @@ struct amdgpu_device {
        bool                            in_pci_err_recovery;
        struct pci_saved_state          *pci_state;
+
+       struct notifier_block           nb;
+       struct blocking_notifier_head   notifier;
+       struct list_head                device_bo_list;
  };
  static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 45e23e3..e99f4f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -70,6 +70,8 @@
  #include <drm/task_barrier.h>
  #include <linux/pm_runtime.h>
+#include <linux/iommu.h>
+
  MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
  MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
  MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -3200,6 +3202,39 @@ static const struct attribute *amdgpu_dev_attributes[] = {
  };
+static int amdgpu_iommu_group_notifier(struct notifier_block *nb,
+                                    unsigned long action, void *data)
+{
+       struct amdgpu_device *adev = container_of(nb, struct amdgpu_device, nb);
+       struct amdgpu_bo *bo = NULL;
+
+       /*
+        * Following is a set of IOMMU group dependencies taken care of before
+        * device's IOMMU group is removed
+        */
+       if (action == IOMMU_GROUP_NOTIFY_DEL_DEVICE) {
+
+               spin_lock(&ttm_bo_glob.lru_lock);
+               list_for_each_entry(bo, &adev->device_bo_list, bo) {
+                       if (bo->tbo.ttm)
+                               ttm_tt_unpopulate(bo->tbo.bdev, bo->tbo.ttm);
+               }
+               spin_unlock(&ttm_bo_glob.lru_lock);
</pre>
        </blockquote>
        <pre class="moz-quote-pre" wrap="">
That approach won't work. ttm_tt_unpopulate() might sleep on an IOMMU lock.

You need to use a mutex here or even better make sure you can access the
device_bo_list without a lock in this moment.
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
I'd also be worried about the notifier mutex getting really badly in the
way.

Plus I'm worried why we even need this, it sounds a bit like papering over
the iommu subsystem. Assuming we clean up all our iommu mappings in our
device hotunplug/unload code, why do we still need to have an additional
iommu notifier on top, with all kinds of additional headaches? The iommu
shouldn't clean up before the devices in its group have cleaned up.

I think we need more info here on what the exact problem is first.
-Daniel</pre>
    </blockquote>
    <p><br>
      Originally I experienced the  crash bellow on IOMMU enabled
      device, it happens post device removal from PCI topology - <br>
      during shutting down of user client holding last reference to drm
      device file (X in my case). <br>
      The crash is because by the time I get to this point struct
      device->iommu_group pointer is NULL <br>
      already since the IOMMU group for the device is unset during PCI
      removal. So this contradicts what you said above<br>
      that the iommu shouldn't clean up before the devices in its group
      have cleaned up.<br>
      So instead of guessing when is the right place to place all IOMMU
      related cleanups it makes sense<br>
      to get notification from IOMMU subsystem in the form of event
      IOMMU_GROUP_NOTIFY_DEL_DEVICE<br>
      and use that place to do all the relevant cleanups.</p>
    <p>Andrey<br>
    </p>
    <p>
      <br>
      [  123.810074 <   28.126960>] BUG: kernel NULL pointer
      dereference, address: 00000000000000c8
      <br>
      [  123.810080 <    0.000006>] #PF: supervisor read access in
      kernel mode
      <br>
      [  123.810082 <    0.000002>] #PF: error_code(0x0000) -
      not-present page
      <br>
      [  123.810085 <    0.000003>] PGD 0 P4D 0
      <br>
      [  123.810089 <    0.000004>] Oops: 0000 [#1] SMP NOPTI
      <br>
      [  123.810094 <    0.000005>] CPU: 5 PID: 1418 Comm:
      Xorg:shlo4 Tainted: G           O      5.9.0-rc2-dev+ #59
      <br>
      [  123.810096 <    0.000002>] Hardware name: System
      manufacturer System Product Name/PRIME X470-PRO, BIOS 4406
      02/28/2019
      <br>
      [  123.810105 <    0.000009>] <b>RIP:
        0010:iommu_get_dma_domain</b>+0x10/0x20
      <br>
      [  123.810108 <    0.000003>] Code: b0 48 c7 87 98 00 00 00
      00 00 00 00 31 c0 c3 b8 f4 ff ff ff eb a6 0f 1f 40 00 0f 1f 44 00
      00 48 8b 87 d0 02 00 00 55 48 89 e5 <48> 8b 80 c8 00 00 00
      5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48
      <br>
      [  123.810111 <    0.000003>] RSP: 0018:ffffa2e201f7f980
      EFLAGS: 00010246
      <br>
      [  123.810114 <    0.000003>] RAX: 0000000000000000 RBX:
      0000000000001000 RCX: 0000000000000000
      <br>
      [  123.810116 <    0.000002>] RDX: 0000000000001000 RSI:
      00000000bf5cb000 RDI: ffff93c259dc60b0
      <br>
      [  123.810117 <    0.000001>] RBP: ffffa2e201f7f980 R08:
      0000000000000000 R09: 0000000000000000
      <br>
      [  123.810119 <    0.000002>] R10: ffffa2e201f7faf0 R11:
      0000000000000001 R12: 00000000bf5cb000
      <br>
      [  123.810121 <    0.000002>] R13: 0000000000001000 R14:
      ffff93c24cef9c50 R15: ffff93c256c05688
      <br>
      [  123.810124 <    0.000003>] FS:  00007f5e5e8d3700(0000)
      GS:ffff93c25e940000(0000) knlGS:0000000000000000
      <br>
      [  123.810126 <    0.000002>] CS:  0010 DS: 0000 ES: 0000
      CR0: 0000000080050033
      <br>
      [  123.810128 <    0.000002>] CR2: 00000000000000c8 CR3:
      000000027fe0a000 CR4: 00000000003506e0
      <br>
      [  123.810130 <    0.000002>] Call Trace:
      <br>
      [  123.810136 <    0.000006>]  __iommu_dma_unmap+0x2e/0x100
      <br>
      [  123.810141 <    0.000005>]  ? kfree+0x389/0x3a0
      <br>
      [  123.810144 <    0.000003>]  iommu_dma_unmap_page+0xe/0x10
      <br>
      [  123.810149 <    0.000005>] dma_unmap_page_attrs+0x4d/0xf0
      <br>
      [  123.810159 <    0.000010>]  ?
      ttm_bo_del_from_lru+0x8e/0xb0 [ttm]
      <br>
      [  123.810165 <    0.000006>]
      ttm_unmap_and_unpopulate_pages+0x8e/0xc0 [ttm]
      <br>
      [  123.810252 <    0.000087>]
      amdgpu_ttm_tt_unpopulate+0xaa/0xd0 [amdgpu]
      <br>
      [  123.810258 <    0.000006>]  ttm_tt_unpopulate+0x59/0x70
      [ttm]
      <br>
      [  123.810264 <    0.000006>]  ttm_tt_destroy+0x6a/0x70
      [ttm]
      <br>
      [  123.810270 <    0.000006>]
      ttm_bo_cleanup_memtype_use+0x36/0xa0 [ttm]
      <br>
      [  123.810276 <    0.000006>]  ttm_bo_put+0x1e7/0x400 [ttm]
      <br>
      [  123.810358 <    0.000082>]  amdgpu_bo_unref+0x1e/0x30
      [amdgpu]
      <br>
      [  123.810440 <    0.000082>]
      amdgpu_gem_object_free+0x37/0x50 [amdgpu]
      <br>
      [  123.810459 <    0.000019>]  drm_gem_object_free+0x35/0x40
      [drm]
      <br>
      [  123.810476 <    0.000017>]
      drm_gem_object_handle_put_unlocked+0x9d/0xd0 [drm]
      <br>
      [  123.810494 <    0.000018>]
      drm_gem_object_release_handle+0x74/0x90 [drm]
      <br>
      [  123.810511 <    0.000017>]  ?
      drm_gem_object_handle_put_unlocked+0xd0/0xd0 [drm]
      <br>
      [  123.810516 <    0.000005>]  idr_for_each+0x4d/0xd0
      <br>
      [  123.810534 <    0.000018>]  drm_gem_release+0x20/0x30
      [drm]
      <br>
      [  123.810550 <    0.000016>]  drm_file_free+0x251/0x2a0
      [drm]
      <br>
      [  123.810567 <    0.000017>]
      drm_close_helper.isra.14+0x61/0x70 [drm]
      <br>
      [  123.810583 <    0.000016>]  drm_release+0x6a/0xe0 [drm]
      <br>
      [  123.810588 <    0.000005>]  __fput+0xa2/0x250
      <br>
      [  123.810592 <    0.000004>]  ____fput+0xe/0x10
      <br>
      [  123.810595 <    0.000003>]  task_work_run+0x6c/0xa0
      <br>
      [  123.810600 <    0.000005>]  do_exit+0x376/0xb60
      <br>
      [  123.810604 <    0.000004>]  do_group_exit+0x43/0xa0
      <br>
      [  123.810608 <    0.000004>]  get_signal+0x18b/0x8e0
      <br>
      [  123.810612 <    0.000004>]  ? do_futex+0x595/0xc20
      <br>
      [  123.810617 <    0.000005>]  arch_do_signal+0x34/0x880
      <br>
      [  123.810620 <    0.000003>]  ?
      check_preempt_curr+0x50/0x60
      <br>
      [  123.810623 <    0.000003>]  ? ttwu_do_wakeup+0x1e/0x160
      <br>
      [  123.810626 <    0.000003>]  ? ttwu_do_activate+0x61/0x70
      <br>
      [  123.810630 <    0.000004>]
      exit_to_user_mode_prepare+0x124/0x1b0
      <br>
      [  123.810635 <    0.000005>]
      syscall_exit_to_user_mode+0x31/0x170
      <br>
      [  123.810639 <    0.000004>]  do_syscall_64+0x43/0x80 <br>
    </p>
    <p><br>
    </p>
    <p>Andrey</p>
    <p><br>
    </p>
    <blockquote type="cite" cite="mid:YAbilAl8g0d9s7vz@phenom.ffwll.local">
      <pre class="moz-quote-pre" wrap="">

</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">
Christian.

</pre>
        <blockquote type="cite">
          <pre class="moz-quote-pre" wrap="">+
+               if (adev->irq.ih.use_bus_addr)
+                       amdgpu_ih_ring_fini(adev, &adev->irq.ih);
+               if (adev->irq.ih1.use_bus_addr)
+                       amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
+               if (adev->irq.ih2.use_bus_addr)
+                       amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
+
+               amdgpu_gart_dummy_page_fini(adev);
+       }
+
+       return NOTIFY_OK;
+}
+
+
  /**
   * amdgpu_device_init - initialize the driver
   *
@@ -3304,6 +3339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
        INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
+       INIT_LIST_HEAD(&adev->device_bo_list);
+
        adev->gfx.gfx_off_req_count = 1;
        adev->pm.ac_power = power_supply_is_system_supplied() > 0;
@@ -3575,6 +3612,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
        if (amdgpu_device_cache_pci_state(adev->pdev))
                pci_restore_state(pdev);
+       BLOCKING_INIT_NOTIFIER_HEAD(&adev->notifier);
+       adev->nb.notifier_call = amdgpu_iommu_group_notifier;
+
+       if (adev->dev->iommu_group) {
+               r = iommu_group_register_notifier(adev->dev->iommu_group, &adev->nb);
+               if (r)
+                       goto failed;
+       }
+
        return 0;
  failed:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 0db9330..486ad6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
   *
   * Frees the dummy page used by the driver (all asics).
   */
-static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
+void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
  {
        if (!adev->dummy_page_addr)
                return;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
index afa2e28..5678d9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
@@ -61,6 +61,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
  void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
  int amdgpu_gart_init(struct amdgpu_device *adev);
  void amdgpu_gart_fini(struct amdgpu_device *adev);
+void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
  int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
                       int pages);
  int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6cc9919..4a1de69 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -94,6 +94,10 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
        }
        amdgpu_bo_unref(&bo->parent);
+       spin_lock(&ttm_bo_glob.lru_lock);
+       list_del(&bo->bo);
+       spin_unlock(&ttm_bo_glob.lru_lock);
+
        kfree(bo->metadata);
        kfree(bo);
  }
@@ -613,6 +617,12 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
        if (bp->type == ttm_bo_type_device)
                bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+       INIT_LIST_HEAD(&bo->bo);
+
+       spin_lock(&ttm_bo_glob.lru_lock);
+       list_add_tail(&bo->bo, &adev->device_bo_list);
+       spin_unlock(&ttm_bo_glob.lru_lock);
+
        return 0;
  fail_unreserve:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 9ac3756..5ae8555 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -110,6 +110,8 @@ struct amdgpu_bo {
        struct list_head                shadow_list;
        struct kgd_mem                  *kfd_bo;
+
+       struct list_head                bo;
  };
  static inline struct amdgpu_bo *ttm_to_amdgpu_bo(struct ttm_buffer_object *tbo)
</pre>
        </blockquote>
        <pre class="moz-quote-pre" wrap="">
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
</pre>
    </blockquote>
  </body>
</html>