[PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

Tom St Denis tom.stdenis at amd.com
Tue Aug 1 14:26:08 UTC 2017


On 01/08/17 10:10 AM, Christian König wrote:

> You need to cover multiple code path here:
> 1. The one you currently implemented which uses ttm_dma_populate() and 
> pci_map_page().
> 2. The one using ttm_dma_populate().

I'll have to look into this one though it's my understanding that code 
path is only followed if there's no (hardware) IOMMU right?  Which while 
it'd have to be covered isn't a high priority right now (our devel 
platforms have hardware IOMMU).

None the less I'll look into it once I figure out #3 and #4 per your 
comments.

> 3. The one using drm_prime_sg_to_page_addr_arrays() a bit above for 
> imported BOs.

This one seems rather straight forward but the only catch is I don't 
have access to "adev" inside that drm function.  Would it be taboo to do 
something like

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 376078334f54..cd97ef2144c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
                 drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
                                                  gtt->ttm.dma_address, 
ttm->num_pages);
                 ttm->state = tt_unbound;
+               for (i = 0; i < ttm->num_pages; i++) {
+                       trace_amdgpu_ttm_tt_populate(
+                               adev,
+                               gtt->ttm.dma_address[i],
+                               page_to_phys(ttm->pages[i]));
+               }
                 return 0;
         }

This would normally result in a for loop around a nop which shouldn't be 
a huge performance hit but is one none the less.


> 4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for userptrs.
> 
> Basically all just take gtt->ttm.ttm.pages and fill gtt->ttm.dma_address.

Same comment as #3.  I can tackle this with a for loop around the trace 
if that is ok.

> I suggest to add a helper you can call to trace all pages->dma_address 
> mappings inside a ttm.

One thing at a time :-).  That would probably be a bit better since the 
trace log gets filled with remappings (which is why my proof-of-concept 
umr patch only grabs the latest mapping as it reads the trace).

Is there a handy place we can grab a list of ttm_tt's bound to our 
device?  If so then in theory I could write a debugfs interface instead.

Thanks for your patience as I'm definitely less familiar with the VM 
side of things :-)

Cheers,
Tom


More information about the amd-gfx mailing list