[PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping

axie axie at amd.com
Tue Aug 1 15:00:54 UTC 2017


Because this is used by a debug tool, can we use a macro to 
conditionally compile this feature?


On 2017-08-01 10:54 AM, Christian König wrote:
> Am 01.08.2017 um 16:26 schrieb Tom St Denis:
>> On 01/08/17 10:10 AM, Christian König wrote:
>>
>>> You need to cover multiple code path here:
>>> 1. The one you currently implemented which uses ttm_dma_populate() 
>>> and pci_map_page().
>>> 2. The one using ttm_dma_populate().
>>
>> I'll have to look into this one though it's my understanding that 
>> code path is only followed if there's no (hardware) IOMMU right?
>
> Wrong, that one is used when "anything" in the system used the SWIOTLB 
> path once. So the detection doesn't always work reliable.
>
>> This one seems rather straight forward but the only catch is I don't 
>> have access to "adev" inside that drm function.
>
> When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to 
> get the adev.
>
>>   Would it be taboo to do something like
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index 376078334f54..cd97ef2144c9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt 
>> *ttm)
>>                 drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
>> gtt->ttm.dma_address, ttm->num_pages);
>>                 ttm->state = tt_unbound;
>> +               for (i = 0; i < ttm->num_pages; i++) {
>> +                       trace_amdgpu_ttm_tt_populate(
>> +                               adev,
>> +                               gtt->ttm.dma_address[i],
>> +                               page_to_phys(ttm->pages[i]));
>> +               }
>>                 return 0;
>>         }
>>
>> This would normally result in a for loop around a nop which shouldn't 
>> be a huge performance hit but is one none the less.
>
> Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those 
> functions are generated by the trace subsystem automatically when you 
> define a trace point.
>
> It just doesn't use those nice tricks to modify the compiled binary on 
> the fly.
>
> Christian.
>
>>
>>> 4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for 
>>> userptrs.
>>>
>>> Basically all just take gtt->ttm.ttm.pages and fill 
>>> gtt->ttm.dma_address.
>>
>> Same comment as #3.  I can tackle this with a for loop around the 
>> trace if that is ok.
>>
>>> I suggest to add a helper you can call to trace all 
>>> pages->dma_address mappings inside a ttm.
>>
>> One thing at a time :-).  That would probably be a bit better since 
>> the trace log gets filled with remappings (which is why my 
>> proof-of-concept umr patch only grabs the latest mapping as it reads 
>> the trace).
>>
>> Is there a handy place we can grab a list of ttm_tt's bound to our 
>> device?  If so then in theory I could write a debugfs interface instead.
>>
>> Thanks for your patience as I'm definitely less familiar with the VM 
>> side of things :-)
>>
>> Cheers,
>> Tom
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx



More information about the amd-gfx mailing list