[PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping
axie
axie at amd.com
Tue Aug 1 15:00:54 UTC 2017
Because this is used by a debug tool, can we use a macro to
conditionally compile this feature?
On 2017-08-01 10:54 AM, Christian König wrote:
> Am 01.08.2017 um 16:26 schrieb Tom St Denis:
>> On 01/08/17 10:10 AM, Christian König wrote:
>>
>>> You need to cover multiple code path here:
>>> 1. The one you currently implemented which uses ttm_dma_populate()
>>> and pci_map_page().
>>> 2. The one using ttm_dma_populate().
>>
>> I'll have to look into this one though it's my understanding that
>> code path is only followed if there's no (hardware) IOMMU right?
>
> Wrong, that one is used when "anything" in the system used the SWIOTLB
> path once. So the detection doesn't always work reliable.
>
>> This one seems rather straight forward but the only catch is I don't
>> have access to "adev" inside that drm function.
>
> When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to
> get the adev.
>
>> Would it be taboo to do something like
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index 376078334f54..cd97ef2144c9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt
>> *ttm)
>> drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
>> gtt->ttm.dma_address, ttm->num_pages);
>> ttm->state = tt_unbound;
>> + for (i = 0; i < ttm->num_pages; i++) {
>> + trace_amdgpu_ttm_tt_populate(
>> + adev,
>> + gtt->ttm.dma_address[i],
>> + page_to_phys(ttm->pages[i]));
>> + }
>> return 0;
>> }
>>
>> This would normally result in a for loop around a nop which shouldn't
>> be a huge performance hit but is one none the less.
>
> Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those
> functions are generated by the trace subsystem automatically when you
> define a trace point.
>
> It just doesn't use those nice tricks to modify the compiled binary on
> the fly.
>
> Christian.
>
>>
>>> 4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for
>>> userptrs.
>>>
>>> Basically all just take gtt->ttm.ttm.pages and fill
>>> gtt->ttm.dma_address.
>>
>> Same comment as #3. I can tackle this with a for loop around the
>> trace if that is ok.
>>
>>> I suggest to add a helper you can call to trace all
>>> pages->dma_address mappings inside a ttm.
>>
>> One thing at a time :-). That would probably be a bit better since
>> the trace log gets filled with remappings (which is why my
>> proof-of-concept umr patch only grabs the latest mapping as it reads
>> the trace).
>>
>> Is there a handy place we can grab a list of ttm_tt's bound to our
>> device? If so then in theory I could write a debugfs interface instead.
>>
>> Thanks for your patience as I'm definitely less familiar with the VM
>> side of things :-)
>>
>> Cheers,
>> Tom
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
More information about the amd-gfx
mailing list