[PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping
Christian König
deathsimple at vodafone.de
Tue Aug 1 15:41:59 UTC 2017
We can turn of the trace subsystem during compile time already.
Additional to that the trace points use self modifying code to make sure
that they don't have any overhead as long as they are turned off. But I
don't think this works with the "if (trace_*_enabled()" as well.
Anyway, not a performance critical code path so I don't really bother.
Christian.
Am 01.08.2017 um 17:00 schrieb axie:
> Because this is used by a debug tool, can we use a macro to
> conditionally compile this feature?
>
>
> On 2017-08-01 10:54 AM, Christian König wrote:
>> Am 01.08.2017 um 16:26 schrieb Tom St Denis:
>>> On 01/08/17 10:10 AM, Christian König wrote:
>>>
>>>> You need to cover multiple code path here:
>>>> 1. The one you currently implemented which uses ttm_dma_populate()
>>>> and pci_map_page().
>>>> 2. The one using ttm_dma_populate().
>>>
>>> I'll have to look into this one though it's my understanding that
>>> code path is only followed if there's no (hardware) IOMMU right?
>>
>> Wrong, that one is used when "anything" in the system used the
>> SWIOTLB path once. So the detection doesn't always work reliable.
>>
>>> This one seems rather straight forward but the only catch is I don't
>>> have access to "adev" inside that drm function.
>>
>> When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to
>> get the adev.
>>
>>> Would it be taboo to do something like
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> index 376078334f54..cd97ef2144c9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> @@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt
>>> *ttm)
>>> drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
>>> gtt->ttm.dma_address, ttm->num_pages);
>>> ttm->state = tt_unbound;
>>> + for (i = 0; i < ttm->num_pages; i++) {
>>> + trace_amdgpu_ttm_tt_populate(
>>> + adev,
>>> + gtt->ttm.dma_address[i],
>>> + page_to_phys(ttm->pages[i]));
>>> + }
>>> return 0;
>>> }
>>>
>>> This would normally result in a for loop around a nop which
>>> shouldn't be a huge performance hit but is one none the less.
>>
>> Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those
>> functions are generated by the trace subsystem automatically when you
>> define a trace point.
>>
>> It just doesn't use those nice tricks to modify the compiled binary
>> on the fly.
>>
>> Christian.
>>
>>>
>>>> 4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for
>>>> userptrs.
>>>>
>>>> Basically all just take gtt->ttm.ttm.pages and fill
>>>> gtt->ttm.dma_address.
>>>
>>> Same comment as #3. I can tackle this with a for loop around the
>>> trace if that is ok.
>>>
>>>> I suggest to add a helper you can call to trace all
>>>> pages->dma_address mappings inside a ttm.
>>>
>>> One thing at a time :-). That would probably be a bit better since
>>> the trace log gets filled with remappings (which is why my
>>> proof-of-concept umr patch only grabs the latest mapping as it reads
>>> the trace).
>>>
>>> Is there a handy place we can grab a list of ttm_tt's bound to our
>>> device? If so then in theory I could write a debugfs interface
>>> instead.
>>>
>>> Thanks for your patience as I'm definitely less familiar with the VM
>>> side of things :-)
>>>
>>> Cheers,
>>> Tom
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
More information about the amd-gfx
mailing list