[PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping
Tom St Denis
tom.stdenis at amd.com
Tue Aug 1 16:26:22 UTC 2017
In v2 of the patch I used the _enabled() function around the blocks so
the for loop is only reached if the trace is enabled.
Tom
On 01/08/17 11:56 AM, Xie, AlexBin wrote:
> I don't know if compiler is smart enough to optimize the following for
> statement out...
>
>
> + for (i = 0; i < ttm->num_pages; i++) {
> + trace_amdgpu_ttm_tt_populate(
> + adev,
> + gtt->ttm.dma_address[i],
> + page_to_phys(ttm->pages[i]));
> + }
>
>
> -Alex Bin
>
>
> ------------------------------------------------------------------------
> *From:* Christian König <deathsimple at vodafone.de>
> *Sent:* Tuesday, August 1, 2017 11:41 AM
> *To:* Xie, AlexBin; amd-gfx at lists.freedesktop.org
> *Subject:* Re: [PATCH] drm/amd/amdgpu: Add tracepoint for DMA page mapping
> We can turn of the trace subsystem during compile time already.
>
> Additional to that the trace points use self modifying code to make sure
> that they don't have any overhead as long as they are turned off. But I
> don't think this works with the "if (trace_*_enabled()" as well.
>
> Anyway, not a performance critical code path so I don't really bother.
>
> Christian.
>
> Am 01.08.2017 um 17:00 schrieb axie:
>> Because this is used by a debug tool, can we use a macro to
>> conditionally compile this feature?
>>
>>
>> On 2017-08-01 10:54 AM, Christian König wrote:
>>> Am 01.08.2017 um 16:26 schrieb Tom St Denis:
>>>> On 01/08/17 10:10 AM, Christian König wrote:
>>>>
>>>>> You need to cover multiple code path here:
>>>>> 1. The one you currently implemented which uses ttm_dma_populate()
>>>>> and pci_map_page().
>>>>> 2. The one using ttm_dma_populate().
>>>>
>>>> I'll have to look into this one though it's my understanding that
>>>> code path is only followed if there's no (hardware) IOMMU right?
>>>
>>> Wrong, that one is used when "anything" in the system used the
>>> SWIOTLB path once. So the detection doesn't always work reliable.
>>>
>>>> This one seems rather straight forward but the only catch is I don't
>>>> have access to "adev" inside that drm function.
>>>
>>> When you have a ttm you can always do "amdgpu_ttm_adev(ttm->bdev)" to
>>> get the adev.
>>>
>>>> Would it be taboo to do something like
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> index 376078334f54..cd97ef2144c9 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> @@ -916,6 +916,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt
>>>> *ttm)
>>>> drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
>>>> gtt->ttm.dma_address, ttm->num_pages);
>>>> ttm->state = tt_unbound;
>>>> + for (i = 0; i < ttm->num_pages; i++) {
>>>> + trace_amdgpu_ttm_tt_populate(
>>>> + adev,
>>>> + gtt->ttm.dma_address[i],
>>>> + page_to_phys(ttm->pages[i]));
>>>> + }
>>>> return 0;
>>>> }
>>>>
>>>> This would normally result in a for loop around a nop which
>>>> shouldn't be a huge performance hit but is one none the less.
>>>
>>> Try using trace_trace_amdgpu_ttm_tt_populate_enabled(), those
>>> functions are generated by the trace subsystem automatically when you
>>> define a trace point.
>>>
>>> It just doesn't use those nice tricks to modify the compiled binary
>>> on the fly.
>>>
>>> Christian.
>>>
>>>>
>>>>> 4. The in amdgpu_ttm_tt_pin_userptr() which uses dma_map_sg() for
>>>>> userptrs.
>>>>>
>>>>> Basically all just take gtt->ttm.ttm.pages and fill
>>>>> gtt->ttm.dma_address.
>>>>
>>>> Same comment as #3. I can tackle this with a for loop around the
>>>> trace if that is ok.
>>>>
>>>>> I suggest to add a helper you can call to trace all
>>>>> pages->dma_address mappings inside a ttm.
>>>>
>>>> One thing at a time :-). That would probably be a bit better since
>>>> the trace log gets filled with remappings (which is why my
>>>> proof-of-concept umr patch only grabs the latest mapping as it reads
>>>> the trace).
>>>>
>>>> Is there a handy place we can grab a list of ttm_tt's bound to our
>>>> device? If so then in theory I could write a debugfs interface
>>>> instead.
>>>>
>>>> Thanks for your patience as I'm definitely less familiar with the VM
>>>> side of things :-)
>>>>
>>>> Cheers,
>>>> Tom
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> amd-gfx Info Page - freedesktop.org
> <https://lists.freedesktop.org/mailman/listinfo/amd-gfx>
> lists.freedesktop.org
> Subscribing to amd-gfx: Subscribe to amd-gfx by filling out the
> following form. Use of all freedesktop.org lists is subject to our Code
> of ...
>
>
>>>
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> amd-gfx Info Page - freedesktop.org
> <https://lists.freedesktop.org/mailman/listinfo/amd-gfx>
> lists.freedesktop.org
> Subscribing to amd-gfx: Subscribe to amd-gfx by filling out the
> following form. Use of all freedesktop.org lists is subject to our Code
> of ...
>
>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> amd-gfx Info Page - freedesktop.org
> <https://lists.freedesktop.org/mailman/listinfo/amd-gfx>
> lists.freedesktop.org
> Subscribing to amd-gfx: Subscribe to amd-gfx by filling out the
> following form. Use of all freedesktop.org lists is subject to our Code
> of ...
>
>
>
>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
More information about the amd-gfx
mailing list