[PATCH] drm/radeon: fix TOPDOWN handling for bo_create

Oded Gabbay oded.gabbay at amd.com
Sun Mar 15 08:07:26 PDT 2015



On 03/12/2015 11:36 AM, Christian König wrote:
> On 12.03.2015 10:30, Oded Gabbay wrote:
>>
>> On 03/12/2015 11:23 AM, Christian König wrote:
>>> On 12.03.2015 10:02, Michel Dänzer wrote:
>>>> On 12.03.2015 06:14, Alex Deucher wrote:
>>>>> On Wed, Mar 11, 2015 at 4:51 PM, Alex Deucher <alexdeucher at gmail.com> wrote:
>>>>>> On Wed, Mar 11, 2015 at 2:21 PM, Christian König
>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>> On 11.03.2015 16:44, Alex Deucher wrote:
>>>>>>>> radeon_bo_create() calls radeon_ttm_placement_from_domain()
>>>>>>>> before ttm_bo_init() is called.  radeon_ttm_placement_from_domain()
>>>>>>>> uses the ttm bo size to determine when to select top down
>>>>>>>> allocation but since the ttm bo is not initialized yet the
>>>>>>>> check is always false.
>>>>>>>>
>>>>>>>> Noticed-by: Oded Gabbay <oded.gabbay at amd.com>
>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>>>>>>> Cc: stable at vger.kernel.org
>>>>>>> And I was already wondering why the heck the BOs always made this
>>>>>>> ping/pong
>>>>>>> in memory after creation.
>>>>>>>
>>>>>>> Patch is Reviewed-by: Christian König <christian.koenig at amd.com>
>>>>>> And fixing that promptly broke VCE due to vram location requirements.
>>>>>> Updated patch attached.  Thoughts?
>>>>> And one more take to make things a bit more explicit for static kernel
>>>>> driver allocations.
>>>> struct ttm_place::lpfn is honoured even with TTM_PL_FLAG_TOPDOWN, so
>>>> latter should work with RADEON_GEM_CPU_ACCESS. It sounds like the
>>>> problem is really that some BOs are expected to be within a certain
>>>> range from the beginning of VRAM, but lpfn isn't set accordingly. It
>>>> would be better to fix that by setting lpfn directly than indirectly via
>>>> RADEON_GEM_CPU_ACCESS.
>>> Yeah, agree. We should probably try to find the root cause of this instead.
>>>
>>> As far as I know VCE has no documented limitation on where buffers are
>>> placed (unlike UVD). So this is a bit strange. Are you sure that it isn't
>>> UVD which breaks here?
>>>
>>> Regards,
>>> Christian.
>> I noticed this bug when trying to allocate very large BOs (385MB) from the
>> other side of VRAM.
>> However, even with this fix, the following scenario still fails:
>> 1. Allocate BO of 385MB on VRAM with no CPU access.
>> 2. Map it to VRAM
>> 3. Allocate second BO of 385MB on VRAM with no CPU access
>>
>> The last step fails as the ttm can't find a place to put this second BO. I
>> suspect the Top-Down thing isn't being respected at all by the
>> creation/pinning of BO.
>>
>> I think that what happens is that the first BO is pinned right after the
>> first 256 MB, instead of pinning it at the end of the VRAM.
>> Then, when trying to create the second BO, there is no room for it, as there
>> is only 256MB before the first BO, and 383MB after the first BO.
>>
>> I need to debug it further, but will probably only do that on Sunday.
>
> What is the content of radeon_vram_mm (in debugfs) after you allocated the first
> BO?
>
> The placement should be visible there pretty fine.
>
> Regards,
> Christian.
>
Here are the contents before the allocation:
root at odedg-test:/sys/kernel/debug/dri/0# cat radeon_vram_mm
0x00000000-0x00000040: 0x00000040: used
0x00000040-0x00000041: 0x00000001: used
0x00000041-0x00000042: 0x00000001: used
0x00000042-0x00000043: 0x00000001: used
0x00000043-0x00000044: 0x00000001: used
0x00000044-0x0000eab4: 0x0000ea70: free
0x0000eab4-0x0000edb4: 0x00000300: used
0x0000edb4-0x0000f3b4: 0x00000600: free
0x0000f3b4-0x0000f6b4: 0x00000300: used
0x0000f6b4-0x0000f8b4: 0x00000200: used
0x0000f8b4-0x0000fdc8: 0x00000514: used
0x0000fdc8-0x00010000: 0x00000238: used
0x00010000-0x00040000: 0x00030000: free
total: 262144, used 3984 free 258160

And here they are after the allocation of a 385MB BO (not pinned yet):

root at odedg-test:/sys/kernel/debug/dri/0# cat radeon_vram_mm
0x00000000-0x00000040: 0x00000040: used
0x00000040-0x00000041: 0x00000001: used
0x00000041-0x00000042: 0x00000001: used
0x00000042-0x00000043: 0x00000001: used
0x00000043-0x00000044: 0x00000001: used
0x00000044-0x0000eab4: 0x0000ea70: free
0x0000eab4-0x0000edb4: 0x00000300: used
0x0000edb4-0x0000edb8: 0x00000004: free
0x0000edb8-0x0000edb9: 0x00000001: used
0x0000edb9-0x0000edc1: 0x00000008: used
0x0000edc1-0x0000edc9: 0x00000008: used
0x0000edc9-0x0000edd1: 0x00000008: used
0x0000edd1-0x0000edd9: 0x00000008: used
0x0000edd9-0x0000ede1: 0x00000008: used
0x0000ede1-0x0000ede9: 0x00000008: used
0x0000ede9-0x0000edf1: 0x00000008: used
0x0000edf1-0x0000edf9: 0x00000008: used
0x0000edf9-0x0000ee01: 0x00000008: used
0x0000ee01-0x0000ee09: 0x00000008: used
0x0000ee09-0x0000ee11: 0x00000008: used
0x0000ee11-0x0000ee19: 0x00000008: used
0x0000ee19-0x0000ee21: 0x00000008: used
0x0000ee21-0x0000ee29: 0x00000008: used
0x0000ee29-0x0000ee31: 0x00000008: used
0x0000ee31-0x0000ee39: 0x00000008: used
0x0000ee39-0x0000ee41: 0x00000008: used
0x0000ee41-0x0000ee49: 0x00000008: used
0x0000ee49-0x0000ee51: 0x00000008: used
0x0000ee51-0x0000ee59: 0x00000008: used
0x0000ee59-0x0000ee61: 0x00000008: used
0x0000ee61-0x0000ee69: 0x00000008: used
0x0000ee69-0x0000ee71: 0x00000008: used
0x0000ee71-0x0000ee79: 0x00000008: used
0x0000ee79-0x0000ee81: 0x00000008: used
0x0000ee81-0x0000f3b4: 0x00000533: free
0x0000f3b4-0x0000f6b4: 0x00000300: used
0x0000f6b4-0x0000f8b4: 0x00000200: used
0x0000f8b4-0x0000fdc8: 0x00000514: used
0x0000fdc8-0x00010000: 0x00000238: used
0x00010000-0x00027f00: 0x00017f00: free
0x00027f00-0x00040000: 0x00018100: used
total: 262144, used 102745 free 159399

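So apparently TTM does take the TTM_PL_FLAG_TOPDOWN flag into account: the new
BO sits at the very end of VRAM (0x00027f00-0x00040000).
However, because the rest of the memory is fragmented, I can't allocate more
than 383MB, which is the largest remaining free hole (0x00010000-0x00027f00).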
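
To double-check the 383MB figure, here is a minimal sketch that parses the
free/used lines from the second dump above and reports the largest free hole.
It assumes the ranges in radeon_vram_mm are counted in 4 KiB pages (consistent
with "total: 262144" for 1GB of VRAM); the helper names parse_mm and
largest_free_mb are made up for illustration and are not part of any driver
tooling.

```python
import re

PAGE_SIZE = 4096  # assumption: radeon_vram_mm counts in 4 KiB pages

# Matches lines such as "0x00010000-0x00027f00: 0x00017f00: free"
NODE_RE = re.compile(r"0x([0-9a-f]+)-0x([0-9a-f]+): 0x([0-9a-f]+): (used|free)")


def parse_mm(dump):
    """Return a list of (start, end, size, state) tuples, sizes in pages."""
    nodes = []
    for line in dump.splitlines():
        m = NODE_RE.match(line.strip())
        if m:
            start, end, size = (int(g, 16) for g in m.groups()[:3])
            nodes.append((start, end, size, m.group(4)))
    return nodes


def largest_free_mb(nodes):
    """Size of the largest free hole in MB, or 0.0 if there is none."""
    free = [size for _, _, size, state in nodes if state == "free"]
    return max(free) * PAGE_SIZE / (1024 * 1024) if free else 0.0


# Free holes (plus the new BO) copied from the dump taken after the allocation.
DUMP = """\
0x00000044-0x0000eab4: 0x0000ea70: free
0x0000edb4-0x0000edb8: 0x00000004: free
0x0000ee81-0x0000f3b4: 0x00000533: free
0x00010000-0x00027f00: 0x00017f00: free
0x00027f00-0x00040000: 0x00018100: used
"""

if __name__ == "__main__":
    print(largest_free_mb(parse_mm(DUMP)))
```

With the values above, the largest hole is 0x17f00 pages, i.e. 383MB, which is
2MB short of what a second 385MB BO needs.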

I assume the contents of 0-0x10000 are taken by the graphics stack, and maybe
some of them are pinned? I ask because there is a large free hole in that range:
0x00000044-0x0000eab4: 0x0000ea70: free

This is an example where dividing the allocation into multiple BOs (of 1-2MB
each) could overcome the fragmentation issue.
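
As a rough illustration of that point, the sketch below compares one contiguous
385MB request against the same amount split into 2MB chunks, using the free
hole sizes from the second dump. It is only a back-of-the-envelope model: it
assumes 4 KiB pages, ignores alignment and eviction, and the function names are
hypothetical rather than anything from TTM or radeon.

```python
# Free hole sizes in pages, taken from the dump after the first allocation.
FREE_HOLES = [0xEA70, 0x4, 0x533, 0x17F00]

PAGES_PER_MB = 256            # assumption: 4 KiB pages
BO_PAGES = 385 * PAGES_PER_MB  # the 385MB buffer from the scenario
CHUNK_PAGES = 2 * PAGES_PER_MB  # 2MB chunks


def fits_contiguous(holes, pages):
    """True if a single hole can hold the whole request."""
    return any(hole >= pages for hole in holes)


def fits_chunked(holes, pages, chunk):
    """True if the request fits once split into chunk-sized pieces."""
    needed = -(-pages // chunk)  # ceiling division
    available = sum(hole // chunk for hole in holes)
    return available >= needed


if __name__ == "__main__":
    print("contiguous:", fits_contiguous(FREE_HOLES, BO_PAGES))
    print("chunked:   ", fits_chunked(FREE_HOLES, BO_PAGES, CHUNK_PAGES))
```

In this model the contiguous request fails because no single hole reaches
0x18100 pages, while the chunked request succeeds because the holes below and
above 0x10000 together hold far more than the 193 chunks required.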


	Oded


>>
>>     Oded
>>
>>>>
>>>> Anyway, since this isn't the first bug which prevents
>>>> TTM_PL_FLAG_TOPDOWN from working as intended in the radeon driver, I
>>>> wonder if its performance impact should be re-evaluated. Lauri?
>>>>
>>>>
>

