[PATCH] drm/radeon: fix TOPDOWN handling for bo_create

Christian König deathsimple at vodafone.de
Tue Mar 17 04:40:46 PDT 2015


On 16.03.2015 23:32, Alex Deucher wrote:
> On Thu, Mar 12, 2015 at 10:55 PM, Michel Dänzer <michel at daenzer.net> wrote:
>> On 12.03.2015 22:09, Alex Deucher wrote:
>>> On Thu, Mar 12, 2015 at 5:23 AM, Christian König
>>> <deathsimple at vodafone.de> wrote:
>>>> On 12.03.2015 10:02, Michel Dänzer wrote:
>>>>> On 12.03.2015 06:14, Alex Deucher wrote:
>>>>>> On Wed, Mar 11, 2015 at 4:51 PM, Alex Deucher <alexdeucher at gmail.com>
>>>>>> wrote:
>>>>>>> On Wed, Mar 11, 2015 at 2:21 PM, Christian König
>>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>>> On 11.03.2015 16:44, Alex Deucher wrote:
>>>>>>>>> radeon_bo_create() calls radeon_ttm_placement_from_domain()
>>>>>>>>> before ttm_bo_init() is called.  radeon_ttm_placement_from_domain()
>>>>>>>>> uses the ttm bo size to determine when to select top-down
>>>>>>>>> allocation, but since the ttm bo is not initialized yet, the
>>>>>>>>> check is always false.
>>>>>>>>>
>>>>>>>>> Noticed-by: Oded Gabbay <oded.gabbay at amd.com>
>>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>>>>>>>> Cc: stable at vger.kernel.org
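For reference, the check being described looked roughly like this (a
simplified sketch, reconstructed from memory of the code at the time;
surrounding code abridged):

    /* In radeon_ttm_placement_from_domain(): use two-ended (top-down)
     * allocation for large BOs to reduce VRAM fragmentation.  The bug:
     * at radeon_bo_create() time ttm_bo_init() has not run yet, so
     * rbo->tbo.mem.size is still 0 and this branch is never taken. */
    if (rbo->tbo.mem.size > 512 * 1024) {
            for (i = 0; i < c; i++)
                    rbo->placements[i].flags |= TTM_PL_FLAG_TOPDOWN;
    }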
>>>>>>>>
>>>>>>>> And I was already wondering why the heck the BOs always
>>>>>>>> ping-ponged in memory after creation.
>>>>>>>>
>>>>>>>> Patch is Reviewed-by: Christian König <christian.koenig at amd.com>
>>>>>>> And fixing that promptly broke VCE due to vram location requirements.
>>>>>>> Updated patch attached.  Thoughts?
>>>>>> And one more take to make things a bit more explicit for static kernel
>>>>>> driver allocations.
>>>>> struct ttm_place::lpfn is honoured even with TTM_PL_FLAG_TOPDOWN, so
>>>>> the latter should work with RADEON_GEM_CPU_ACCESS. It sounds like the
>>>>> problem is really that some BOs are expected to be within a certain
>>>>> range from the beginning of VRAM, but lpfn isn't set accordingly. It
>>>>> would be better to fix that by setting lpfn directly than indirectly
>>>>> via RADEON_GEM_CPU_ACCESS.
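For illustration, setting lpfn directly on a VRAM placement would look
something like this (a sketch, not actual driver code; the 256MB window
is only an example value):

    /* Sketch: clamp the placement to the first 256MB of VRAM.  With
     * TTM_PL_FLAG_TOPDOWN also set, TTM allocates from the top of this
     * restricted window, since lpfn is still honoured. */
    rbo->placements[i].fpfn = 0;
    rbo->placements[i].lpfn = (256 * 1024 * 1024) >> PAGE_SHIFT;
    rbo->placements[i].flags = TTM_PL_FLAG_WC | TTM_PL_FLAG_UNCACHED |
                               TTM_PL_FLAG_VRAM;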
>>>>
>>>> Yeah, agree. We should probably try to find the root cause of this instead.
>>>>
>>>> As far as I know VCE has no documented limitation on where buffers are
>>>> placed (unlike UVD). So this is a bit strange. Are you sure that it isn't
>>>> UVD which breaks here?
>>> It's definitely VCE; I don't know why UVD didn't have a problem.  I
>>> considered using pin_restricted to make sure it got pinned in the CPU
>>> visible region, but that had two problems: 1. it would end up getting
>>> migrated when pinned,
>> Maybe something like radeon_uvd_force_into_uvd_segment() is needed for
>> VCE as well?
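For reference, radeon_uvd_force_into_uvd_segment() essentially clamps
fpfn/lpfn on all of a BO's placements to the segment UVD requires; a
VCE variant would follow the same pattern (a sketch only; VCE has no
documented limit, so the 256MB value here is purely an assumption):

    /* Hypothetical helper modelled on the UVD one: force every
     * placement of this BO into a low window of the address space. */
    static void radeon_vce_force_into_segment(struct radeon_bo *rbo)
    {
            int i;

            for (i = 0; i < rbo->placement.num_placement; ++i) {
                    rbo->placements[i].fpfn = 0;
                    /* window size is an assumption, see above */
                    rbo->placements[i].lpfn =
                            (256 * 1024 * 1024) >> PAGE_SHIFT;
            }
    }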
>>
>>
>>> 2. it would end up at the top of the restricted
>>> region, since the top-down flag is set, which would end up fragmenting
>>> vram.
>> If that's an issue (which outweighs the supposed benefit of
>> TTM_PL_FLAG_TOPDOWN), then again the proper solution would be not to set
>> TTM_PL_FLAG_TOPDOWN when rbo->placements[i].lpfn != 0 and is smaller
>> than the whole available region, instead of checking for VRAM and
>> RADEON_GEM_CPU_ACCESS.
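Concretely, that would mean gating the flag per placement, something
like the following (a sketch; it assumes the BO size is passed in
explicitly as discussed above, and max_pfn stands for the number of
pages in the whole region, both names being assumptions):

    /* Sketch: only use top-down allocation when the placement spans
     * the whole region; if lpfn already restricts the range, skip
     * TOPDOWN so the BO isn't pushed to the top of that window. */
    for (i = 0; i < c; i++) {
            if (size > 512 * 1024 &&
                (rbo->placements[i].lpfn == 0 ||
                 rbo->placements[i].lpfn >= max_pfn))
                    rbo->placements[i].flags |= TTM_PL_FLAG_TOPDOWN;
    }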
>>
> How about something like the attached patch?  I'm not really sure
> about the restrictions for the UVD and VCE fw and stack/heap buffers,
> but this seems to work.  It seems like the current UVD/VCE code works
> by accident since the check for TOPDOWN fails.

Well, I would still rather find the bug in the VCE code, because there 
shouldn't be a hardware limitation like this, and that kind of thing 
already really hurt us with UVD.

What system are you using to test this? How much address space do you 
have for VRAM/GART? Is it possible that we cross a 4GB boundary with that?

Regards,
Christian.

>
> Alex


