some half-baked ttm ideas

Christian König christian.koenig at amd.com
Wed Sep 16 07:01:42 UTC 2020


On 16.09.20 at 08:56, Dave Airlie wrote:
> On Wed, 16 Sep 2020 at 16:44, Thomas Hellström (Intel)
> <thomas_os at shipmail.org> wrote:
>>
>> On 9/16/20 6:28 AM, Dave Airlie wrote:
>>> On Wed, 16 Sep 2020 at 14:19, Dave Airlie <airlied at gmail.com> wrote:
>>>> On Wed, 16 Sep 2020 at 00:12, Christian König
>>>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>>>> Hi Dave,
>>>>>
>>>>> I think we should just completely nuke ttm_tt_bind() and ttm_tt_unbind()
>>>>> and all of that.
>>>>>
>>>>> Drivers can do this from their move_notify() callback now instead.
>>>> Good plan, I've put a bunch of rework into the same branch,
>>>>
>>>> https://github.com/airlied/linux/commits/ttm-half-baked-ideas
>>>>
>>>> but I've fried my brain a bit, I'm having trouble reconciling move
>>>> notify and unbinding in the right places, I feel like I'm circling
>>>> around the answer but haven't hit it yet.
>>> The commit "drm/ttm: add unbind to move notify paths." in that tree
>>> is incorrect, and I think it is where things fall apart: if we are
>>> moving TTM to VRAM, it unbinds the TTM object from the GTT at move
>>> notify time, before the move has executed.
>>>
>>> I'm feeling a move_complete_notify might be an idea, but I'm wondering
>>> if it's a bad idea.
>>>
>>> Dave.
>> I don't know if this complicates things more, but move_notify was
>> originally only thought to be an invalidation callback, and was never
>> intended to drive any other actions in the driver than to invalidate
>> various GPU bindings.
>>
>> The idea was that TTM should really never set up any GPU bindings, but
>> just provide memory where it was gpu-bindable and make sure it was
>> CPU-mappable where needed. The "exception" was mappable AGP-type
>> gpu-bindings, for the simple reason that they were needed to provide
>> CPU-mappings on systems where you couldn't map the pages directly. But
>> since we set up a GPU map on these systems anyway, many (most) drivers
>> just made use of that, but others took the step further insisting on
>> using move_notify() to set up GPU bindings, which was never intended and
>> adds error paths in the TTM move code that are pretty hard to follow.
>>
>> So if we're changing things here, I'd vote for the following:
>>
>> * Driver calls ttm_bo_validate to put memory where it is cpu-mappable
>> and gpu-bindable
>> * On successful validate, driver sets up GPU bindings itself.
>>
>> * move_notify only invalidates GPU bindings and should really return a void.
>>
>> So that bind() and unbind() stuff is really only needed for cpu-map
>> through aperture. If we ditch that, then we need to re-define the task
>> of TTM to provide memory in a cpu-mappable location and figure how
>> drivers that require cpu-map-through-aperture should handle this, since
>> they can't use the TTM fault handler for that memory anymore. The same
>> holds for drivers that want to manage their translation table
>> themselves, and need some cpu-mapping operations to go through the
>> aperture rather than to the pages directly.
>>
>> If the driver has no special cpu-mapping requirements, it should be
>> perfectly legal for it to not provide any bind() or unbind() functionality.
> I think that is close to where we want to end up, it's just
> transitioning through a few intermediate stages to get to it.
>
> I think I can likely put the binds into the driver move callback
> instead of the move_notify once I reorg things a bit more, and then
> maybe we could split the move out to happen post validate.
>
> I'm just worried about intermediate state here, so if we validate
> something into VRAM we still have access to the CPU side backing store
> while it's moved in, and vice-versa.

Yes exactly.

For the intermediate step I think the best approach would be to manually
bind the TT object to the GART after calling ttm_bo_validate(), as Thomas
suggested.

Unbinding can then happen in two places:
1. In move_notify(), for a GTT->SYSTEM move.
2. When the TT object is destroyed.

Both of those should be inside the driver and not TTM.
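
To make the proposed split concrete, here is a minimal userspace sketch of
the flow described above. It is not kernel code and does not use the real
TTM API: struct mock_bo, mock_validate() and all drv_*() helpers are
hypothetical stand-ins, and the GART is reduced to a single boolean. The
only point is who does what: the core moves memory and calls the
invalidation hook, while the driver binds after a successful validate and
unbinds in move_notify (GTT->SYSTEM only) or on destroy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical placements standing in for the SYSTEM/GTT/VRAM domains. */
enum mock_place { MOCK_SYSTEM, MOCK_GTT, MOCK_VRAM };

/* Hypothetical buffer object; NOT struct ttm_buffer_object. */
struct mock_bo {
    enum mock_place place; /* where the core last placed the memory */
    bool gart_bound;       /* driver-owned binding state */
};

/* Driver-side bind: only meaningful once the memory sits in GTT. */
static void drv_gart_bind(struct mock_bo *bo)
{
    assert(bo->place == MOCK_GTT);
    bo->gart_bound = true;
}

static void drv_gart_unbind(struct mock_bo *bo)
{
    bo->gart_bound = false;
}

/* Stand-in for the driver's move_notify: invalidation only, returns void.
 * It unbinds for GTT->SYSTEM and deliberately leaves GTT->VRAM alone. */
static void drv_move_notify(struct mock_bo *bo, enum mock_place new_place)
{
    if (bo->place == MOCK_GTT && new_place == MOCK_SYSTEM && bo->gart_bound)
        drv_gart_unbind(bo);
}

/* Stand-in for the core validate step: notify, then move. Never binds. */
static int mock_validate(struct mock_bo *bo, enum mock_place new_place)
{
    if (bo->place == new_place)
        return 0;
    drv_move_notify(bo, new_place);
    bo->place = new_place;
    return 0;
}

/* Driver entry point: validate first, then set up the binding itself. */
static int drv_validate_and_bind(struct mock_bo *bo, enum mock_place place)
{
    int ret = mock_validate(bo, place);

    if (ret)
        return ret;
    if (bo->place == MOCK_GTT && !bo->gart_bound)
        drv_gart_bind(bo);
    return 0;
}

/* Second unbind location: tearing down the TT object. */
static void drv_tt_destroy(struct mock_bo *bo)
{
    if (bo->gart_bound)
        drv_gart_unbind(bo);
}
```

In this sketch a GTT->VRAM move keeps the binding alive until
drv_tt_destroy() runs, so nothing is unbound before the move has executed.
A real driver would also have to handle errors from the bind step, which
the mock leaves out.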

Regards,
Christian.

>
> Dave.


