some half-baked ttm ideas

Wed Sep 16 07:39:20 UTC 2020

On 9/16/20 9:22 AM, Daniel Vetter wrote:
> On Wed, Sep 16, 2020 at 8:50 AM Thomas Hellström (Intel)
> <thomas_os at shipmail.org> wrote:
>>
>> On 9/16/20 6:28 AM, Dave Airlie wrote:
>>> On Wed, 16 Sep 2020 at 14:19, Dave Airlie <airlied at gmail.com> wrote:
>>>> On Wed, 16 Sep 2020 at 00:12, Christian König
>>>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>>>> Hi Dave,
>>>>>
>>>>> I think we should just completely nuke ttm_tt_bind() and ttm_tt_unbind()
>>>>> and all of that.
>>>>>
>>>>> Drivers can to this from their move_notify() callback now instead.
>>>> Good plan, I've put a bunch of rework into the same branch,
>>>>
>>>> https://github.com/airlied/linux/commits/ttm-half-baked-ideas
>>>>
>>>> but I've fried my brain a bit, I'm having trouble reconciling move
>>>> notify and unbinding in the right places, I feel like I'm circling
>>>> around the answer but haven't hit it yet.
>>> drm/ttm: add unbind to move notify paths.
>>>
>>> In that tree is incorrect and I think where things fall apart, since
>>> if we are moving TTM to VRAM that will unbind the TTM object from the
>>> GTT at move notify time before the move has executed.
>>>
>>> I'm feeling a move_complete_notify might be an idea, but I'm wondering
>>> if it's a bad idea.
>>>
>>> Dave.
>> I don't know if this complicates things more, but move_notify was
>> originally only thought to be an invalidation callback, and was never
>> intended to drive any other actions in the driver than to invalidate
>> various GPU bindings.
>>
>> The idea was that TTM should really never set up any GPU bindings, but
>> just provide memory where it was gpu-bindable and make sure it was
>> CPU-mappable where needed. The "exception" was mappable AGP-type
>> gpu-bindings, for the simple reason that they were needed to provide
>> CPU-mappings on systems where you couldn't map the pages directly. But
>> since we set up a GPU map on these systems anyway, many (most) drivers
>> just made use of that, but others took the step further insisting on
>> using move_notify() to set up GPU bindings, which was never intended and
>> adds error paths in the TTM move code that are pretty hard to follow.
>>
>> So if we're changing things here,  I'd vote for the following:
>>
>> * Driver calls ttm_bo_validate to put memory where it is cpu-mappable
>> and gpu-bindable
>> * On successful validate, driver sets up GPU bindings itself.
>>
>> * move_notify only invalidates GPU bindings and should really return a void.
>>
>> So that bind() and unbind() stuff is really only needed for cpu-map
>> through aperture. If we ditch that, then we need to re-define the task
>> of TTM to provide memory in a cpu-mappable location and figure how
>> drivers that require cpu-map-through-aperture should handle this, since
>> they can't use the TTM fault handler for that memory anymore. The same
>> holds for drivers that want to manage their translation table
>> themselves, and needs some cpu-mapping operations to go through the
>> aperture rather than to the pages directly.
> pre-coffee and all, but for mmap I think we can fix that by extracting
> a few more of the building blocks from ttm_bo_vm, so that the driver's
> fault handler becomes roughly:
> 1. bo_reserve
> 2. bo_validate to make sure it's in the right spot, driver would
> control that. E.g. on i915 pwrite we first try the gtt without
> evicting, and then fall back to other mappings, maybe some driver
> wants to have that kind of fine-grained control too.
> 3. top half of ttm fault handler, up to but excluding ttm_tt_populate
> 4. again back in driver code, filling out all the tt itself (so
> keeping the tt structures probably good idea)
> 5. bottom half of ttm fault handler with all the pte insert code and
> prefaulting and all that
> 6. bo_unreserve

Yes, something like that should probably work just fine.

/Thomas