[Intel-gfx] [PATCH v2 08/15] drm/i915/ttm Add a generic TTM memcpy move for page-based iomem

Christian König christian.koenig at amd.com
Tue May 18 13:26:50 UTC 2021


Am 18.05.21 um 15:24 schrieb Thomas Hellström:
>
> On 5/18/21 3:08 PM, Christian König wrote:
>> Am 18.05.21 um 14:52 schrieb Thomas Hellström:
>>>
>>> On 5/18/21 2:09 PM, Christian König wrote:
>>>> Am 18.05.21 um 14:04 schrieb Thomas Hellström:
>>>>>
>>>>> On 5/18/21 1:55 PM, Christian König wrote:
>>>>>>
>>>>>>
>>>>>> Am 18.05.21 um 10:26 schrieb Thomas Hellström:
>>>>>>> The internal ttm_bo_util memcpy uses vmap functionality, and 
>>>>>>> while it
>>>>>>> probably might be possible to use it for copying in- and out of
>>>>>>> sglist represented io memory, using io_mem_reserve() / 
>>>>>>> io_mem_free()
>>>>>>> callbacks, that would cause problems with fault().
>>>>>>> Instead, implement a method mapping page-by-page using kmap_local()
>>>>>>> semantics. As an additional benefit we then avoid the occasional 
>>>>>>> global
>>>>>>> TLB flushes of vmap() and consuming vmap space, elimination of a 
>>>>>>> critical
>>>>>>> point of failure and with a slight change of semantics we could 
>>>>>>> also push
>>>>>>> the memcpy out async for testing and async driver develpment 
>>>>>>> purposes.
>>>>>>> Pushing out async can be done since there is no memory 
>>>>>>> allocation going on
>>>>>>> that could violate the dma_fence lockdep rules.
>>>>>>>
>>>>>>> For copies from iomem, use the WC prefetching memcpy variant for
>>>>>>> additional speed.
>>>>>>>
>>>>>>> Note that drivers that don't want to use struct io_mapping but 
>>>>>>> relies on
>>>>>>> memremap functionality, and that don't want to use scatterlists for
>>>>>>> VRAM may well define specialized (hopefully reusable) iterators 
>>>>>>> for their
>>>>>>> particular environment.
>>>>>>
>>>>>> In general yes please since I have that as TODO for TTM for a 
>>>>>> very long time.
>>>>>>
>>>>>> But I would prefer to fix the implementation in TTM instead and 
>>>>>> give it proper cursor handling.
>>>>>>
>>>>>> Amdgpu is also using page based iomem and we are having similar 
>>>>>> workarounds in place there as well.
>>>>>>
>>>>>> I think it makes sense to unify this inside TTM and remove the 
>>>>>> old memcpy util function when done.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>
>>>>> Christian,
>>>>>
>>>>> I was thinking when we replace the bo.mem with a pointer (and 
>>>>> perhaps have a driver callback to allocate the bo->mem,
>>>>> we could perhaps embed a struct ttm_kmap_iter and use it for all 
>>>>> mapping in one way or another). That would mean perhaps land this 
>>>>> is i915 now and sort out the unification once the struct 
>>>>> ttm_resource, struct ttm_buffer_object separation has landed?
>>>>
>>>> That stuff is ready, reviewed and I'm just waiting for some amdgpu 
>>>> changes to land in drm-misc-next to push it.
>>>>
>>>> But yes in general an iterator for the resource object sounds like 
>>>> the right plan to me as well.
>>>>
>>>> Christian.
>>>
>>> OK, so then are you OK with landing this in i915 for now? That would 
>>> also ofc mean the export you NAK'd but strictly for this memcpy use 
>>> until we merge it with TTM?
>>
>> Well you can of course prototype that in i915, but I really don't 
>> want to export the TT functions upstream.
>
> I understand, I once had the same thoughts trying to avoid that as far 
> as possible, so this function was actually then added to the ttm_bo 
> interface, (hence the awkward naming) as a helper for drivers 
> implementing move(), essentially a very special case of 
> ttm_bo_move_accel_cleanup(), but anyway, see below:
>
>>
>> Can we cleanly move that functionality into TTM instead?
>
> I'll take a look at that, but I think we'd initially be having 
> iterators mimicing the current move_memcpy() for the
> linear iomem !WC cases, hope that's OK.

Yeah, that's peefectly fine with me. I can tackle cleaning up all 
drivers and move over to the new implementation when that is fully complete.

As I said we already have the same problem in amdgpu and only solved it 
by avoiding memcpy all together.

Christian.

>
> /Thomas
>
>
>>
>> Christian.
>>
>>>
>>>
>>> /Thomas
>>>
>>>>
>>>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>
>>



More information about the Intel-gfx mailing list