[Intel-gfx] [PATCH v2 08/15] drm/i915/ttm Add a generic TTM memcpy move for page-based iomem
Thomas Hellström
thomas.hellstrom at linux.intel.com
Tue May 18 13:24:40 UTC 2021
On 5/18/21 3:08 PM, Christian König wrote:
> Am 18.05.21 um 14:52 schrieb Thomas Hellström:
>>
>> On 5/18/21 2:09 PM, Christian König wrote:
>>> Am 18.05.21 um 14:04 schrieb Thomas Hellström:
>>>>
>>>> On 5/18/21 1:55 PM, Christian König wrote:
>>>>>
>>>>>
>>>>> Am 18.05.21 um 10:26 schrieb Thomas Hellström:
>>>>>> The internal ttm_bo_util memcpy uses vmap functionality, and
>>>>>> while it
>>>>>> probably might be possible to use it for copying in- and out of
>>>>>> sglist represented io memory, using io_mem_reserve() / io_mem_free()
>>>>>> callbacks, that would cause problems with fault().
>>>>>> Instead, implement a method mapping page-by-page using kmap_local()
>>>>>> semantics. As an additional benefit we then avoid the occasional
>>>>>> global
>>>>>> TLB flushes of vmap() and consuming vmap space, elimination of a
>>>>>> critical
>>>>>> point of failure and with a slight change of semantics we could
>>>>>> also push
>>>>>> the memcpy out async for testing and async driver develpment
>>>>>> purposes.
>>>>>> Pushing out async can be done since there is no memory allocation
>>>>>> going on
>>>>>> that could violate the dma_fence lockdep rules.
>>>>>>
>>>>>> For copies from iomem, use the WC prefetching memcpy variant for
>>>>>> additional speed.
>>>>>>
>>>>>> Note that drivers that don't want to use struct io_mapping but
>>>>>> relies on
>>>>>> memremap functionality, and that don't want to use scatterlists for
>>>>>> VRAM may well define specialized (hopefully reusable) iterators
>>>>>> for their
>>>>>> particular environment.
>>>>>
>>>>> In general yes please since I have that as TODO for TTM for a very
>>>>> long time.
>>>>>
>>>>> But I would prefer to fix the implementation in TTM instead and
>>>>> give it proper cursor handling.
>>>>>
>>>>> Amdgpu is also using page based iomem and we are having similar
>>>>> workarounds in place there as well.
>>>>>
>>>>> I think it makes sense to unify this inside TTM and remove the old
>>>>> memcpy util function when done.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>
>>>> Christian,
>>>>
>>>> I was thinking when we replace the bo.mem with a pointer (and
>>>> perhaps have a driver callback to allocate the bo->mem,
>>>> we could perhaps embed a struct ttm_kmap_iter and use it for all
>>>> mapping in one way or another). That would mean perhaps land this
>>>> is i915 now and sort out the unification once the struct
>>>> ttm_resource, struct ttm_buffer_object separation has landed?
>>>
>>> That stuff is ready, reviewed and I'm just waiting for some amdgpu
>>> changes to land in drm-misc-next to push it.
>>>
>>> But yes in general an iterator for the resource object sounds like
>>> the right plan to me as well.
>>>
>>> Christian.
>>
>> OK, so then are you OK with landing this in i915 for now? That would
>> also ofc mean the export you NAK'd but strictly for this memcpy use
>> until we merge it with TTM?
>
> Well you can of course prototype that in i915, but I really don't want
> to export the TT functions upstream.
I understand, I once had the same thoughts trying to avoid that as far
as possible, so this function was actually then added to the ttm_bo
interface, (hence the awkward naming) as a helper for drivers
implementing move(), essentially a very special case of
ttm_bo_move_accel_cleanup(), but anyway, see below:
>
> Can we cleanly move that functionality into TTM instead?
I'll take a look at that, but I think we'd initially be having iterators
mimicing the current move_memcpy() for the
linear iomem !WC cases, hope that's OK.
/Thomas
>
> Christian.
>
>>
>>
>> /Thomas
>>
>>>
>>>>
>>>> /Thomas
>>>>
>>>>
>>>
>
More information about the Intel-gfx
mailing list