[PATCH v2 08/15] drm/i915/ttm Add a generic TTM memcpy move for page-based iomem

Tue May 18 12:09:27 UTC 2021

Am 18.05.21 um 14:04 schrieb Thomas Hellström:
>
> On 5/18/21 1:55 PM, Christian König wrote:
>>
>>
>> Am 18.05.21 um 10:26 schrieb Thomas Hellström:
>>> The internal ttm_bo_util memcpy uses vmap functionality, and while it
>>> probably might be possible to use it for copying in- and out of
>>> sglist represented io memory, using io_mem_reserve() / io_mem_free()
>>> callbacks, that would cause problems with fault().
>>> Instead, implement a method mapping page-by-page using kmap_local()
>>> semantics. As an additional benefit we then avoid the occasional global
>>> TLB flushes of vmap() and consuming vmap space, elimination of a 
>>> critical
>>> point of failure and with a slight change of semantics we could also 
>>> push
>>> the memcpy out async for testing and async driver develpment purposes.
>>> Pushing out async can be done since there is no memory allocation 
>>> going on
>>> that could violate the dma_fence lockdep rules.
>>>
>>> For copies from iomem, use the WC prefetching memcpy variant for
>>> additional speed.
>>>
>>> Note that drivers that don't want to use struct io_mapping but 
>>> relies on
>>> memremap functionality, and that don't want to use scatterlists for
>>> VRAM may well define specialized (hopefully reusable) iterators for 
>>> their
>>> particular environment.
>>
>> In general yes please since I have that as TODO for TTM for a very 
>> long time.
>>
>> But I would prefer to fix the implementation in TTM instead and give 
>> it proper cursor handling.
>>
>> Amdgpu is also using page based iomem and we are having similar 
>> workarounds in place there as well.
>>
>> I think it makes sense to unify this inside TTM and remove the old 
>> memcpy util function when done.
>>
>> Regards,
>> Christian.
>
> Christian,
>
> I was thinking when we replace the bo.mem with a pointer (and perhaps 
> have a driver callback to allocate the bo->mem,
> we could perhaps embed a struct ttm_kmap_iter and use it for all 
> mapping in one way or another). That would mean perhaps land this is 
> i915 now and sort out the unification once the struct ttm_resource, 
> struct ttm_buffer_object separation has landed?

That stuff is ready, reviewed and I'm just waiting for some amdgpu 
changes to land in drm-misc-next to push it.

But yes in general an iterator for the resource object sounds like the 
right plan to me as well.

Christian.

>
> /Thomas
>
>