[Intel-gfx] [PATCH v2 08/15] drm/i915/ttm Add a generic TTM memcpy move for page-based iomem
Christian König
christian.koenig at amd.com
Tue May 18 13:08:17 UTC 2021
Am 18.05.21 um 14:52 schrieb Thomas Hellström:
>
> On 5/18/21 2:09 PM, Christian König wrote:
>> Am 18.05.21 um 14:04 schrieb Thomas Hellström:
>>>
>>> On 5/18/21 1:55 PM, Christian König wrote:
>>>>
>>>>
>>>> Am 18.05.21 um 10:26 schrieb Thomas Hellström:
>>>>> The internal ttm_bo_util memcpy uses vmap functionality, and while it
>>>>> probably might be possible to use it for copying in- and out of
>>>>> sglist represented io memory, using io_mem_reserve() / io_mem_free()
>>>>> callbacks, that would cause problems with fault().
>>>>> Instead, implement a method mapping page-by-page using kmap_local()
>>>>> semantics. As an additional benefit we then avoid the occasional
>>>>> global
>>>>> TLB flushes of vmap() and consuming vmap space, elimination of a
>>>>> critical
>>>>> point of failure and with a slight change of semantics we could
>>>>> also push
>>>>> the memcpy out async for testing and async driver develpment
>>>>> purposes.
>>>>> Pushing out async can be done since there is no memory allocation
>>>>> going on
>>>>> that could violate the dma_fence lockdep rules.
>>>>>
>>>>> For copies from iomem, use the WC prefetching memcpy variant for
>>>>> additional speed.
>>>>>
>>>>> Note that drivers that don't want to use struct io_mapping but
>>>>> relies on
>>>>> memremap functionality, and that don't want to use scatterlists for
>>>>> VRAM may well define specialized (hopefully reusable) iterators
>>>>> for their
>>>>> particular environment.
>>>>
>>>> In general yes please since I have that as TODO for TTM for a very
>>>> long time.
>>>>
>>>> But I would prefer to fix the implementation in TTM instead and
>>>> give it proper cursor handling.
>>>>
>>>> Amdgpu is also using page based iomem and we are having similar
>>>> workarounds in place there as well.
>>>>
>>>> I think it makes sense to unify this inside TTM and remove the old
>>>> memcpy util function when done.
>>>>
>>>> Regards,
>>>> Christian.
>>>
>>> Christian,
>>>
>>> I was thinking when we replace the bo.mem with a pointer (and
>>> perhaps have a driver callback to allocate the bo->mem,
>>> we could perhaps embed a struct ttm_kmap_iter and use it for all
>>> mapping in one way or another). That would mean perhaps land this is
>>> i915 now and sort out the unification once the struct ttm_resource,
>>> struct ttm_buffer_object separation has landed?
>>
>> That stuff is ready, reviewed and I'm just waiting for some amdgpu
>> changes to land in drm-misc-next to push it.
>>
>> But yes in general an iterator for the resource object sounds like
>> the right plan to me as well.
>>
>> Christian.
>
> OK, so then are you OK with landing this in i915 for now? That would
> also ofc mean the export you NAK'd but strictly for this memcpy use
> until we merge it with TTM?
Well you can of course prototype that in i915, but I really don't want
to export the TT functions upstream.
Can we cleanly move that functionality into TTM instead?
Christian.
>
>
> /Thomas
>
>>
>>>
>>> /Thomas
>>>
>>>
>>
More information about the Intel-gfx
mailing list