[PATCH 1/8] drm/gem: Write down some rules for vmap usage
Christian König
christian.koenig at amd.com
Tue Dec 1 10:34:53 UTC 2020
On 01.12.20 11:27, Thomas Zimmermann wrote:
> Hi
>
> On 01.12.20 11:00, Daniel Vetter wrote:
>> On Tue, Dec 1, 2020 at 10:40 AM Thomas Zimmermann
>> <tzimmermann at suse.de> wrote:
>>>
>>> Hi
>>>
>>> On 01.12.20 10:10, Daniel Vetter wrote:
>>>> On Tue, Dec 1, 2020 at 9:32 AM Thomas Zimmermann
>>>> <tzimmermann at suse.de> wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> On 30.11.20 16:33, Christian König wrote:
>>>>>> On 30.11.20 16:30, Daniel Vetter wrote:
>>>>>>> On Mon, Nov 30, 2020 at 01:04:26PM +0100, Thomas Zimmermann wrote:
>>>>>>>> Mapping a GEM object's buffer into kernel address space prevents the
>>>>>>>> buffer from being evicted from VRAM, which in turn may result in
>>>>>>>> out-of-memory errors. It's therefore required to only vmap GEM BOs for
>>>>>>>> short periods of time; unless the GEM implementation provides
>>>>>>>> additional guarantees.
>>>>>>>>
>>>>>>>> Signed-off-by: Thomas Zimmermann <tzimmermann at suse.de>
>>>>>>>> ---
>>>>>>>> drivers/gpu/drm/drm_prime.c | 6 ++++++
>>>>>>>> include/drm/drm_gem.h | 16 ++++++++++++++++
>>>>>>>> 2 files changed, 22 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>>>>>>>> index 7db55fce35d8..9c9ece9833e0 100644
>>>>>>>> --- a/drivers/gpu/drm/drm_prime.c
>>>>>>>> +++ b/drivers/gpu/drm/drm_prime.c
>>>>>>>> @@ -669,6 +669,12 @@ EXPORT_SYMBOL(drm_gem_unmap_dma_buf);
>>>>>>>> * callback. Calls into &drm_gem_object_funcs.vmap for device specific handling.
>>>>>>>> * The kernel virtual address is returned in map.
>>>>>>>> *
>>>>>>>> + * To prevent the GEM object from being relocated, callers must hold the GEM
>>>>>>>> + * object's reservation lock from when calling this function until releasing the
>>>>>>>> + * mapping. Holding onto a mapping and the associated reservation lock for an
>>>>>>>> + * unbounded time may result in out-of-memory errors. Calls to drm_gem_dmabuf_vmap()
>>>>>>>> + * should therefore be accompanied by a call to drm_gem_dmabuf_vunmap().
>>>>>>>> + *
>>>>>>>> * Returns 0 on success or a negative errno code otherwise.
>>>>>>> This is a dma-buf hook, which means just documenting the rules
>>>>>>> you'd like to have here isn't enough. We need to roll this out at
>>>>>>> the dma-buf level, and enforce it.
>>>>>>>
>>>>>>> Enforce it = assert_lock_held
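>>>>>>> (i.e. something like a dma_resv_assert_held(dmabuf->resv) in the
>>>>>>> vmap path)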
>>>>>>>
>>>>>>> Roll out = review everyone. Because this goes through dma-buf it'll
>>>>>>> come back through shmem helpers (and other helpers and other
>>>>>>> subsystems) back to any driver using vmap for gpu buffers. This
>>>>>>> includes the media subsystem, and the media subsystem definitely
>>>>>>> doesn't cope with just temporarily mapping buffers. So there we need
>>>>>>> to pin them, which I think means we'll need 2 versions of
>>>>>>> dma_buf_vmap - one that's temporary and requires we hold the
>>>>>>> dma_resv lock, the other requires that the buffer is pinned.
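>>>>>>>
>>>>>>> Roughly like this (a sketch with hypothetical signatures; only the
>>>>>>> pinning dma_buf_vmap exists today):
>>>>>>>
>>>>>>> /* long-term: pins the buffer, mapping survives, no locking required */
>>>>>>> int dma_buf_vmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
>>>>>>>
>>>>>>> /* short-term: caller must hold dmabuf->resv for the mapping's lifetime */
>>>>>>> int dma_buf_vmap_local(struct dma_buf *dmabuf, struct dma_buf_map *map);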
>>>>>>
>>>>>> OR start to properly use the dma_buf_pin/dma_buf_unpin functions
>>>>>> which I added to cover this use case as well.
>>>>>
>>>>> While I generally agree, here are some thoughts:
>>>>>
>>>>> I found all generic pin functions useless, because they don't allow
>>>>> for specifying where to pin. With fbdev emulation, this means that
>>>>> console buffers might never make it to VRAM for scanout. If anything,
>>>>> the policy should be that pin always pins in HW-accessible memory.
>>>>>
>>>>> Pin has quite a bit of overhead (more locking, buffer movement), so it
>>>>> should be the second choice after regular vmap. To make both work
>>>>> together, pin probably relies on holding the reservation lock
>>>>> internally.
>>>>>
>>>>> Therefore I think we'd still want some additional helpers, such as:
>>>>>
>>>>> pin_unlocked(), which acquires the resv lock, calls regular pin and
>>>>> then drops the resv lock. Same for unpin_unlocked()
>>>>>
>>>>> vmap_pinned(), which enforces that the buffer has been pinned and
>>>>> then calls regular vmap. Same for vunmap_pinned()
>>>>>
>>>>> A typical pattern with these functions would look like this.
>>>>>
>>>>> struct drm_gem_object *bo;
>>>>> struct dma_buf_map map;
>>>>>
>>>>> init() {
>>>>>         pin_unlocked(bo);
>>>>>         vmap_pinned(bo, &map);
>>>>> }
>>>>>
>>>>> worker() {
>>>>>         begin_cpu_access();
>>>>>         /* access bo via map */
>>>>>         end_cpu_access();
>>>>> }
>>>>>
>>>>> fini() {
>>>>>         vunmap_pinned(bo, &map);
>>>>>         unpin_unlocked(bo);
>>>>> }
>>>>>
>>>>> init();
>>>>> while (...) {
>>>>>         worker();
>>>>> }
>>>>> fini();
>>>>>
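>>>>> As a sketch, pin_unlocked() could be a thin wrapper around the
>>>>> existing drm_gem_object_funcs.pin callback (none of these helpers
>>>>> exist yet):
>>>>>
>>>>> int pin_unlocked(struct drm_gem_object *bo)
>>>>> {
>>>>>         int ret;
>>>>>
>>>>>         /* take the resv lock only for the duration of the pin */
>>>>>         ret = dma_resv_lock(bo->resv, NULL);
>>>>>         if (ret)
>>>>>                 return ret;
>>>>>         ret = bo->funcs->pin(bo);
>>>>>         dma_resv_unlock(bo->resv);
>>>>>
>>>>>         return ret;
>>>>> }
>>>>>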
>>>>> Is that reasonable for media drivers?
>>>>
>>>> So media drivers go through dma-buf, which means we always pin into
>>>> system memory. Which I guess for vram-only display drivers makes no
>>>> sense and should be rejected, but we still need somewhat consistent
>>>> rules.
>>>>
>>>> The other thing is that if you do a dma_buf_attach without dynamic
>>>> mode, dma-buf will pin things for you already. So in many cases it
>>>
>>> Do you have a pointer to code that illustrates this well?
>>>
>>>> could be that we don't need a separate pin (but since the pin is in
>>>> the exporter, not the dma-buf layer, we can't check for that). I'm also
>>>> not seeing why existing users need to split up their dma_buf_vmap into
>>>> a pin + vmap; they don't need them separately.
>>>
>>> It's two different operations, with pin having some possible overhead
>>> and failure conditions. And during the last discussion, pin was said to
>>> be for userspace-managed buffers. Kernel code should hold the
>>> reservation lock.
>>>
>>> From my POV, the current interfaces have no clear policy or semantics.
>>> Looking through the different GEM implementations, each one seems to
>>> have its own interpretation.
>>
>> Yup, that's the problem really. In the past we've had vmap exclusively
>> for permanently pinned access, with no locking requirements. Now we're
>> trying to make this more dynamic, but in a somewhat ad-hoc fashion
>> (generic fbdev emulation charged ahead with making the fbdev
>> framebuffer evictable), and now it breaks at every seam. Adding more
>> ad-hoc semantics on top doesn't seem like a good idea.
>>
>> That's why I think we should have 2 different interfaces:
>> - dma_buf_vmap, the one we currently have. Permanently pins the
>> buffer, mapping survives, no locking required.
>> - dma_buf_vmap_local, the new interface, the one that generic fbdev
>> should have used (but oh well mistakes happen), requires
>> dma_resv_lock, the mapping is only local to the caller
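>>
>> A minimal usage sketch for the latter (dma_buf_vmap_local and
>> dma_buf_vunmap_local are hypothetical at this point):
>>
>>         struct dma_buf_map map;
>>         int ret;
>>
>>         dma_resv_lock(dmabuf->resv, NULL);
>>         ret = dma_buf_vmap_local(dmabuf, &map);
>>         if (!ret) {
>>                 /* short-term CPU access through map */
>>                 dma_buf_vunmap_local(dmabuf, &map);
>>         }
>>         dma_resv_unlock(dmabuf->resv);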
>
> In patch 6 of this series, there's ast cursor code that acquires two
> BOs' reservation locks and vmaps them afterwards. That's probably how
> you intend to use dma_buf_vmap_local.
>
> However, I think it's more logical to have a vmap callback that only
> does the actual vmap. This is all that exporters would have to implement.
>
> And with that, build one helper that pins before vmap and one helper
> that gets the resv lock.
I don't think that this will work, nor is it a good approach.
See the ast cursor handling for example. You need to acquire two BOs
here, not just one. And this can't be done cleanly with a single vmap call.
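
To illustrate, taking two reservation locks deadlock-free needs the
ww_acquire_ctx dance, roughly like this (a sketch; the -EDEADLK backoff
path is omitted):

        struct ww_acquire_ctx ctx;

        ww_acquire_init(&ctx, &reservation_ww_class);
        dma_resv_lock(bo1->resv, &ctx);
        dma_resv_lock(bo2->resv, &ctx);
        ww_acquire_done(&ctx);

        /* vmap and access both cursor BOs here */

        dma_resv_unlock(bo1->resv);
        dma_resv_unlock(bo2->resv);
        ww_acquire_fini(&ctx);

A vmap callback that takes the reservation lock internally can't be
composed like that.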
Regards,
Christian.
>
> I know that it might require some work on exporting drivers. But with
> your proposal, we probably need another dma-buf callback just for
> vmap_local. (?) That seems even less appealing to me.
>
> Best regards
> Thomas
>
>>
>> Trying to shovel both semantics into one interface, depending upon
>> which implementation we have backing the buffer, doesn't work indeed.
>>
>> Also on the pin topic, I think neither interface should require
>> callers to explicitly pin anything. For existing users it should
>> happen automatically behind the scenes imo, that's what they're
>> expecting.
>> -Daniel
>>
>>
>>>> I think we could use what we've done for dynamic dma-buf attachment
>>>> (which also changed the locking rules) and just have new functions for
>>>> the new way (i.e. short-term vmap protected by the dma_resv lock).
>>>> Maybe call these dma_buf_vmap_local, in the spirit of the new
>>>> kmap_local which is currently under discussion. I think the _local
>>>> suffix is better, for otherwise people might do something silly like
>>>>
>>>> dma_resv_lock();
>>>> dma_buf_vmap_locked();
>>>> dma_resv_unlock();
>>>>
>>>> /* actual access maybe even in some other thread */
>>>>
>>>> dma_resv_lock();
>>>> dma_buf_vunmap_locked();
>>>> dma_resv_unlock();
>>>>
>>>> The _local suffix is better at conveying that the resulting pointer
>>>> has very limited use (essentially just local to the calling context,
>>>> if you don't change any locking or anything).
>>>
>>> _local sounds good.
>>>
>>> Best regards
>>> Thomas
>>>
>>>>
>>>> I think encouraging importers to call dma_buf_pin/unpin isn't a good
>>>> idea. Yes, dynamic ones need it, but maybe we should check for that
>>>> somehow in the exporter interface (atm only amdgpu is using it).
>>>> -Daniel
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Best regards
>>>>> Thomas
>>>>>
>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Christian.
>>>>>>
>>>>>>>
>>>>>>> That's what I meant when I said that this approach here is very
>>>>>>> sprawling :-/
>>>>>>> -Daniel
>>>>>>>
>>>>>>>> */
>>>>>>>> int drm_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map *map)
>>>>>>>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
>>>>>>>> index 5e6daa1c982f..7c34cd5ec261 100644
>>>>>>>> --- a/include/drm/drm_gem.h
>>>>>>>> +++ b/include/drm/drm_gem.h
>>>>>>>> @@ -137,7 +137,21 @@ struct drm_gem_object_funcs {
>>>>>>>> * Returns a virtual address for the buffer. Used by the
>>>>>>>> * drm_gem_dmabuf_vmap() helper.
>>>>>>>> *
>>>>>>>> + * Notes to implementors:
>>>>>>>> + *
>>>>>>>> + * - Implementations must expect pairs of @vmap and @vunmap to be
>>>>>>>> + * called frequently and should optimize for this case.
>>>>>>>> + *
>>>>>>>> + * - Implementations may expect the caller to hold the GEM object's
>>>>>>>> + * reservation lock to protect against concurrent calls and
>>>>>>>> + * relocation of the GEM object.
>>>>>>>> + *
>>>>>>>> + * - Implementations may provide additional guarantees (e.g., working
>>>>>>>> + * without holding the reservation lock).
>>>>>>>> + *
>>>>>>>> * This callback is optional.
>>>>>>>> + *
>>>>>>>> + * See also drm_gem_dmabuf_vmap()
>>>>>>>> */
>>>>>>>> int (*vmap)(struct drm_gem_object *obj, struct dma_buf_map *map);
>>>>>>>> @@ -148,6 +162,8 @@ struct drm_gem_object_funcs {
>>>>>>>> * drm_gem_dmabuf_vunmap() helper.
>>>>>>>> *
>>>>>>>> * This callback is optional.
>>>>>>>> + *
>>>>>>>> + * See also @vmap.
>>>>>>>> */
>>>>>>>> void (*vunmap)(struct drm_gem_object *obj, struct dma_buf_map *map);
>>>>>>>> --
>>>>>>>> 2.29.2
>>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Thomas Zimmermann
>>>>> Graphics Driver Developer
>>>>> SUSE Software Solutions Germany GmbH
>>>>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>>>>> (HRB 36809, AG Nürnberg)
>>>>> Geschäftsführer: Felix Imendörffer
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>