[v2,1/8] drm/fbdev-generic: Always use shadow buffering
Thomas Zimmermann
tzimmermann at suse.de
Tue Mar 21 19:12:54 UTC 2023
Hi,
thanks for testing the patchset.
Am 21.03.23 um 16:23 schrieb Sui jingfeng:
>
> On 2023/3/20 23:07, Thomas Zimmermann wrote:
>> Remove all codepaths that implement fbdev output directly on GEM
>> buffers. Always allocate a shadow buffer in system memory and set
>> up deferred I/O for mmap.
>>
>> The fbdev code that operated directly on GEM buffers was used by
>> drivers based on GEM DMA helpers. Those drivers have been migrated
>> to use fbdev-dma, a dedicated fbdev emulation for DMA memory. All
>> remaining users of fbdev-generic require shadow buffering.
>>
>> Memory management of the remaining callers uses TTM, GEM SHMEM
>> helpers or a variant of GEM DMA helpers that is incompatible with
>> fbdev-dma. Therefore remove the unused codepaths from fbdev-generic
>> and simplify the code.
>>
>> Using a shadow buffer with deferred I/O is probably the best case
>> for most remaining callers. Some of the TTM-based drivers might
>> benefit from a dedicated fbdev emulation that operates directly on
>> the driver's video memory.
>
> I don't understand here, the TTM-based drivers should have equivalent
> performance
>
> with you implement. Because device memory typically very slow for cpu
> read, at least
>
> this is true for Mips and loongarch architecture. TTM-based drivers for
> those platform
>
> is also prefer to render to system ram first(for fast reading and
> compositing) and then
>
> blit to the real framebuffer pinned to VRAM.
Right. But in practice, writing the console directly to the video ram is
perceived as faster. The shadow buffer needs an additional memcpy to the
video ram. That adds latency to the output. It's more a slowness in
perception than in actual I/O access.
The drawing helpers for system memory have also been slow compared to
the helper for I/O memory. I've fixed some of it, but I'm not sure if I
covered all cases. See [1] for the patches.
[1] https://patchwork.freedesktop.org/series/100317/
And finally, with the fbdev buffer in video memory, drivers could
utilize hardware blitting to further speed up drawing. Although not many
drivers ever did this.
>
>
> In turn, I think shmem helper based drivers might benefit from a
> dedicated fbdev emulation.
>
> Because you are blit to the shadow of the video memory for shmem helper
> based driver. The
>
> driver may need another blit to the ultimate framebuffer. Using a
Indeed. SHMEM has two memcpy operations. If we could remove the fbdev
shadow buffer and operate on the SHMEM buffer directly, we'd reduce
overhead.
It would also mean that we'd mmap the SHMEM pages to fbdev's userspace.
I've not yet managed to implement this successfully, as it would require
new mmap code in the GEM SHMEM object for this purpose. And so far, I've
always ran into a bug where the fbdev emulation would end up in a
corrupt state.
I've spend time on 'fbdev-shmem' emulation, but given the problems, it
hasn't paid off so far.
Best regards
Thomas
> shadow buffer is still acceptable
>
> though, but why do you say "the TTM-based drivers might benefit from a
> dedicated fbdev emulation" ?
>
>>
>> Signed-off-by: Thomas Zimmermann <tzimmermann at suse.de>
>> Reviewed-by: Javier Martinez Canillas <javierm at redhat.com>
>> Acked-by: Zack Rusin <zackr at vmware.com>
>> ---
>> drivers/gpu/drm/drm_fbdev_generic.c | 184 +++++-----------------------
>> 1 file changed, 30 insertions(+), 154 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_fbdev_generic.c
>> b/drivers/gpu/drm/drm_fbdev_generic.c
>> index 4d6325e91565..e48a8e82378d 100644
>> --- a/drivers/gpu/drm/drm_fbdev_generic.c
>> +++ b/drivers/gpu/drm/drm_fbdev_generic.c
>> @@ -11,16 +11,6 @@
>> #include <drm/drm_fbdev_generic.h>
>> -static bool drm_fbdev_use_shadow_fb(struct drm_fb_helper *fb_helper)
>> -{
>> - struct drm_device *dev = fb_helper->dev;
>> - struct drm_framebuffer *fb = fb_helper->fb;
>> -
>> - return dev->mode_config.prefer_shadow_fbdev ||
>> - dev->mode_config.prefer_shadow ||
>> - fb->funcs->dirty;
>> -}
>> -
>> /* @user: 1=userspace, 0=fbcon */
>> static int drm_fbdev_fb_open(struct fb_info *info, int user)
>> {
>> @@ -46,115 +36,33 @@ static int drm_fbdev_fb_release(struct fb_info
>> *info, int user)
>> static void drm_fbdev_fb_destroy(struct fb_info *info)
>> {
>> struct drm_fb_helper *fb_helper = info->par;
>> - void *shadow = NULL;
>> + void *shadow = info->screen_buffer;
>> if (!fb_helper->dev)
>> return;
>> - if (info->fbdefio)
>> - fb_deferred_io_cleanup(info);
>> - if (drm_fbdev_use_shadow_fb(fb_helper))
>> - shadow = info->screen_buffer;
>> -
>> + fb_deferred_io_cleanup(info);
>> drm_fb_helper_fini(fb_helper);
>> -
>> - if (shadow)
>> - vfree(shadow);
>> - else if (fb_helper->buffer)
>> - drm_client_buffer_vunmap(fb_helper->buffer);
>> -
>> + vfree(shadow);
>> drm_client_framebuffer_delete(fb_helper->buffer);
>> - drm_client_release(&fb_helper->client);
>> + drm_client_release(&fb_helper->client);
>> drm_fb_helper_unprepare(fb_helper);
>> kfree(fb_helper);
>> }
>> -static int drm_fbdev_fb_mmap(struct fb_info *info, struct
>> vm_area_struct *vma)
>> -{
>> - struct drm_fb_helper *fb_helper = info->par;
>> -
>> - if (drm_fbdev_use_shadow_fb(fb_helper))
>> - return fb_deferred_io_mmap(info, vma);
>> - else if (fb_helper->dev->driver->gem_prime_mmap)
>> - return
>> fb_helper->dev->driver->gem_prime_mmap(fb_helper->buffer->gem, vma);
>> - else
>> - return -ENODEV;
>> -}
>> -
>> -static bool drm_fbdev_use_iomem(struct fb_info *info)
>> -{
>> - struct drm_fb_helper *fb_helper = info->par;
>> - struct drm_client_buffer *buffer = fb_helper->buffer;
>> -
>> - return !drm_fbdev_use_shadow_fb(fb_helper) && buffer->map.is_iomem;
>> -}
>> -
>> -static ssize_t drm_fbdev_fb_read(struct fb_info *info, char __user *buf,
>> - size_t count, loff_t *ppos)
>> -{
>> - ssize_t ret;
>> -
>> - if (drm_fbdev_use_iomem(info))
>> - ret = drm_fb_helper_cfb_read(info, buf, count, ppos);
>> - else
>> - ret = drm_fb_helper_sys_read(info, buf, count, ppos);
>> -
>> - return ret;
>> -}
>> -
>> -static ssize_t drm_fbdev_fb_write(struct fb_info *info, const char
>> __user *buf,
>> - size_t count, loff_t *ppos)
>> -{
>> - ssize_t ret;
>> -
>> - if (drm_fbdev_use_iomem(info))
>> - ret = drm_fb_helper_cfb_write(info, buf, count, ppos);
>> - else
>> - ret = drm_fb_helper_sys_write(info, buf, count, ppos);
>> -
>> - return ret;
>> -}
>> -
>> -static void drm_fbdev_fb_fillrect(struct fb_info *info,
>> - const struct fb_fillrect *rect)
>> -{
>> - if (drm_fbdev_use_iomem(info))
>> - drm_fb_helper_cfb_fillrect(info, rect);
>> - else
>> - drm_fb_helper_sys_fillrect(info, rect);
>> -}
>> -
>> -static void drm_fbdev_fb_copyarea(struct fb_info *info,
>> - const struct fb_copyarea *area)
>> -{
>> - if (drm_fbdev_use_iomem(info))
>> - drm_fb_helper_cfb_copyarea(info, area);
>> - else
>> - drm_fb_helper_sys_copyarea(info, area);
>> -}
>> -
>> -static void drm_fbdev_fb_imageblit(struct fb_info *info,
>> - const struct fb_image *image)
>> -{
>> - if (drm_fbdev_use_iomem(info))
>> - drm_fb_helper_cfb_imageblit(info, image);
>> - else
>> - drm_fb_helper_sys_imageblit(info, image);
>> -}
>> -
>> static const struct fb_ops drm_fbdev_fb_ops = {
>> .owner = THIS_MODULE,
>> - DRM_FB_HELPER_DEFAULT_OPS,
>> .fb_open = drm_fbdev_fb_open,
>> .fb_release = drm_fbdev_fb_release,
>> + .fb_read = drm_fb_helper_sys_read,
>> + .fb_write = drm_fb_helper_sys_write,
>> + DRM_FB_HELPER_DEFAULT_OPS,
>> + .fb_fillrect = drm_fb_helper_sys_fillrect,
>> + .fb_copyarea = drm_fb_helper_sys_copyarea,
>> + .fb_imageblit = drm_fb_helper_sys_imageblit,
>> + .fb_mmap = fb_deferred_io_mmap,
>> .fb_destroy = drm_fbdev_fb_destroy,
>> - .fb_mmap = drm_fbdev_fb_mmap,
>> - .fb_read = drm_fbdev_fb_read,
>> - .fb_write = drm_fbdev_fb_write,
>> - .fb_fillrect = drm_fbdev_fb_fillrect,
>> - .fb_copyarea = drm_fbdev_fb_copyarea,
>> - .fb_imageblit = drm_fbdev_fb_imageblit,
>> };
>> /*
>> @@ -169,7 +77,6 @@ static int drm_fbdev_fb_probe(struct drm_fb_helper
>> *fb_helper,
>> struct drm_framebuffer *fb;
>> struct fb_info *info;
>> u32 format;
>> - struct iosys_map map;
>> int ret;
>> drm_dbg_kms(dev, "surface width(%d), height(%d) and bpp(%d)\n",
>> @@ -197,44 +104,21 @@ static int drm_fbdev_fb_probe(struct
>> drm_fb_helper *fb_helper,
>> drm_fb_helper_fill_info(info, fb_helper, sizes);
>> - if (drm_fbdev_use_shadow_fb(fb_helper)) {
>> - info->screen_buffer = vzalloc(info->screen_size);
>> - if (!info->screen_buffer)
>> - return -ENOMEM;
>> - info->flags |= FBINFO_VIRTFB | FBINFO_READS_FAST;
>> + info->screen_buffer = vzalloc(info->screen_size);
>> + if (!info->screen_buffer)
>> + return -ENOMEM;
>> + info->flags |= FBINFO_VIRTFB | FBINFO_READS_FAST;
>> - /* Set a default deferred I/O handler */
>> - fb_helper->fbdefio.delay = HZ / 20;
>> - fb_helper->fbdefio.deferred_io = drm_fb_helper_deferred_io;
>> + info->fix.smem_start =
>> page_to_phys(vmalloc_to_page(info->screen_buffer));
>
> Why simply use screen_buffer instead of info->screen_buffer here ?
>
> info->fix.smem_start = page_to_phys(vmalloc_to_page(screen_buffer));
>
> I'm asking because I see you use vfree(screen_buffer) below the
> err_vfree label in this function.
>
>
> I also want to ask another question here:
> I heard,//the/ /memory/ /allocated/ /by//*//vzalloc//*//is/ /not/
> /physically/ /contiguous/. /Why such a virtual address can convert to
> physical address by//page_to_phys(vmalloc_to_page())?//Does it legal for
> a GPU without MMU accessing such a physical address leaked to user-space?
>
>> - info->fbdefio = &fb_helper->fbdefio;
>> - ret = fb_deferred_io_init(info);
>> - if (ret)
>> - return ret;
>> - } else {
>> - /* buffer is mapped for HW framebuffer */
>> - ret = drm_client_buffer_vmap(fb_helper->buffer, &map);
>> - if (ret)
>> - return ret;
>> - if (map.is_iomem) {
>> - info->screen_base = map.vaddr_iomem;
>> - } else {
>> - info->screen_buffer = map.vaddr;
>> - info->flags |= FBINFO_VIRTFB;
>> - }
>> -
>> - /*
>> - * Shamelessly leak the physical address to user-space. As
>> - * page_to_phys() is undefined for I/O memory, warn in this
>> - * case.
>> - */
>> -#if IS_ENABLED(CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM)
>> - if (fb_helper->hint_leak_smem_start && info->fix.smem_start
>> == 0 &&
>> - !drm_WARN_ON_ONCE(dev, map.is_iomem))
>> - info->fix.smem_start =
>> - page_to_phys(virt_to_page(info->screen_buffer));
>> -#endif
>> - }
>> + /* Set a default deferred I/O handler */
>> + fb_helper->fbdefio.delay = HZ / 20;
>> + fb_helper->fbdefio.deferred_io = drm_fb_helper_deferred_io;
>> +
>> + info->fbdefio = &fb_helper->fbdefio;
>> + ret = fb_deferred_io_init(info);
>> + if (ret)
>> + return ret;
>> return 0;
>> }
>> @@ -318,18 +202,13 @@ static int drm_fbdev_fb_dirty(struct
>> drm_fb_helper *helper, struct drm_clip_rect
>> struct drm_device *dev = helper->dev;
>> int ret;
>> - if (!drm_fbdev_use_shadow_fb(helper))
>> - return 0;
>> -
>> /* Call damage handlers only if necessary */
>> if (!(clip->x1 < clip->x2 && clip->y1 < clip->y2))
>> return 0;
>> - if (helper->buffer) {
>> - ret = drm_fbdev_damage_blit(helper, clip);
>> - if (drm_WARN_ONCE(dev, ret, "Damage blitter failed:
>> ret=%d\n", ret))
>> - return ret;
>> - }
>> + ret = drm_fbdev_damage_blit(helper, clip);
>> + if (drm_WARN_ONCE(dev, ret, "Damage blitter failed: ret=%d\n", ret))
>> + return ret;
>> if (helper->fb->funcs->dirty) {
>> ret = helper->fb->funcs->dirty(helper->fb, NULL, 0, 0, clip,
>> 1);
>> @@ -415,12 +294,9 @@ static const struct drm_client_funcs
>> drm_fbdev_client_funcs = {
>> * suspend/resume need to call drm_fb_helper_set_suspend_unlocked()
>> themselves.
>> * Simple drivers might use drm_mode_config_helper_suspend().
>> *
>> - * Drivers that set the dirty callback on their framebuffer will get
>> a shadow
>> - * fbdev buffer that is blitted onto the real buffer. This is done in
>> order to
>> - * make deferred I/O work with all kinds of buffers. A shadow buffer
>> can be
>> - * requested explicitly by setting struct
>> drm_mode_config.prefer_shadow or
>> - * struct drm_mode_config.prefer_shadow_fbdev to true beforehand.
>> This is
>> - * required to use generic fbdev emulation with SHMEM helpers.
>> + * In order to provide fixed mmap-able memory ranges,
>
> I don't understand here, what do you mean about *fixed*?
>
> fixed relative to what? Can you say more?
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 840 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20230321/45ec036d/attachment.sig>
More information about the dri-devel
mailing list