[PATCH v2 3/4] drm/tinydrm: Add tinydrm_xrgb8888_to_gray8() helper

Wed Jul 5 05:57:49 UTC 2017

On Tue, Jul 04, 2017 at 11:15:56PM +0200, Noralf Trønnes wrote:
> 
> Den 04.07.2017 17.28, skrev Daniel Vetter:
> > On Tue, Jul 04, 2017 at 05:00:46PM +0200, Noralf Trønnes wrote:
> > > Den 28.06.2017 19.12, skrev Daniel Vetter:
> > > > On Wed, Jun 28, 2017 at 04:26:23PM +0200, Noralf Trønnes wrote:
> > > > > Den 26.06.2017 11.43, skrev Daniel Vetter:
> > > > > > On Thu, Jun 08, 2017 at 05:14:34PM +0200, Noralf Trønnes wrote:
> > > > > > > Drm has no monochrome or greyscale support so add a conversion
> > > > > > > from the common format XR24.
> > > > > > > 
> > > > > > > Also reorder includes into the common order.
> > > > > > > 
> > > > > > > Signed-off-by: Noralf Trønnes <noralf at tronnes.org>
> > > > > > > ---
> > > > > > >     drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c | 74 +++++++++++++++++++++++++-
> > > > > > >     include/drm/tinydrm/tinydrm-helpers.h          |  1 +
> > > > > > >     2 files changed, 73 insertions(+), 2 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c b/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
> > > > > > > index d4cda33..75808bb 100644
> > > > > > > --- a/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
> > > > > > > +++ b/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
> > > > > > > @@ -7,13 +7,15 @@
> > > > > > >      * (at your option) any later version.
> > > > > > >      */
> > > > > > > -#include <drm/tinydrm/tinydrm.h>
> > > > > > > -#include <drm/tinydrm/tinydrm-helpers.h>
> > > > > > >     #include <linux/backlight.h>
> > > > > > > +#include <linux/dma-buf.h>
> > > > > > >     #include <linux/pm.h>
> > > > > > >     #include <linux/spi/spi.h>
> > > > > > >     #include <linux/swab.h>
> > > > > > > +#include <drm/tinydrm/tinydrm.h>
> > > > > > > +#include <drm/tinydrm/tinydrm-helpers.h>
> > > > > > > +
> > > > > > >     static unsigned int spi_max;
> > > > > > >     module_param(spi_max, uint, 0400);
> > > > > > >     MODULE_PARM_DESC(spi_max, "Set a lower SPI max transfer size");
> > > > > > > @@ -181,6 +183,74 @@ void tinydrm_xrgb8888_to_rgb565(u16 *dst, void *vaddr,
> > > > > > >     EXPORT_SYMBOL(tinydrm_xrgb8888_to_rgb565);
> > > > > > >     /**
> > > > > > > + * tinydrm_xrgb8888_to_gray8 - Convert XRGB8888 to grayscale
> > > > > > > + * @dst: 8-bit grayscale destination buffer
> > > > > > > + * @fb: DRM framebuffer
> > > > > > > + *
> > > > > > > + * Drm doesn't have native monochrome or grayscale support.
> > > > > > > + * Such drivers can announce the commonly supported XR24 format to userspace
> > > > > > > + * and use this function to convert to the native format.
> > > > > > > + *
> > > > > > > + * Monochrome drivers will use the most significant bit,
> > > > > > > + * where 1 means foreground color and 0 background color.
> > > > > > > + *
> > > > > > > + * ITU BT.601 is used for the RGB -> luma (brightness) conversion.
> > > > > > > + *
> > > > > > > + * Returns:
> > > > > > > + * Zero on success, negative error code on failure.
> > > > > > > + */
> > > > > > > +int tinydrm_xrgb8888_to_gray8(u8 *dst, struct drm_framebuffer *fb)
> > > > > > I think it's ok to use this for backwards compat reasons with userspace
> > > > > > that simply expects xrgb8888, but adding monochrome fourcc codes to drm
> > > > > > isn't hard, and I think would be better to do that here.
> > > > > > 
> > > > > > There's also my comment from earlier that too much of tinydrm is in
> > > > > > tinydrm, and maybe we should move some of it out (we already have
> > > > > > tinydrm_xrgb8888_to_rgb565). But looks ok otherwise.
> > > > > I'm not sure that tinydrm_xrgb8888_to_rgb565 and tinydrm_xrgb8888_to_gray8
> > > > > does fit in the core as-is, because they work around the fact that
> > > > > cma memory has uncached reads by buffering one pixel line at a time.
> > > > > 
> > > > > I have been aware that the cma library isn't really a prefect match for
> > > > > tinydrm because of this (but it was easy to use). For tiny displays this
> > > > > isn't really a performance problem to speak of, but I see now that it
> > > > > limits which gpus that I can import buffers from, since cma requires
> > > > > continous memory. This is probably the reason the Grain Media driver
> > > > > couldn't use tinydrm or that it needed that shmem thing.
> > > > Hm, might be good to port tinydrm to plain gem shmem objects? We have a
> > > > bunch of gem shmem based simple drivers know, so if helpers aren't as
> > > > comprehensive as with cma that can be fixed (and I'm trying other driver
> > > > submitters to volunteer for that already).
> > > > 
> > > > > I'll see if I can use gem/prime code from vgem coupled with prime import
> > > > > code from udl. The spi core claims to do dma transfers on vmalloc
> > > > > addreses, so it seems possible.
> > > > Still feeling guilty for volunteering you for everything, but would be
> > > > awesome if you can make it happen. I think there's other drivers in-flight
> > > > (or even merged) which could benefit from parts of this at least.
> > > I'm volunteering myself this time :-)
> > > Much better to write a library that smart people will improve upon,
> > > than to put something in tinydrm that nobody looks at.
> > Awesome.
> > 
> > > A couple of questions:
> > > 
> > > - Can the pages be vmap'ed at object creation time, or should this be
> > >    an explicit function call? I see some drivers being explicit, but
> > >    tinydrm will always need that virtual address.
> > On 32bit kernels (still exists) vmap space can be severly constrained. I
> > think we should only do that if necessary. It's also not the cheapest
> > thing to do.
> > 
> > > - Several drivers have cache flags on the gem object used when mmap'ing,
> > > like udl here:
> > > 
> > > static void update_vm_cache_attr(struct udl_gem_object *obj,
> > >                   struct vm_area_struct *vma)
> > > {
> > >      /* non-cacheable as default. */
> > >      if (obj->flags & UDL_BO_CACHEABLE) {
> > >          vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
> > >      } else if (obj->flags & UDL_BO_WC) {
> > >          vma->vm_page_prot =
> > >              pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
> > >      } else {
> > >          vma->vm_page_prot =
> > >              pgprot_noncached(vm_get_page_prot(vma->vm_flags));
> > >      }
> > > }
> > > 
> > > Should I do this in the gem shmem library?
> > > drm_driver->gem_create_object can be used by the driver to set the flag.
> > Cache flags is a can of worms, since it's all very ill-defined. Especially
> > as soon as you get into buffer sharing. The primary reason for that are:
> > - whether something is coherent or uncached for dma depends upon the
> >    platform/device and is deeply hidden in the platform/architecture code.
> >    But for gpus you need to manage this manually, for performance reasons.
> >    drm drivers get to do a lot of ad-hoc hand-rolling as a result.
> > 
> > - Some drm drivers (specifically i915) break the platforms assumption
> >    about cache coherency for dma by making it tuneable on a per-op or at
> >    least per-buffer basis.
> > 
> > Best option might be to have support for cached/wc/uncached and let
> > drivers change that as you describe. Not sure what the default should be
> > (as said, it's a mess and pretty much not possible to be fully generic).
> > 
> > > - And I assume it's better to create a new library instead of
> > >    extending the CMA library? Even though most of the code will be the same.
> > >    Like this: drm_gem_shmem_helper.{c,h} and drm_fb_shmem_helper.{c,h}
> > Yeah, I'd go with a separate library. If you can exactly reuse code and it
> > makes sense to do, then we can always put the shared code into a
> > drm_gem.c (that kinda contains both helper and core code) or
> > drm_fbdev_helper.c (for the fbdev setup stuff). But dma_alloc_coherent vs
> > shmem backed gem buffers are rather different.
> 
> Another question:
> 
> I have looked at how drm drivers use vmap() and found consistency in
> the flags and prot arguments except for i915 and udl. Should I follow
> the majority and always use VM_MAP and pgprot_writecombine(PAGE_KERNEL) ?

I think the vmap pgprots should match what we do for userspace mmap, but
we definitely need to review all the drivers.

About VM_MAP or not, I had no idea what's that for. A quick grep indicates
it's just for pretty-printing, I think you can unconditionally set that
flag for all cases.

> etnaviv_gem_vmap_impl()
>     return vmap(pages, obj->base.size >> PAGE_SHIFT,
>             VM_MAP, pgprot_writecombine(PAGE_KERNEL));
> 
> exynos_drm_fbdev_update()
>     exynos_gem->kvaddr = (void __iomem *) vmap(exynos_gem->pages, nr_pages,
>                 VM_MAP, pgprot_writecombine(PAGE_KERNEL));
> 
> i915_gem_object_map()
>     switch (type) {
>     case I915_MAP_WB:
>         pgprot = PAGE_KERNEL;
>         break;
>     case I915_MAP_WC:
>         pgprot = pgprot_writecombine(PAGE_KERNEL_IO);
>         break;
>     }
>     addr = vmap(pages, n_pages, 0, pgprot);

I'd have expected that i915 won't be a good candidate for shmem helpers?
But happy to be surprised ... maybe we eventually could even have a simple
shrinker logic in those helpers for drivers with real gpus (that's all on
your list except rockchip and udl).
-Daniel

> 
> msm_gem_get_vaddr_locked()
>         msm_obj->vaddr = vmap(pages, obj->size >> PAGE_SHIFT,
>                 VM_MAP, pgprot_writecombine(PAGE_KERNEL));
> 
> omap_gem_vaddr()
>         omap_obj->vaddr = vmap(pages, obj->size >> PAGE_SHIFT,
>                 VM_MAP, pgprot_writecombine(PAGE_KERNEL));
> 
> rockchip_gem_alloc_iommu()
>         rk_obj->kvaddr = vmap(rk_obj->pages, rk_obj->num_pages, VM_MAP,
>                       pgprot_writecombine(PAGE_KERNEL));
> 
> rockchip_gem_prime_vmap()
>     if (rk_obj->pages)
>         return vmap(rk_obj->pages, rk_obj->num_pages, VM_MAP,
>                 pgprot_writecombine(PAGE_KERNEL));
> 
> tegra_bo_mmap()
>         return vmap(obj->pages, obj->num_pages, VM_MAP,
>                 pgprot_writecombine(PAGE_KERNEL));
> 
> udl_gem_vmap()
>     obj->vmapping = vmap(obj->pages, page_count, 0, PAGE_KERNEL);
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch