External DMA to GPU

Thu Oct 4 08:15:52 UTC 2018

Hi Pekka,

> > > I suppose that means you still do a copy from the gbm_bo/dmabuf into a
> > > window surface? If you used zwp_linux_dmabuf manually from your Wayland
> > > client, you could avoid even that copy. It has the same caveat as below
> > > though.
> >
> > I don't think so. The grabber does direct DMA to VRAM, so creating the
> > texture should be zero copy. Or am I missing something?
>
> Below you say you use glEGLImageTargetTexture2DOES(). That gets you a
> GL texture. To actually get that GL texture on screen, you have to do a
> GL drawing command to copy the pixels into an EGLSurface created from a
> wl_surface. That's the copy I'm referring to and which would be
> avoidable if you don't have to e.g. convert the color format in the app.
>
> Or are you using some other tricks?
>
> Once the pixels are on a wl_surface, the compositor will do one more
> copy to get those into a framebuffer, unless the requirements for
> scanning out directly from the client buffer are met. But I would guess
> it is more important to optimize the grabber-to-VRAM path than the
> wl_surface-to-scanout path which is likely just VRAM-to-VRAM so pretty
> good already.

If I have to use a shader for colorspace conversion I cannot use this
approach, right?

> > > > My only problem left is that glEGLImageTargetTexture2DOES() only
> > > > accepts ARGB8888 and not RGB888, which means I have to waste a lot of
> > > > PCIe bandwidth. Any ideas how to get around this? Or what would be a
> > > > more appropriate place to post this question?
> > >
> > > Yeah, I suppose support for true 24-bit-storage formats is rare
> > > nowadays.
> > >
> > > The format list advertised via zwp_linux_dmabuf, visible via e.g.
> > > weston-info, can tell you what you could use directly. After all, a
> > > Wayland compositor does the same EGLImage import as you do in the
> > > simple case.
> > >
> > > You could probably use the GPU to convert from 24-bit to 32-bit format
> > > though, by importing the image as R8 format instead of RGB888 and
> > > pretend the width is 3x. Then you could use a fragment shader to sample
> > > the real R, G and B separately and write out a 32-bit format image for
> > > display.
> >
> > Is there any example code for a gl noob? I already did some research but
> > didn't find anything useful.
>
> Nothing much comes to mind. Weston uses similar tricks to convert YUV
> data to RGB by lying to EGL and GL that the incoming buffer is R8 or
> RG88 and using a fragment shader to compute the proper RGB values. It
> is really just about lying to EGL when you import the dmabuf: instead of
> the actual pixel format, you use R8 and adjust the width/height/stride
> to match so that you can sample each byte correctly. Then in the
> fragment shader, you compute the correct texture coordinates to read
> each of R, G and B values for an output pixel and then combine those
> into an output color.
>
> Reading YUV is more tricky than reading 24-bit RGB, because YUV is
> usually arranged in multiple planes, some of which are sub-sampled,
> e.g. half resolution.
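
As a rough illustration of the "lying to EGL" step described above, the
sketch below works out the values one might hand to the dmabuf import for
a single-plane, packed RGB888 buffer. The struct and function names are
hypothetical and not taken from Weston or any library; only the fourcc
packing mirrors fourcc_code() and DRM_FORMAT_R8 from drm_fourcc.h. The
stride in bytes stays as it is, only the format and the width change.

```c
#include <assert.h>
#include <stdint.h>

/* Same byte packing as the fourcc_code() macro in drm_fourcc.h. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Same value as DRM_FORMAT_R8: one 8-bit channel per texel. */
#define FMT_R8 FOURCC('R', '8', ' ', ' ')

/* Hypothetical container for the values passed to the dmabuf import. */
struct import_params {
    uint32_t fourcc;  /* pixel format announced to EGL */
    uint32_t width;   /* width in texels as seen by GL */
    uint32_t height;  /* height in rows */
    uint32_t stride;  /* bytes per row, unchanged by the trick */
};

/* Describe a packed 24-bit RGB buffer as an R8 image that is 3x as wide. */
struct import_params rgb888_as_r8(uint32_t width, uint32_t height,
                                  uint32_t stride)
{
    assert(width > 0 && height > 0);
    assert(stride >= width * 3);  /* each row must hold 3 bytes per pixel */

    struct import_params p = { FMT_R8, width * 3, height, stride };
    return p;
}
```

With EGL_EXT_image_dma_buf_import these values would typically end up in
the EGL_LINUX_DRM_FOURCC_EXT, EGL_WIDTH, EGL_HEIGHT and
EGL_DMA_BUF_PLANE0_PITCH_EXT attributes. Whether the driver accepts a
texture that wide depends on its maximum texture size.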

Thanks, that was very helpful, as always. This is what we came up
with, and it works nicely:
    float x_int = floor(3840.0 * vTexCoord.x) * 3.0;
    float r = texture2D(uTexture, vec2((x_int + 0.0) / (3840.0 * 3.0), vTexCoord.y)).r;
    float g = texture2D(uTexture, vec2((x_int + 1.0) / (3840.0 * 3.0), vTexCoord.y)).r;
    float b = texture2D(uTexture, vec2((x_int + 2.0) / (3840.0 * 3.0), vTexCoord.y)).r;
    gl_FragColor = vec4(r, g, b, 1.0);  // add alpha component
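
For checking the addressing, here is a minimal CPU-side sketch in C of
the same byte arithmetic: output pixel x reads bytes 3x, 3x+1 and 3x+2 of
the source row and gets an opaque alpha appended. The function name is
hypothetical, and the sketch assumes R, G, B byte order in memory, as the
shader does. It is only a reference for comparing results, not a
replacement for the GPU path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Expand one row of packed 24-bit RGB into 32-bit RGBA.
 * src holds width * 3 bytes, dst must have room for width * 4 bytes.
 */
void unpack_rgb888_row(const uint8_t *src, uint8_t *dst, size_t width)
{
    assert(src != NULL && dst != NULL);

    for (size_t x = 0; x < width; x++) {
        size_t x_int = x * 3;           /* first byte of pixel x, like x_int in the shader */

        dst[4 * x + 0] = src[x_int + 0]; /* R */
        dst[4 * x + 1] = src[x_int + 1]; /* G */
        dst[4 * x + 2] = src[x_int + 2]; /* B */
        dst[4 * x + 3] = 0xFF;           /* alpha = 1.0 */
    }
}
```

Running a captured row through this and comparing it with a readback of
the rendered output is one way to confirm that no column is shifted by a
byte.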

We have to pass the horizontal resolution to the shader; I suppose
there is no way around this, right?

I was afraid that the unaligned access in the shader would have some
performance penalty. But in fact performance is better than the 32-bit
version. Thumbs up!
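
The transfer volume may be part of the explanation. A small sketch of the
per-row arithmetic for the 3840-pixel width used above (the helper names
are hypothetical, and this says nothing about what the GPU does
internally):

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for one row of pixels. */
size_t row_bytes(size_t width_px, size_t bytes_per_px)
{
    assert(width_px > 0 && bytes_per_px > 0);
    return width_px * bytes_per_px;
}

/* Percentage of transfer volume saved by the packed format (integer math). */
size_t percent_saved(size_t packed_bytes, size_t padded_bytes)
{
    assert(padded_bytes > 0 && packed_bytes <= padded_bytes);
    return 100 * (padded_bytes - packed_bytes) / padded_bytes;
}
```

For a 3840-pixel row this gives 11520 bytes for RGB888 against 15360
bytes for ARGB8888, so a quarter less data has to cross PCIe per frame.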

Cheers
Dirk