How to design a DRM KMS driver exposing 2D compositing?

Mon Aug 11 09:09:11 PDT 2014

On Mon, Aug 11, 2014 at 05:35:31PM +0200, Daniel Vetter wrote:
> On Mon, Aug 11, 2014 at 03:47:22PM +0300, Pekka Paalanen wrote:
> > > > What if I cannot even pick a maximum number of planes, but wanted to
> > > > (as the hardware allows) let the 2D compositing scale up basically
> > > > unlimited while becoming just slower and slower?
> > > > 
> > > > I think at that point one would be looking at a rendering API really,
> > > > rather than a KMS API, so it's probably out of scope. Where is the line
> > > > between KMS 2D compositing with planes vs. 2D composite rendering?
> > > 
> > > I think kms should still be real-time compositing - if you have to
> > > internally render to a buffer and then scan that one out due to lack of
> > > memory bandwidth or so that very much sounds like a rendering api. Ofc
> > > stuff like writeback buffers blur that a bit. But hw writeback is still
> > > real-time.
> > 
> > Agreed, that's a good and clear definition, even if it might make my
> > life harder.
> > 
> > I'm still not completely sure, that using an intermediate buffer means
> > sacrificing real-time (i.e. being able to hit the next vblank the user
> > space is aiming for) performance, maybe the 2D engine output rate
> > fluctuates so that the scanout block would have problems but a buffer
> > can still be completed in time. Anyway, details.
> > 
> > Would using an intermediate buffer be ok if we can still maintain
> > real-time? That is, say, if a compositor kicks the atomic update e.g.
> > 7 ms before vblank, we would still hit it even with the intermediate
> > buffer? If that is actually possible, I don't know yet.
> 
> I guess you could hide this in the kernel if you want. After all the
> entire point of kms is to shovel the memory management into the kernel
> driver's responsibility. But I agree with Rob that if there are
> intermediate buffers, it would be fairly neat to let userspace know about
> them.
> 
> So I don't think the intermediate buffer thing would be a no-go for kms,
> but I suspect that will only happen when the videocore can't hit the next
> frame reliably. And that kind of stutter is imo not good for a kms driver.
> I guess you could forgo vblank timestamp support and just go with
> super-variable scanout times, but I guess that will make the video
> playback people unhappy - they already bitch about the sub 1% inaccuracy
> we have in our hdmi clocks.
> 
> > > > Should I really be designing a driver-specific compositing API instead,
> > > > similar to what the Mesa OpenGL implementations use? Then have user
> > > > space maybe use the user space driver part via OpenWFC perhaps?
> > > > And when I mention OpenWFC, you probably notice, that I am not aware of
> > > > any standard user space API I could be implementing here. ;-)
> > > 
> > > Personally I'd expose a bunch of planes with kms (enough so that you can
> > > reap the usual benefits planes bring wrt video-playback and stuff like
> > > that). So perhaps something in line with what current hw does in hw and
> > > then double it a bit or twice - 16 planes or so. Your driver would reject
> > > any requests that need intermediate buffers to store render results. I.e.
> > > everything that can't be scanned out directly in real-time at about 60fps.
> > > The fun with kms planes is also that right now we have 0 standards for
> > > z-ordering and blending. So would need to define that first.
> > 
> > I do not yet know where that real-time limit is, but I'm guessing it
> > could be pretty low. If it is, we might start hitting software
> > compositing (like Pixman) very often, which is too slow to be usable.
> 
> Well for other drivers/stacks we'd fall back to GL compositing. pixman
> would obviously be terrible. Curious question: Can you provoke the
> hw/firmware to render into arbitrary buffers or does it only work together
> with real display outputs?
> 
> So I guess the real question is: What kind of interface does videocore
> provide? Note that kms framebuffers are super-flexible and you're free to
> add your own ioctl for special framebuffers which are rendered live by the
> vc. So that might be a possible way to expose this if you can't tell the
> vc which buffers to render into explicitly.

We should maybe think about exposing this display engine writeback
stuff in some decent way. Maybe a property on the crtc (or plane when
doing per-plane writeback) where you attach a target framebuffer for
the write. And some virtual connectors/encoders to satisfy the kms API
requirements.
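
To make the property idea a bit more concrete, here is a rough sketch in
plain C of what attaching a writeback target to a crtc could look like.
Everything in it is hypothetical: the sketch_* names, the property ID and
the size check are made up for illustration and are not existing KMS
interfaces. It only models the state bookkeeping, not the hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property ID; real KMS property IDs are assigned at runtime. */
enum sketch_prop { SKETCH_PROP_WRITEBACK_FB = 1 };

/* Minimal stand-in for a framebuffer object (assumption, not struct drm_framebuffer). */
struct sketch_fb {
	uint32_t id;
	uint32_t width, height;
};

/* Minimal stand-in for crtc state: the active mode size plus the writeback target. */
struct sketch_crtc_state {
	uint32_t mode_w, mode_h;
	const struct sketch_fb *writeback_fb;	/* NULL means writeback is disabled */
};

/*
 * Attach (or detach, with fb == NULL) a writeback target framebuffer.
 * Returns 0 on success and -1 if the property is unknown or the target
 * is too small to hold the composited output. On failure the state is
 * left untouched.
 */
int sketch_crtc_set_property(struct sketch_crtc_state *state,
			     enum sketch_prop prop,
			     const struct sketch_fb *fb)
{
	if (prop != SKETCH_PROP_WRITEBACK_FB)
		return -1;

	if (fb && (fb->width < state->mode_w || fb->height < state->mode_h))
		return -1;

	state->writeback_fb = fb;
	return 0;
}
```

Userspace would set the property to a framebuffer to request a writeback
and clear it again to go back to display-only operation. Whether the
property belongs on the crtc or on each plane is still the open question.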

With DSI command mode I suppose it would be possible to even mix display
and writeback uses of the same hardware pipeline so that the writeback
doesn't disturb the display. But I'm not sure there would be any nice way
to expose that in kms. Maybe just expose two crtcs, one for writeback
and one for display and multiplex in the driver.
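
The multiplexing could be sketched roughly like this, again in plain C
with made-up sketch_* names. The assumption (mine, not something any
hardware doc says) is that the driver picks the next job whenever the
shared pipeline goes idle and always lets a pending display update go
first, so writeback only uses otherwise idle time.

```c
#include <stdbool.h>

/* The two virtual crtcs sharing one hardware pipeline (hypothetical model). */
enum sketch_user {
	SKETCH_USER_NONE,
	SKETCH_USER_DISPLAY,
	SKETCH_USER_WRITEBACK,
};

/* One pending-job flag per virtual crtc. */
struct sketch_pipe {
	bool display_pending;
	bool writeback_pending;
};

/* Record that a virtual crtc wants to run a frame through the pipeline. */
void sketch_pipe_queue(struct sketch_pipe *pipe, enum sketch_user user)
{
	if (user == SKETCH_USER_DISPLAY)
		pipe->display_pending = true;
	else if (user == SKETCH_USER_WRITEBACK)
		pipe->writeback_pending = true;
}

/*
 * Called when the pipeline becomes idle. Returns which user gets the
 * pipeline next and clears its pending flag. Display always wins over
 * writeback so that panel updates are not delayed.
 */
enum sketch_user sketch_pipe_next(struct sketch_pipe *pipe)
{
	if (pipe->display_pending) {
		pipe->display_pending = false;
		return SKETCH_USER_DISPLAY;
	}
	if (pipe->writeback_pending) {
		pipe->writeback_pending = false;
		return SKETCH_USER_WRITEBACK;
	}
	return SKETCH_USER_NONE;
}
```

A real driver would also need to bound how long a writeback job can hold
the pipeline, otherwise a display update queued just after it starts
could still be late.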

-- 
Ville Syrjälä
Intel OTC