Best practices for client side buffer management

Mon Jun 22 12:08:39 UTC 2020

On Mon, 22 Jun 2020 11:46:41 +0300 Pekka Paalanen <ppaalanen at gmail.com> said:

> On Fri, 19 Jun 2020 11:21:34 +0100
> Carsten Haitzler (The Rasterman) <raster at rasterman.com> wrote:
> 
> > On Fri, 19 Jun 2020 13:24:12 +1000 Brad Robinson
> > <brobinson at toptensoftware.com> said:
> > 
> > > Hi All,
> > > 
> > > I'm fairly new to Wayland and Linux GUI programming in general, but doing
> > > some experiments to figure out how to integrate it into my custom UI
> > > toolkit library and have a couple of questions about client side buffer
> > > management.
> > > 
> > > Firstly, this is how I'm allocating the backing memory for client side
> > > buffer pools.  This is C# p-invoking to libc, and basically it's using
> > > mkstemp() to get a temp file, ftruncate() to set its length, mmap() to map
> > > it and then unlink() once mapped so temp files aren't left behind.  Any
> > > issues with this approach?
> > > 
> > >             // Get temp file
> > >             var sb = new StringBuilder(
> > >                 System.IO.Path.Join(System.IO.Path.GetTempPath(), "mmXXXXXX"));
> > >             int fd = mkstemp(sb);
> > >             ftruncate(fd, (ulong)capacity);  
> > 
> > i assume GetTempPath() will be looking at /tmp ... and /tmp may not be a
> > ramdisk. it may be a real disk... in which case your buffers may be getting
> > written to an actual disk. don't use /tmp.
> 
> Hi,
> 
> that's true. The trick we have been using is to create the file in
> $XDG_RUNTIME_DIR which is practically always a tmpfs, but OTOH it might
> not be the best place to store large pixel buffers.

correct. also just opening files in /dev/shm ... but shm_open acts as a
portable front-end to that.
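
a minimal sketch of the shm_open route in C, for illustration only. the helper
names create_shm_fd and map_shm and the "/wl-buf-..." name pattern are made up
here, not taken from any toolkit. the object is unlinked right after creation,
so nothing is left behind, the same idea as the mkstemp()/unlink() pair in the
original question. on glibc older than 2.34 shm_open lives in librt, so link
with -lrt there. memfd_create is a linux-specific alternative that needs no
name at all.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an unnamed shared-memory file of `size` bytes.
 * Returns a file descriptor, or -1 on failure. */
int create_shm_fd(off_t size)
{
    static unsigned counter;
    char name[64];

    assert(size > 0);
    for (int tries = 0; tries < 100; tries++) {
        /* Made-up naming scheme: pid plus a counter, retried on collision. */
        snprintf(name, sizeof name, "/wl-buf-%ld-%u", (long)getpid(), counter++);
        int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd >= 0) {
            shm_unlink(name);            /* the fd keeps the memory alive */
            if (ftruncate(fd, size) < 0) {
                close(fd);
                return -1;
            }
            return fd;
        }
        if (errno != EEXIST)
            return -1;
    }
    return -1;
}

/* Hypothetical helper: map the whole file read/write and shared.
 * Returns NULL on failure. */
void *map_shm(int fd, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

the resulting fd is what would be handed to wl_shm.create_pool, and the mapping
is where the client draws.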

> > you might want to look at shm_open or memfd
> 
> Right.
> 
> > or libdrm for specific driver
> > allocation calls like drm_intel_bo_alloc_tiled, drm_intel_bo_map/unmap etc.
> > the
> 
> No, please do not do that!
> 
> That will make your program specific to a certain hardware driver,
> which makes it specific to that hardware. Plus, you will have to learn how
> to program specifically for that driver. It's not worth it.

we do... but we also have other paths to alloc memory so this is a "if on intel
and that works, then use this". also have one for vc4... :) it certainly should
not be the only method :)

> The wl_shm Wayland interface is for software rendering anyway. No-one
> expects a wl_shm-based buffer to be directly usable by a display or a
> GPU driver, it will always be copied by a compositor. So trying to
> shove "hardware buffers" through wl_shm is only picking the worst of
> all options, as direct CPU access to them is often sub-optimal, perhaps
> even prohibitively slow.

oh this won't be for wl_shm ... :) sorry. i didn't mention that.

> > latter libdrm ones would allow your buffers to possibly be scanned out
> > directly to the screen or used as textures directly without copies, but will
> > need careful handling, so do this only as an advanced step.
> 
> If you want to use hardware acceleration or allow direct scanout, use
> the proper APIs intended for them: GBM, EGL, Vulkan; if you need to
> pass hardware buffers manually through Wayland, use zwp_linux_dmabuf_v1
> extension instead of wl_shm.

yeah. we use linux-dmabuf protocol with these intel and vc4 buffers, not
wl_shm. :)

> > 
> > >             // Map it
> > >             var addr = mmap(IntPtr.Zero, (ulong)capacity,
> > >                             Prot.Read | Prot.Write, Map.Shared, fd, 0);
> > > 
> > >             // Unlink it (so temp files not left behind)
> > >             unlink(sb.ToString());
> > > 
> > > Secondly I'm wondering about practical strategies for managing client side
> > > buffers.  The toolkit in question basically needs arbitrarily sized
> > > buffers to render whatever size the window happens to be.  Seems like to
> > > use a buffer pool for this would require some sort of heap manager to
> > > manage what's in each pool.  I'm wondering if there's any recommendations
> > > or best practices for how to deal with this.  eg: create one big pool and
> > > explicitly manage what's in there as a heap, use lots of little pools with
> > > one buffer in each, a combination of both, something else?  
> > 
> > resizes of windows are less common (in general) than rendering to them. here
> > i'd go for a scheme of N buffers in a ring per window. so you have buffers
> > A, B, C and you first render to A then display it, then next frame B, then
> > C, then A, then B, then C. You could get away without C. as the buffers
> > retain their state you can take advantage of this and only partially render
> > part of a buffer for updates "since 1 or 2 frames ago" (depending if you do
> > double or triple buffering). as its predictable ABCABCABC you can just keep
> > a "Sliding window of the update regions of the past N frames" and merge
> > those into the "current amount to update" but always store per-frame update
> > rectangle regions before this merge-at-render-time. 3 buffers allows you to
> > be rendering a 3rd buffer while 1 buffer is queued to be displayed and one
> > is still being displayed. if you find you need a 4th buffer, perhaps just
> > overdraw the 3rd one you just did and "update it" ... or just block or don't
> > update yet as you are getting too far ahead of the compositor.
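
to make the "sliding window of update regions" idea concrete, here is a rough
sketch in C for a fixed ring of 3 buffers. all names (struct damage_history,
damage_for_next_frame, ...) are invented for this example. as a simplification
it tracks one bounding rectangle per frame, where a real toolkit would keep a
list of rectangles. the per-frame damage is stored unmerged and only merged at
render time.

```c
#include <assert.h>

#define RING 3  /* triple buffering: buffers A, B, C used in strict order */

/* Half-open rectangle; empty when x2 <= x1 or y2 <= y1. */
struct rect { int x1, y1, x2, y2; };

struct damage_history {
    struct rect frame[RING];  /* unmerged damage of the last RING frames */
    unsigned long count;      /* number of frames recorded so far */
};

static int rect_empty(struct rect r)
{
    return r.x2 <= r.x1 || r.y2 <= r.y1;
}

/* Bounding-box union (a simplification of a real region union). */
struct rect rect_union(struct rect a, struct rect b)
{
    if (rect_empty(a)) return b;
    if (rect_empty(b)) return a;
    struct rect r = {
        a.x1 < b.x1 ? a.x1 : b.x1, a.y1 < b.y1 ? a.y1 : b.y1,
        a.x2 > b.x2 ? a.x2 : b.x2, a.y2 > b.y2 ? a.y2 : b.y2,
    };
    return r;
}

/* Record this frame's damage and return the area that must be redrawn in
 * the buffer about to be reused. That buffer last held the frame drawn
 * RING frames ago, so it needs this frame's damage plus the damage of the
 * RING-1 frames in between. `full` is the whole buffer area. */
struct rect damage_for_next_frame(struct damage_history *h,
                                  struct rect new_damage, struct rect full)
{
    int first_use = h->count < RING;  /* buffer has never been drawn */

    /* Overwrite the oldest slot, which held the damage from RING frames ago. */
    h->frame[h->count % RING] = new_damage;
    h->count++;
    if (first_use)
        return full;

    struct rect merged = { 0, 0, 0, 0 };
    for (int i = 0; i < RING; i++)
        merged = rect_union(merged, h->frame[i]);
    return merged;
}
```

for double buffering the same code would work with RING set to 2.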
> 
> Such a strict sequence of buffers is wasteful and not necessary. You
> don't need that to keep track of damage for doing partial updates. The
> slight complication in tracking the damage is very much worth it over
> the use of more buffers than you really need.
> 
> Instead, allocate buffers on-demand, and re-use buffers as soon as the
> compositor you are connected to releases them. If you have more than one
> buffer idle, you can destroy the extra ones, perhaps after some
> timeout. This will automatically let your application use the
> minimum required number of buffers at all times.
> 
> In general:
> 
> If you have a) something new to show, and b) wl_surface.frame callback
> has returned (unless you want to override an earlier update), and c)
> you have a buffer available or you have not exceeded the maximum number
> of buffers you are willing to allocate (4 is a good number), then you
> can repaint.
> 
> When you repaint, you first need an available buffer. If you don't have
> any ready, allocate a new one. Repaint, submit the update via Wayland,
> and mark the buffer as busy in your own bookkeeping.
> 
> When the compositor replies with wl_buffer.release, mark the buffer as
> available again. You may have to submit one or more further updates
> with other buffers before this happens. In some cases it might happen
> before you need to repaint again, in which case you'll only need one
> buffer in total.
> 
> When you submit a buffer to a Wayland compositor, it gives the
> compositor permission to read the buffer at any time, as many times as
> it wants to, until it tells you with wl_buffer.release that it will not
> be looking at that buffer again. You must not write to a buffer that
> might be read by the compositor, as that can cause misrendering on
> screen (e.g. your repaint is shown unfinished).
> 
> You also should not destroy a wl_buffer that is being kept busy by the
> compositor. If you are closing a window, destroy the wl_surface first,
> and wait for the wl_buffer.release events before destroying the wl_buffers.
> 
> Also, every buffer you submit must be fully drawn, also outside of the
> areas you mark as damage. The compositor may be reading more than just
> the damage you submit.
> 
> 
> Thanks,
> pq
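
for reference, the on-demand scheme described above could be sketched roughly
like this in C. it is only the bookkeeping, with no real Wayland calls: the
struct and function names are invented for the example, and the comments mark
where the wl_surface and wl_buffer requests and events would go. the cap of 4
buffers is the number suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFFERS 4  /* maximum number of buffers we are willing to allocate */

struct buffer {
    bool allocated;
    bool busy;  /* submitted to the compositor and not yet released */
    /* a real client would also keep the wl_buffer and its pixels here */
};

struct window {
    struct buffer buffers[MAX_BUFFERS];
    bool dirty;          /* a) there is something new to show */
    bool frame_pending;  /* b) still waiting for the wl_surface.frame callback */
};

/* c) Reuse an idle buffer if there is one, else allocate up to the cap. */
struct buffer *acquire_buffer(struct window *w)
{
    for (size_t i = 0; i < MAX_BUFFERS; i++)
        if (w->buffers[i].allocated && !w->buffers[i].busy)
            return &w->buffers[i];
    for (size_t i = 0; i < MAX_BUFFERS; i++)
        if (!w->buffers[i].allocated) {
            w->buffers[i].allocated = true;  /* real code creates a wl_buffer */
            return &w->buffers[i];
        }
    return NULL;
}

/* Returns the buffer that was submitted, or NULL if no repaint happened. */
struct buffer *try_repaint(struct window *w)
{
    if (!w->dirty || w->frame_pending)
        return NULL;
    struct buffer *b = acquire_buffer(w);
    if (!b)
        return NULL;  /* too far ahead of the compositor, try again later */
    /* real code: draw the whole buffer, then attach, damage, request a
     * frame callback and commit the wl_surface */
    b->busy = true;
    w->dirty = false;
    w->frame_pending = true;
    return b;
}

/* Call from the wl_buffer.release event handler. */
void on_buffer_release(struct buffer *b)
{
    assert(b->busy);
    b->busy = false;
}

/* Call when the wl_surface.frame callback fires. */
void on_frame_done(struct window *w)
{
    w->frame_pending = false;
}

/* Destroy all idle buffers except one, e.g. from a timeout.
 * Returns how many were destroyed. */
size_t trim_idle(struct window *w)
{
    size_t idle = 0, destroyed = 0;
    for (size_t i = 0; i < MAX_BUFFERS; i++) {
        struct buffer *b = &w->buffers[i];
        if (!b->allocated || b->busy)
            continue;
        if (++idle > 1) {
            b->allocated = false;  /* real code destroys the wl_buffer */
            destroyed++;
        }
    }
    return destroyed;
}
```

if the compositor releases a buffer before the next repaint, acquire_buffer
keeps handing back the same one and the window never grows past a single
buffer.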

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - raster at rasterman.com