Unix Device Memory Allocation project

Mon Jan 23 07:38:49 UTC 2017

On Mon, Jan 16, 2017 at 11:54:14PM +0100, Marek Olšák wrote:
> Thanks for all the feedback. Things are much clearer now.
> 
> Yeah, we can use the BO modifiers for simple 2D images / planes if
> that's the general direction. I think we can even stuff the
> compression data buffer offset into those 64 bits, considering it's
> not very large (e.g. below 4GB and low bits are unused due to
> alignment).
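
[Illustration] A minimal sketch of the packing idea above, assuming a
256-byte alignment and an invented bit layout. The field positions and
the names pack_aux_offset/unpack_aux_offset are hypothetical and do not
describe any real driver's modifier encoding.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only:
 *   bits  0..23 : tiling/layout bits
 *   bits 24..47 : compression data offset, in units of 256 bytes
 */
#define AUX_ALIGN_SHIFT 8u   /* assumed 256-byte alignment */
#define AUX_FIELD_SHIFT 24u
#define AUX_FIELD_BITS  24u  /* 2^24 units * 256 bytes = 4 GiB range */
#define AUX_FIELD_MASK \
    ((((uint64_t)1 << AUX_FIELD_BITS) - 1) << AUX_FIELD_SHIFT)

/* Store an aligned offset below 4 GiB in the modifier's offset field. */
static uint64_t pack_aux_offset(uint64_t modifier, uint64_t offset)
{
    /* The low bits must be zero, which is what frees them up. */
    assert((offset & ((1u << AUX_ALIGN_SHIFT) - 1)) == 0);
    assert(offset < ((uint64_t)1 << 32));

    modifier &= ~AUX_FIELD_MASK;
    return modifier | ((offset >> AUX_ALIGN_SHIFT) << AUX_FIELD_SHIFT);
}

/* Recover the byte offset from the modifier. */
static uint64_t unpack_aux_offset(uint64_t modifier)
{
    return ((modifier & AUX_FIELD_MASK) >> AUX_FIELD_SHIFT)
           << AUX_ALIGN_SHIFT;
}
```

Dropping the 8 alignment bits is what lets a 32-bit offset fit in a
24-bit field, leaving the rest of the 64 bits for other layout state.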

For compression data the idea is to have an aux plane, with offset/stride.
At least that's what the plan for i915 is.
-Daniel
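
[Illustration] A rough sketch of what "an aux plane, with offset/stride"
could look like when the aux data sits behind the main surface in the
same buffer. The struct names, the placement rule and the alignment
parameter are assumptions made for this example, not the actual i915
layout.

```c
#include <assert.h>
#include <stdint.h>

/* One plane is addressed by a byte offset and a row pitch (stride). */
struct plane_layout {
    uint32_t offset;
    uint32_t pitch;
};

struct image_layout {
    struct plane_layout main;
    struct plane_layout aux;   /* compression data plane */
    uint32_t size;             /* total bytes needed */
};

/* Round v up to a power-of-two alignment a. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/*
 * Hypothetical placement: main plane at offset 0, aux plane at the
 * next aux_align boundary after the main plane's last row.
 */
static struct image_layout layout_with_aux(uint32_t main_pitch,
                                           uint32_t height,
                                           uint32_t aux_pitch,
                                           uint32_t aux_rows,
                                           uint32_t aux_align)
{
    struct image_layout l;

    l.main.offset = 0;
    l.main.pitch = main_pitch;
    l.aux.offset = align_up(main_pitch * height, aux_align);
    l.aux.pitch = aux_pitch;
    l.size = l.aux.offset + aux_pitch * aux_rows;
    return l;
}
```

With the aux data described as a plane of its own, a consumer only
needs the per-plane offset and stride and does not have to decode an
offset out of the modifier bits.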

> 
> For OpenCL at least, we have to keep using the 256-bytes-large per-BO
> metadata to describe more complex allocations.
> 
> Marek
> 
> 
> On Wed, Jan 4, 2017 at 5:59 PM, Daniel Stone <daniel at fooishbar.org> wrote:
> > Hi Christian,
> >
> > On 4 January 2017 at 16:02, Christian König <deathsimple at vodafone.de> wrote:
> >> Am 04.01.2017 um 16:47 schrieb Rob Clark:
> >>> If the positions of the different parts of the buffer are required
> >>> somewhere to be a function of w/h/bpp/etc then I'm not sure if there is
> >>> a strong advantage to treating them as separate BOs.. although I
> >>> suppose it doesn't preclude it either.  As far as plumbing it through
> >>> mesa/st, it seems convenient to have a single buffer.  (We have kind
> >>> of a hack to deal w/ multi-planar yuv, but I'd rather not propagate
> >>> that.. but I've not thought through those details so much yet.)
> >>
> >> Well I don't want to ruin your day, but there are different requirements
> >> from different hardware.
> >>
> >> For example the UVD engine found in all AMD graphics cards since r600 must
> >> have both planes in a single BO because the memory controller can only
> >> handle a rather small offset between the planes.
> >
> > This is, to a large extent, also true of Intel.
> >
> >> On the other hand I know of embedded MPEG2/H264 decoders where the different
> >> planes must be on different memory channels. In this case I can imagine
> >> you want one BO for each plane, because otherwise the device must stitch
> >> together one buffer object from two different memory regions (of course
> >> possible, but rather ugly).
> >
> > Not just embedded, but quite a few platforms where the ratio of
> > required to available memory bandwidth is ... somewhat different to
> > larger discrete systems. Striping allocations such that luma and
> > chroma live on different memory channels isn't uncommon.
> >
> > But I think this is all orthogonal. If you keep auxiliary planes in
> > separate BOs to metadata, you can still handle both cases. How to
> > place buffers is purely an _allocation_ concern, where single vs.
> > multiple BO is purely about addressing them. So your allocator API may
> > become a little more complex - something which only device-specific
> > userspace will ever address - whilst keeping a unified
> > addressing/handle system for the generic parts of userspace which
> > shouldn't have to care about whether the underlying hardware demands a
> > small offset or a completely separate allocation.
> >
> > Having API pegged to the single-underlying-BO concept has been a giant
> > pain for those who can't use single BOs. I don't see anything good
> > coming of the idea for cross-device/cross-vendor sharing either, since
> > it encodes yet more magic implicit detail into buffer sharing. Since
> > that detail ultimately has to be resolved _somewhere_, it's a problem
> > avoided rather than a problem solved.
> >
> > Cheers,
> > Daniel
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
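
[Illustration] The allocation vs. addressing split quoted above can be
sketched with a per-plane (handle, offset, pitch) description. The
struct and function names are hypothetical; the point is only that a
single descriptor covers both the "small offset in one BO" case and the
"one BO per plane" case.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PLANES 4

/* Generic addressing: every plane names its own BO handle and offset. */
struct buffer_desc {
    uint32_t num_planes;
    uint32_t handle[MAX_PLANES];
    uint32_t offset[MAX_PLANES];
    uint32_t pitch[MAX_PLANES];
};

/* NV12 with luma and chroma in one BO (e.g. a UVD-style constraint). */
static struct buffer_desc nv12_single_bo(uint32_t bo, uint32_t pitch,
                                         uint32_t height)
{
    struct buffer_desc d = { .num_planes = 2 };

    d.handle[0] = bo;
    d.offset[0] = 0;
    d.pitch[0] = pitch;
    d.handle[1] = bo;               /* same BO, chroma follows luma */
    d.offset[1] = pitch * height;
    d.pitch[1] = pitch;
    return d;
}

/* NV12 with each plane in its own BO (e.g. separate memory channels). */
static struct buffer_desc nv12_two_bos(uint32_t luma_bo,
                                       uint32_t chroma_bo,
                                       uint32_t pitch)
{
    struct buffer_desc d = { .num_planes = 2 };

    d.handle[0] = luma_bo;
    d.pitch[0] = pitch;
    d.handle[1] = chroma_bo;        /* separate BO, offset stays 0 */
    d.pitch[1] = pitch;
    return d;
}

/* Generic code can walk the planes without knowing which case it has. */
static uint32_t count_distinct_bos(const struct buffer_desc *d)
{
    uint32_t count = 0;

    for (uint32_t i = 0; i < d->num_planes; i++) {
        uint32_t seen = 0;

        for (uint32_t j = 0; j < i; j++)
            seen |= (d->handle[j] == d->handle[i]);
        count += !seen;
    }
    return count;
}
```

Only the two constructor functions, which stand in for device-specific
allocator code, differ between the cases; anything consuming a
buffer_desc handles both the same way.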

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch