[Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions

Marek Olšák maraeo at gmail.com
Wed Feb 5 16:09:06 PST 2014


The synchronization for non-coherent persistent mappings can also be done using:

glMemoryBarrier(GL_CLIENT_MAPPED_BUFFER_BARRIER_BIT);

In which case you don't know the range either. However I fully support
the addition of coherent persistent mappings to GL. It's perfect for
uploading data without the GL API overhead.
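
To make the point about ranges concrete, here is a minimal sketch in plain C
of the bookkeeping a tracing tool might keep for one mapped buffer. It makes
no GL calls. The names tracked_mapping, on_flush_range and
on_client_mapped_barrier are hypothetical, made up for this illustration, and
are not part of Mesa, apitrace, or the GL API. The assumption is only that a
tool sees the (offset, length) arguments of glFlushMappedBufferRange, and sees
no range at all for glMemoryBarrier(GL_CLIENT_MAPPED_BUFFER_BARRIER_BIT).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-mapping state kept by a tracing tool (not a real API). */
typedef struct {
    size_t size;        /* total size of the mapping in bytes */
    size_t dirty_begin; /* first byte that must be captured */
    size_t dirty_end;   /* one past the last byte that must be captured */
    bool   dirty;       /* true if anything must be captured */
} tracked_mapping;

static void tracked_mapping_init(tracked_mapping *m, size_t size)
{
    m->size = size;
    m->dirty_begin = 0;
    m->dirty_end = 0;
    m->dirty = false;
}

/* The app called glFlushMappedBufferRange(target, offset, length):
 * the tool is told exactly which bytes were written. */
static void on_flush_range(tracked_mapping *m, size_t offset, size_t length)
{
    assert(offset <= m->size && length <= m->size - offset);
    if (length == 0)
        return;
    if (!m->dirty) {
        m->dirty_begin = offset;
        m->dirty_end = offset + length;
        m->dirty = true;
    } else {
        /* Keep one conservative span covering every flushed range. */
        if (offset < m->dirty_begin)
            m->dirty_begin = offset;
        if (offset + length > m->dirty_end)
            m->dirty_end = offset + length;
    }
}

/* The app called glMemoryBarrier(GL_CLIENT_MAPPED_BUFFER_BARRIER_BIT):
 * no range is given, so the whole mapping has to be treated as changed. */
static void on_client_mapped_barrier(tracked_mapping *m)
{
    m->dirty_begin = 0;
    m->dirty_end = m->size;
    m->dirty = m->size > 0;
}
```

With explicit flushes the span to capture stays as small as the app says it
is; after a barrier it grows to the whole buffer, for every persistently
mapped buffer the tool knows about.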

Marek

On Thu, Feb 6, 2014 at 12:49 AM, Jose Fonseca <jfonseca at vmware.com> wrote:
> I hadn't looked at GL_ARB_buffer_storage. I need to read more closely, but at a glance it looks like GL_MAP_PERSISTENT_BIT alone is okay (the app must call FlushMappedBufferRange to guarantee coherence), but if GL_MAP_COHERENT_BIT is set we are indeed facing the same issue... :-(
>
> Even worse, since it is part of GL 4.4 and there is no way for the implementation to fail GL_MAP_COHERENT_BIT mappings, there is no way to avoid supporting it...
>
> Jose
>
> Note to self: my time would be better spent on reviewing extensions before they are ratified, than ranting after the fact...
>
>
> ----- Original Message -----
>> However, GL_ARB_buffer_storage (OpenGL 4.4) with GL_MAP_PERSISTENT_BIT
>> isn't much different. The only difference I see between
>> ARB_buffer_storage and AMD_pinned_memory is that AMD_pinned_memory
>> allows mapping CPU memory to the GPU address space permanently, while
>> ARB_buffer_storage allows mapping GPU memory to the CPU address space
>> permanently. At the end of the day, both the GPU and the CPU can read
>> and modify the same buffer and all they need to use for
>> synchronization is fences.
>>
>> Marek
>>
>> On Wed, Feb 5, 2014 at 8:10 PM, Jose Fonseca <jfonseca at vmware.com> wrote:
>> >
>> >
>> > ----- Original Message -----
>> >>
>> >>
>> >> ----- Original Message -----
>> >> > On 05.02.2014 18:08, Jose Fonseca wrote:
>> >> > > I honestly hope that GL_AMD_pinned_memory doesn't become popular. It
>> >> > > would have been alright if it wasn't for this bit in
>> >> > > http://www.opengl.org/registry/specs/AMD/pinned_memory.txt
>> >> > > which says:
>> >> > >
>> >> > >      2) Can the application still use the buffer using the CPU
>> >> > >      address?
>> >> > >
>> >> > >          RESOLVED: YES. However, this access would be completely
>> >> > >          non synchronized to the OpenGL pipeline, unless explicit
>> >> > >          synchronization is being used (for example, through glFinish
>> >> > >          or by using sync objects).
>> >> > >
>> >> > > And I'm imagining apps which are streaming vertex data doing precisely
>> >> > > just that...
>> >> > >
>> >> >
>> >> > I don't understand your concern, this is exactly the same behavior
>> >> > GL_MAP_UNSYNCHRONIZED_BIT has, and apps are supposedly using that
>> >> > properly. How does apitrace handle it?
>> >>
>> >> GL_AMD_pinned_memory is nothing like GL_ARB_map_buffer_range's
>> >> GL_MAP_UNSYNCHRONIZED_BIT:
>> >>
>> >> - When an app touches memory returned by
>> >> glMapBufferRange(GL_MAP_UNSYNCHRONIZED_BIT) it will communicate back to the
>> >> OpenGL driver which bytes it actually touched via
>> >> glFlushMappedBufferRange (unless the app doesn't care about performance
>> >> and doesn't call glFlushMappedBufferRange at all, which is silly as it will
>> >> force the OpenGL driver to assume the whole range changed).
>> >>
>> >>   In this case, the OpenGL driver (hence apitrace) should get all the
>> >>   information it needs about which bytes were updated between
>> >>   glMap/glUnmap.
>> >>
>> >> - When an app touches memory bound via GL_AMD_pinned_memory outside
>> >> glMap/glUnmap, there are _no_ hints whatsoever.  The OpenGL driver might
>> >> not care as the memory is shared between CPU and GPU, so all is good as
>> >> far as it is concerned, but all the changes the app makes are invisible at
>> >> the API level, hence apitrace will not be able to catch them unless it
>> >> resorts to onerous heuristics.
>> >>
>> >>
>> >> So while both extensions allow unsynchronized access, lack of
>> >> synchronization is not my concern. My concern is that GL_AMD_pinned_memory
>> >> allows *hidden* access to GPU memory.
>> >
>> > Just for the record, the challenges GL_AMD_pinned_memory presents to
>> > Apitrace are very similar to the old-fashioned OpenGL user array pointers:
>> > an app is free to change the contents of memory pointed to by user array
>> > pointers at any point in time, except during a draw call.  This means that
>> > before every draw call, Apitrace needs to scavenge all the user memory
>> > pointers and write their contents to the trace file, just in case the app
>> > changed them.
>> >
>> > In order to support GL_AMD_pinned_memory, for every draw call Apitrace
>> > would also need to walk over bound GL_AMD_pinned_memory (and nowadays
>> > there are loads of binding points!), check if the data changed, and
>> > serialize it in the trace file if it did...
>> >
>> >
>> > I never cared much about the performance of Apitrace with user array
>> > pointers: it is an old paradigm; only old apps use it, or programmers who
>> > don't particularly care about performance -- either way, a
>> > performance-conscious app developer would use VBOs and hence never hit the
>> > problem at all.  My displeasure with GL_AMD_pinned_memory is that it
>> > essentially flips everything on its head -- it encourages a paradigm which
>> > apitrace will never be able to handle properly.
>> >
>> >
>> > People often complain that OpenGL development tools are poor compared with
>> > Direct3D's.  An important fact they often miss is that the Direct3D API is
>> > several orders of magnitude more tool-friendly: it's clear that the Direct3D
>> > API cares about things like allowing all state to be queried back, whereas
>> > OpenGL is more fire and forget and never look back -- the main concern in
>> > OpenGL is ensuring that state can go from App to Driver fast, but little
>> > thought is often given to ensuring that one can read whole state back, or
>> > ensuring that one can intercept all state as it goes between the app and
>> > the driver...
>> >
>> >
>> > In this particular case, if the answer to "Can the application still use
>> > the buffer using the CPU address?" had been NO, the world would be a much
>> > better place.
>> >
>> >
>> > Jose
>> > _______________________________________________
>> > mesa-dev mailing list
>> > mesa-dev at lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
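
For reference, the per-draw-call scan Jose describes for bound pinned memory
could look roughly like the sketch below. It is plain C with no GL calls, and
it is only an illustration of the shadow-copy heuristic, not how apitrace is
implemented. The function name scan_pinned_memory and the idea of keeping one
shadow copy per pinned allocation are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Compare the app's live memory with the tool's shadow copy.
 * If they differ, store the smallest byte span [*begin, *end) that covers
 * every changed byte, refresh that span of the shadow copy, and return true.
 * If nothing changed, return false and leave *begin and *end untouched.
 */
static bool scan_pinned_memory(const unsigned char *live,
                               unsigned char *shadow,
                               size_t size,
                               size_t *begin,
                               size_t *end)
{
    assert(live && shadow && begin && end);

    /* Find the first byte that differs. */
    size_t first = 0;
    while (first < size && live[first] == shadow[first])
        first++;
    if (first == size)
        return false; /* nothing to serialize for this draw call */

    /* Find one past the last byte that differs. */
    size_t last = size;
    while (last > first + 1 && live[last - 1] == shadow[last - 1])
        last--;

    /* Remember what was captured so the next scan starts from here. */
    memcpy(shadow + first, live + first, last - first);

    *begin = first;
    *end = last;
    return true;
}
```

The cost is the problem: the scan reads every byte of every bound pinned
allocation before every draw call, whether or not the app touched anything,
and it needs a second copy of all that memory.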

