[waffle] [PATCH 0/7] nacl backend implementation

Wed Feb 4 23:47:02 PST 2015

On 02/03/2015 10:07 PM, Chad Versace wrote:
> On 01/30/2015 02:33 AM, Tapani Pälli wrote:
>> Hi;
>>
>> As a Finn I interpret silence as 'yes' :)
>
> A safe assumption in many open source cultures :)
>
>> So I started to hack on swap completion event support. I believe it
>> can be useful for other backends also, since the application can then
>> throttle rendering and receive information on swaps.
>>
>> I made an early hacky prototype that fixes the issues I've seen with
>> the nacl backend:
>> http://cgit.freedesktop.org/~tpalli/waffle/log/?h=hacky
>>
>> I have a few questions on implementation details for the
>> 'waffle_swap_complete'. This is my proposal for the API:
>>
>> typedef void (*waffle_swapbuffers_cb) (struct waffle_window *);
>>
>> bool waffle_set_swap_complete_handler(waffle_swapbuffers_cb func);
>> (would fail if backend does not support this)
>>
>> Optionally we could put timestamp there also as callback data but I'm
>> not sure what format is preferred for that. Any ideas here would be
>> good. I'm not sure whether the OML triple (ust, msc, sbc) as in
>> GLX_INTEL_swap_event makes sense or whether it should be something simpler.
>
> We shouldn't return timestamp information. Some Waffle platforms do not
> provide that information. And for some that do, like GLX_INTEL_swap_event,
> the OML triple is often a lie. For example, NVidia hardware doesn't monitor
> ust, msc, sbc because it requires an interrupt to detect. I think the same is true
> for recent Intel GPUs too, but I'm not sure.
>
> For Waffle to return an accurate OML triple, the compositor must provide that
> information to Waffle. But, this opens up a can of worms. What exactly indicates
> swap buffer completion? When the compositor client's GL command stream has completed execution?
> When the compositor's GL command stream for compositing the client buffer has completed
> execution? When the compositor has scheduled a post to display that will present
> the client's back buffer? When the display hardware has completed the requested
> presentation? Argh!!!!!!! How do you choose?!?!

Right, it seems quite a problematic api as a whole.

> The sad state of the world is, if you choose one of the above to mean
> "completion of swap buffers", Waffle will be able to query the chosen
> completion timestamps on only a proper subset, possibly empty, of Waffle's
> supported platforms. No choice is supported on all display systems.

I agree, and a simple callback is enough for nacl, which seems to be the
only user for now.
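
A rough sketch of how the proposed callback API could look from the
application side. This is only a mock under stated assumptions:
waffle_set_swap_complete_handler() is the proposal from this thread and not
an existing Waffle entry point, and struct waffle_window,
mock_window_swap_buffers() and the capability flag are stand-ins made up
for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for Waffle's opaque window type; the field exists for this demo only. */
struct waffle_window {
    int swaps_completed;
};

/* Callback type as written in the proposal. */
typedef void (*waffle_swapbuffers_cb)(struct waffle_window *);

static waffle_swapbuffers_cb swap_complete_handler = NULL;

/* Hypothetical capability flag; a real backend would decide this itself. */
static bool backend_supports_swap_events = true;

/* Proposed setter: fails if the backend cannot deliver swap completion. */
bool waffle_set_swap_complete_handler(waffle_swapbuffers_cb func)
{
    if (!backend_supports_swap_events)
        return false;
    swap_complete_handler = func;
    return true;
}

/* Mock swap. A real backend would invoke the handler once the swap has
 * completed, possibly from another thread; here it is called inline. */
bool mock_window_swap_buffers(struct waffle_window *window)
{
    assert(window != NULL);
    if (swap_complete_handler)
        swap_complete_handler(window);
    return true;
}

/* Example handler: count completed swaps so the app can throttle rendering. */
static void on_swap_complete(struct waffle_window *window)
{
    window->swaps_completed++;
}

/* Demo driver: returns the number of completions seen after `frames` swaps,
 * or -1 if the handler could not be installed. */
int demo_count_swaps(int frames)
{
    struct waffle_window window = { 0 };

    if (!waffle_set_swap_complete_handler(on_swap_complete))
        return -1;
    for (int i = 0; i < frames; i++)
        mock_window_swap_buffers(&window);
    return window.swaps_completed;
}
```

The handler carries no timestamp, in line with the conclusion above that a
plain completion signal is all that can be offered on every platform.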

>> Then, whenever waffle_window_swap_buffers(window) gets called, the backend
>> will call the callback if it is set. It should be noted that it can
>> happen from a separate thread.
>>
>> How does this sound?
>
> It seems that two solutions are available:
>
>    S1. Provide waffle_set_swap_buffers_callback().
>
>    S2. On Waffle's NaCl backend, implement waffle_window_swap_buffers() to
>        block until the back buffer has been composited onto the web page.
>        (I believe this is possible to implement without a worker thread).

I guess it is possible, but not without introducing extra
latency/slowness for the application thread, which I was trying to avoid.
Ideally the app could continue doing something else while the swap goes on.
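
To make the latency concern concrete, here is a minimal single-threaded
sketch of the general shape of S2: start an asynchronous swap, then wait
until its completion callback has fired. Everything in it is hypothetical
(mock_display, mock_swap_async(), mock_pump_events() and the three-tick
delay are invented for the illustration), and it makes no claim about how
PPAPI would actually be driven.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*swap_done_cb)(void *user_data);

/* Mock display system with at most one pending swap. */
struct mock_display {
    int ticks_until_done;  /* ticks left before the pending swap completes */
    swap_done_cb on_done;  /* NULL when no swap is pending */
    void *user_data;
};

/* Start an asynchronous swap; returns immediately. */
static void mock_swap_async(struct mock_display *dpy, swap_done_cb cb,
                            void *user_data)
{
    dpy->ticks_until_done = 3;  /* arbitrary delay for the demo */
    dpy->on_done = cb;
    dpy->user_data = user_data;
}

/* Process one tick; fire the callback when the pending swap is done. */
static void mock_pump_events(struct mock_display *dpy)
{
    if (dpy->on_done == NULL)
        return;
    if (--dpy->ticks_until_done == 0) {
        swap_done_cb cb = dpy->on_done;
        dpy->on_done = NULL;
        cb(dpy->user_data);
    }
}

/* Completion callback: flips the flag the blocking caller waits on. */
static void mark_done(void *user_data)
{
    *(bool *)user_data = true;
}

/* Blocking swap in the style of S2. The caller can do nothing else until
 * completion. Returns the number of ticks spent waiting. */
int blocking_swap_buffers(struct mock_display *dpy)
{
    bool done = false;
    int ticks_waited = 0;

    assert(dpy != NULL);
    mock_swap_async(dpy, mark_done, &done);
    while (!done) {
        mock_pump_events(dpy);
        ticks_waited++;
    }
    return ticks_waited;
}

/* Demo driver: performs one blocking swap on a fresh mock display. */
int demo_blocking_swap(void)
{
    struct mock_display dpy = { 0 };
    return blocking_swap_buffers(&dpy);
}
```

The ticks counted inside blocking_swap_buffers() are exactly the time the
application thread loses; with a callback it could spend them on other work.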

> Let's analyze both solutions, keeping in mind two obstacles:
>
>    O1. In NaCl, it is illegal to render to the default framebuffer until
>         PPB_Graphics3D::SwapBuffers invokes its callback.
>
>    O2. If a NaCl app blocks on the main thread, the UI becomes unresponsive.
>
> Waffle's API must protect against O1. Corrupt rendering is unacceptable.
>
> Case !O2
> =========
> Suppose that we don't care about O2. Let the UI hang. Then the simplest
> solution is S2. Let waffle_window_swap_buffers() block. Then we don't have
> to worry about the semantics of a waffle_window_swap_buffers_callback on
> the different platforms.

I will spend some time prototyping this blocking case to have a stronger 
opinion.

> Case O2
> =======
> Suppose now that we do care about O2. Perhaps we care because we want to
> write an interactive GL testcase. Then Waffle's API must provide enough
> machinery to allow the application to avoid blocking on the main thread,
> and that machinery should be as simple as possible.
>
> Subcase O2 + S2
> ---------------
> If we choose solution 2, then the testcase/app will need to delegate
> waffle_window_swap_buffers() to a child "presentation" thread. The app will need to
> implement its own cross-thread synchronization between the main thread and
> the presentation thread.
>
> For a small testcase or demo app, requiring the user to implement this
> synchronization feels like overkill. Real NaCl games must use
> separate rendering and presentation threads anyway if they do
> non-trivial rendering. However, I believe real games are not Waffle's
> concern. Waffle is for little apps and testcases.
>
> However, S2 keeps the Waffle API simple and is easy to implement.

I agree, simple sounds good. IMO Waffle has a good chance to be a porting
layer for a bigger application too; the overall porting effort feels quite
small.

> Subcase O2 + S1
> ---------------
> If we choose solution 1, the benefit is that the testcase/app can remain
> single-threaded. And S1 is slightly easier to implement than S2.
>
> Regarding thread-safety issues with the callback, on NaCl I believe there
> are no safety issues. The NaCl docs guarantee that "the callback
> function will be executed on the calling thread".
>
> On all platforms, Waffle should document that invocation of the callback
> indicates that it is safe to resume rendering to the default framebuffer.
> On non-NaCl platforms, that means Waffle can invoke the callback
> immediately after calling glX/wgl/eglSwapBuffers.
>
> Regarding thread-safety issues on non-NaCl platforms, I believe we have to
> make compromises to prevent them.
>      C1. If Waffle invokes the callback from the same thread that called
>          waffle_window_swap_buffers(), then the app must
>          ping-pong between threads to avoid overflowing the stack.
>      C2. If Waffle invokes the callback from a separate thread, then Waffle will
>          need to spawn a new thread on each call to waffle_window_swap_buffers().

I think these options sound quite awful. I will try the blocking option 
and see if it can be dealt with painlessly.
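
For reference, a small sketch of why C1 is a problem. If the swap call
invokes the callback synchronously on the calling thread, and the callback
naively renders and swaps again, the calls nest one level deeper per frame.
The names here (frame_loop, mock_swap_buffers_c1(), render_and_swap()) are
hypothetical and only model the call pattern; no real Waffle function is
involved.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*swap_cb)(void *user_data);

struct frame_loop {
    int frames_left;  /* frames the app still wants to draw */
    int depth;        /* current nesting of swap calls */
    int max_depth;    /* deepest nesting observed */
    swap_cb cb;       /* swap completion callback */
};

/* Mock swap in the C1 style: the callback runs on the calling thread,
 * before the swap call itself has returned. */
static void mock_swap_buffers_c1(struct frame_loop *loop)
{
    loop->depth++;
    if (loop->depth > loop->max_depth)
        loop->max_depth = loop->depth;
    if (loop->cb)
        loop->cb(loop);
    loop->depth--;
}

/* Naive app callback: draw the next frame and swap again right away. */
static void render_and_swap(void *user_data)
{
    struct frame_loop *loop = user_data;

    if (loop->frames_left-- > 0)
        mock_swap_buffers_c1(loop);
}

/* Demo driver: returns the deepest call nesting reached for `frames`
 * follow-up frames after the initial swap. */
int demo_c1_max_depth(int frames)
{
    struct frame_loop loop = { .frames_left = frames, .cb = render_and_swap };

    assert(frames >= 0);
    mock_swap_buffers_c1(&loop);
    return loop.max_depth;
}
```

The nesting grows linearly with the number of frames, so an endless render
loop written this way would eventually exhaust the stack unless the app
hands the next frame off to another thread.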

> ====
>
> Tapani, I don't know yet what the best choice is. Please review my arguments
> and see if I missed something.
>
> If we choose S1, because of the problems inherent in C1 it seems to me that non-NaCl apps
> that use the waffle_window_swap_buffers callback will have to manage multiple
> threads. If we choose S2, then NaCl apps will have to manage multiple threads.
>
> In my view, the only real benefit of S1 is that it possibly eliminates the need
> for applications to manage multiple threads. But, since that's not really the case
> for non-NaCl apps, my opinion is leaning toward the simpler solution, S2, blocking
> waffle_window_swap_buffers. But I'm not certain yet.
>
> What do you think?

Right, as above I will try the blocking solution to have a stronger
opinion. I'm sure it can be done, but at first thought it does not sound
as elegant as the swap thread (IMO).

// Tapani