Add support for multi-threaded playback

Tue Oct 30 05:24:36 PDT 2012

On Fri, Oct 26, 2012 at 7:53 PM, Imre Deak <imre.deak at intel.com> wrote:
> Hi,
>
> On Fri, 2012-10-26 at 18:49 +0100, José Fonseca wrote:
>> Imre,
>>
>> Thanks for this. I took your mt-trace patches and did some follow on changes:
>>
>> - port it to macosx and windows (implement and use C++11-like
>> threading primitives)
>>
>> - make threads synchronous i.e., respect the ordering of the calls in
>> the trace, otherwise there would be random results and race
>> conditions, as the locking is not captured in the trace.
>
> I thought the workqueue approach already guaranteed this:
>
> if (prev_thread_id != call->thread_id) {
>      if (thread_wq)
>           thread_wq->flush();
>      thread_wq = get_work_queue(call->thread_id);
>      prev_thread_id = call->thread_id;
> }
> thread_wq->queue_work(render_work);
>
> So if the new call's thread ID is different than the previous one we
> wait until all calls in the previous thread finish. Could you explain
> what extra synchronization we need here?

I was getting random heap corruption until I added
https://github.com/apitrace/apitrace/commit/55c2ece87567ec831b9cf5c2a4a9d30035c093bb

That is, there were some race conditions due to the parsing and the
retracing happening concurrently.

This could be fixed with more careful locking on all the parser
internal structures, but I saw no point in pursuing that road. Instead
I chose to pass the responsibility for parsing the trace to the thread
that executed the last call, which achieves the same thing more
efficiently (no thread switching per call, no mutex locking per call).

In short, there is now only one active thread at any given instant,
so the parser and the retracer can no longer race with each other.
For single-threaded traces this degrades gracefully to exactly what we
were doing before (i.e., performance for single-threaded traces is
exactly the same).
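
A minimal sketch of the hand-off described above, not the actual
apitrace code. Call, replay and the trace vector are hypothetical
stand-ins: the "parser" is just an index into a vector, and
"retracing" a call just records it. It assumes thread ids are small
integers starting at 0.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed trace call.
struct Call {
    unsigned thread_id;
    int no;
};

// Replays `trace` with exactly one active thread at a time. The active
// thread both parses (advances `next`) and retraces (appends to the
// result); the mutex is only taken when the thread id changes.
std::vector<std::pair<unsigned, int>> replay(const std::vector<Call> &trace)
{
    unsigned num_threads = 0;
    for (const Call &c : trace)
        num_threads = std::max(num_threads, c.thread_id + 1);

    std::mutex mutex;
    std::condition_variable cv;
    std::size_t next = 0;  // parser position, touched only by the active thread
    unsigned active = trace.empty() ? 0 : trace[0].thread_id;
    bool done = trace.empty();
    std::vector<std::pair<unsigned, int>> executed;

    auto runner = [&](unsigned id) {
        std::unique_lock<std::mutex> lock(mutex);
        for (;;) {
            // Sleep until this thread is handed the baton or the trace ends.
            cv.wait(lock, [&] { return done || active == id; });
            if (done)
                return;
            lock.unlock();

            // No lock is held here: parse and execute every consecutive
            // call that belongs to this thread.
            assert(next < trace.size() && trace[next].thread_id == id);
            while (next < trace.size() && trace[next].thread_id == id) {
                executed.emplace_back(id, trace[next].no);
                ++next;
            }

            // Thread id changed (or end of trace): hand over under the lock.
            lock.lock();
            if (next == trace.size())
                done = true;
            else
                active = trace[next].thread_id;
            cv.notify_all();
        }
    };

    std::vector<std::thread> threads;
    for (unsigned id = 0; id < num_threads; ++id)
        threads.emplace_back(runner, id);
    for (std::thread &t : threads)
        t.join();
    return executed;
}
```

Because the mutex hand-off orders each thread's work before the next
thread starts, the recorded calls come out in trace order. Unlike the
behaviour described above, this sketch spawns a worker even for a
single-threaded trace rather than staying on the main thread.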

>> - make it faster -- the parsing is done in the thread that is
>> executing, so there is less thread switching.
>
> I'd have to think more how this improves things. Afaics on multi-core at
> least there shouldn't be much task switching, except for the above
> synchronization points.

Whereas before it was necessary to lock a mutex on every call, now a
mutex is only locked whenever thread_id changes.

That was particularly noticeable for applications that use immediate
vertex data (glVertex and friends), which generate a lot of calls.
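
As a rough illustration of that difference, assuming one mutex
acquisition per hand-off, the number of lock operations goes from the
number of calls down to the number of thread_id changes. The helper
below is hypothetical and only counts those changes for a sequence of
thread ids:

```cpp
#include <cstddef>
#include <vector>

// Counts how often the thread id changes between consecutive calls,
// i.e. how many hand-offs a replay of this sequence would need.
std::size_t count_handoffs(const std::vector<unsigned> &thread_ids)
{
    std::size_t handoffs = 0;
    for (std::size_t i = 1; i < thread_ids.size(); ++i) {
        if (thread_ids[i] != thread_ids[i - 1])
            ++handoffs;
    }
    return handoffs;
}
```

For a single-threaded trace the count is zero no matter how many
calls it contains.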

Furthermore, there is no change whatsoever when a trace is single
threaded (parsing and retracing happen on the main thread just as
before), therefore multithreaded retracing is on all the time --
there is no longer an option to enable it.

Jose