[Mesa-dev] [PATCH] radeon/llvm: Use LLVM C API for compiling LLVM IR to ISA.

Sun Apr 21 21:39:57 PDT 2013

On Sat, Apr 20, 2013 at 02:20:23PM +0200, Christian König wrote:
> Am 20.04.2013 09:27, schrieb Mathias Fröhlich:
> > Hi Tom,
> >
> > May be I need to tell where the problem really appears in real life.
> > OpenSceneGraph has some nifty features regarding multi channel rendering.
> > Assume a setup of multiple full screen views running on different graphics
> > boards into a single mashine composing a view into a single scene.
> > Now the recommended way to do this with osg is to set up a X screen per
> > graphics board. Even if this spans multiple monitors/projectors. Set up a GL
> > graphics context per graphics board and set up a viewport per projector in the
> > graphics contexts. Rendering happens now in parallel for each graphics
> > context. I do drive such a thing here with two radeons and three monitors for
> > testing and here the problem appears.
> >
> > When I start the favourite flight simulation application of my choice with this
> > setup, then it crashes almost immediately without llvm_start_multithreaded
> > being called. Wheres it works stable if we ensure llvm being multithreaded.
> >
> > So, I tried to distill a piglit testcase out of this somehow huger setup with
> > flightgear, OpenSceneGraph, multiple gpu's and what not.
> >
> > On Friday, April 19, 2013 20:08:54 Tom Stellard wrote:
> >> On Wed, Apr 17, 2013 at 07:54:32AM +0200, Mathias Fröhlich wrote:
> >>> Tom,
> >>>
> >>>> -class LLVMEnsureMultithreaded {
> >>>> -public:
> >>>> -   LLVMEnsureMultithreaded()
> >>>> -   {
> >>>> -      llvm_start_multithreaded();
> >>>> -   }
> >>>> -};
> >>>> -
> >>>> -static LLVMEnsureMultithreaded lLVMEnsureMultithreaded;
> >>> Removing this leads to crashes in llvm with applications that concurrently
> >>> work on different gl contexts.
> >> The test you wrote still passes with this patch.  Do you see that
> >> we are now calling the C API version of llvm_start_multithreaded(),
> >> LLVMStartMutithreaded() from inside radeon_llvm_compile() protected by a
> >> static variable?
> > Oh, no I did not see this. I did not realize that the llvm_start_multithreaded
> > call is not just plain C. So I thought grepping for the call I used is
> > sufficient.
> >
> > But negative. If I really apply your patch and try to run this with the above
> > setup I get the crashes. The same with the piglit test here.
> >
> > Too bad, that reproducing races is racy for itself.
> > With the piglit test I get about 2/3 of the runs either glibc memory
> > corruption aborts. Or one of the below asserts from llvm:
> >
> > bool llvm::llvm_start_multithreaded(): Assertion `!multithreaded_mode &&
> > "Already multithreaded!"' failed.
> >
> > void
> > llvm::PassRegistry::removeRegistrationListener(llvm::PassRegistrationListener*):
> > Assertion `I != Impl->Listeners.end() && "PassRegistrationListener not
> > registered!"' failed.
> >
> > bool llvm::sys::SmartMutex<mt_only>::release() [with bool mt_only = true]:
> > Assertion `((recursive && acquired) || (acquired == 1)) && "Lock not acquired
> > before release!"' failed.
> >
> > So the biggest problem IIRC was that use of llvm::sys::SmartMutex<mt_only>
> > which is spread around here and there in llvm. The pass registry was (is?) one
> > of the users for that. If you did not tell llvm to run multithreaded these
> > locks get noops and you concurrently access containers and that ...
> >
> > Looking at the first assert, the llvm guys have made this problem even worse
> > IMO since I looked at this before. We need to check for multithreading being
> > enabled before trying to set this. Both of which being racy for itself in this
> > way and all of them not being save against already happening llvm access from
> > an other thread and an other foreign use.
> >
> >> Sorry about that. I didn't have piglit commit access at the time, and
> >> I forgot about the patch.  I fixed a few things and sent v3 to the list.
> > The same here. Thanks for this.
> >
> >>> Regarding the point where this funciton is called I had choosen static
> >>> initialization time since llvm requires this function to be called single
> >>> threaded which we cannot guarantee in any case. Keep in mind that you need
> >>> to ensure this function called non concurrently even against applications
> >>> that itself already use the llvm libs in some way while the driver is
> >>> loaded. But the best bet is to do this in the dynamic loder which is
> >>> itself serialized, so I could avoid calling this function concurrently by
> >>> initialization of different contexts. That should at least shield against
> >>> applications that itself do the same trick by calling this funtion in the
> >>> dlopen phase in some static initializer ...
> >>> We may get around part of this problem with dlopening the driver with
> >>> better isolation but up to now the problem can get that far.
> >> This is a tricky problem, and I'm not sure that radeon_llvm_compile() is
> >> the best place to call llvm_start_multithreaded().  Maybe it would be
> >> better to move this into gallivm, because this problem could affect any
> >> driver that uses the gallivm code, which includes: llvmpipe, r300g, r600g,
> >> radeonsi, and i915g.  What do you think?
> > Yep, an other place would be better.
> >
> > I do not know the llvm tools well enough, but If I move the current c++ code
> > into src/gallium/auxiliary/gallivm/lp_bld_misc.cpp it works for me (TM).
> > Seriously, I know of one guy who wants to use llvmpipe with windows and he
> > would benefit from the c++ solution since the gcc extension solution is clearly
> > not available with the cl windows build then. And with the lack of a windows
> > test system I do also not start implementing DllMain.
> > That could be a short term solution ...
> >
> >
> > Appart from that, I mentioned better symbol isolation somehow.
> >
> > How would this improove the situation with the initialization?
> > If we would know that this current driver module is the only user of the this
> > particular llvm instance, then we could really place a mutex somewhere and
> > serialize the call to lvm::llvm_start_multithreaded() by ourselves as we then
> > know that we are the only user. This would then happen before the first use of
> > llvm by the driver module in question. No dlopen time static initializers that
> > usually lead to ordering problems earlier or later ...
> >
> > So, I have a patch set floating around here for about half a year that changes
> > the RTLD_GLOBAL arguments to dlopen to RTLD_LOCAL|RTLD_DEEPBIND which should
> > (? I still have to definitely investigate this) provide us the private copy of
> > the used libraries/modules. Also up to now I did not take the time to really
> > analyze if we rely on some globals being really shared between driver modules
> > or different types of drivers that we then also have multiple times in memory.
> > Note that this does *not* affect the shared glapi topics as they are all above
> > the loaded driver modules at the level of libGL.so.1.
> > So for example I can think of some need for a single libdrm_radeon to make cl-
> > gl buffer sharing work for example? Do we?
> >
> > Alternatively, you might be able to pack every use of llvm at least in radeon
> > statically into libllvmradeon and again hide all non driver api used stuff from
> > the outside. Then you would also have your private llvm instance that you can
> > initialize with your own lock to serialize. That is then more a radeon alone
> > solution as the other drivers need to do something similar themselves then.
> > And I realize that your recent code move work could point into this direction
> > at least for the radeon type drivers.
> >
> > Both symbol isolation methods will also make the driver used llvm version
> > independent of any possibly application used version that can lead to
> > incompatible symbol clashes. So you gain even more with that.
> >
> > Opinions?
> 
> Hi Tom & Mathias,
> 
> completely agree with Mathias here. I also suggested on IRC a couple of 
> weeks ago that libllvmradeon should definitely be static and hide all 
> internal symbols and only export those needed by the drivers. That 
> should isolate us mostly from the mess that comes with having multiple 
> LLVM instances (and probably also different versions) around at the same 
> time.
> 

How would you recommend doing this?  Version scripts?

> That would probably also remove the need to call 
> llvm_start_multithreaded from any library load routine, since then we 
> can control the order of initialization.
> 
> I would take a look into it myself, but I'm a bit busy with UVD right now.
> 

I should be able to spend some time on this over the next few days.

-Tom

> Christian.
> 
> > Mathias
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
>