[Mesa-dev] [RFC] opencl: mega-cl

Tue Feb 27 18:27:29 UTC 2018

Rob Clark <robdclark at gmail.com> writes:

> On Mon, Feb 26, 2018 at 6:12 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>> Rob Clark <robdclark at gmail.com> writes:
>>
>>> On Sun, Feb 25, 2018 at 3:00 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>>>> Seems like a serious hack to me to work around broken linking...  IMO we
>>>> should just fix the linking issue.  The symbols for the various GLSL
>>>> types need to be linked with the proper binding and visibility -- I
>>>> assume that the cause of your problem is that people are making
>>>> assumptions about the equality of GLSL types based on their memory
>>>> addresses *and* marking the symbols as hidden *and* passing pointers to
>>>> GLSL types across shared objects?  That sounds like a recipe for
>>>> disaster.
>>>
>>> tbh, maybe hack, or I think more likely, maybe a good idea.. I'm not
>>> terribly sold on the idea of dynamically loading pipe driver and
>>> linking a lot of shared code into N different pipe_${driver}.so on
>>> disk, since the # of drivers seems to be greater than # of state
>>> trackers.. not to mention multiple copies of shared gallium code in
>>> memory due to being statically linked into both state tracker and
>>> driver.
>>>
>>
>> It would probably make sense to link that shared code dynamically into
>> the pipe-loader modules in order to avoid the duplication you're talking
>> about (e.g. by building them as a meson shared_module() instead of a
>> shared_library() in order to allow the common symbols to be undefined in
>> the pipe-loader module which will then be pulled in by the state
>> tracker) -- Actually that's likely to fix your problem too.
>
> yes, for pipe_loader_dynamic case, this is the only way I can see for
> it to work (although I think we still want to link statically in the
> pipe_loader_static case).

It's not the only way for it to work, the alternatives are annotating
the symbols with the correct visibility (as in the patch I sent you
earlier), or fixing the assumptions about glsl_type identity.

> Well that, or making the glsl_types symbols not -Bsymbolic.
>

I don't think CL uses -Bsymbolic.

> The potential problem here is, what if an app links in both (for ex.)
> CL + GL, but somehow from different mesa versions.  I suppose that
> might be a corner case, but it would fall over badly w/ nir/glsl_types
> in a .so.  But I think the way things are packaged in most distro's, a
> user could upgrade specifically the mesa-gl package but not the
> mesa-cl package.
>

The customary solution for this problem is to version the symbols
(e.g. in the various linker version scripts we have already in place,
annotating them with some sort of unique build identifier).  Or package
CL and GL together (which makes sense if it's possible for the GL and CL
implementations to link against the same dynamic library where the
compiler lives in order to reduce memory footprint).

> (Also, we don't really want to export glsl/nir symbols lest someone
> mistake them for a stable ABI.)

We export other symbols not part of any stable ABI, e.g. for CL/GL
interop extensions (which incidentally will fall over in the same
corner case you're describing above).

> I guess I am ok with the approach of (a) adding support for "mega-cl"
> using piple_loader_static plus (b) adding a libnir.so for the dynamic
> case, and letting users of pipe_loader_dynamic keep both pieces in the
> broken corner cases.
>
>>> That said, glsl_type's defn does seem to expect to only exist once in
>>> a process, ie. == is ptr comparison and when you link nir/glsl_types
>>> into both state tracker and driver, glsl_type ptrs are getting passed
>>> across that boundary.  I'm not really sure that is worth fixing (ie.
>>> why should it exist twice in a process in the first place?)
>>>
>>
>> *Shrug* It's kind of a micro-optimization with nasty side effects
>> (namely that you cannot pass GLSL IR or NIR across shared object
>> boundaries without things exploding) and dubious payoff, it would
>> probably be reasonable to simply drop the micro-optimization, but that
>> wasn't the solution I was suggesting (but rather, making the
>> optimization work by exporting the symbols GLSL type equality cares
>> about with the right linkage).
>
> well, I think it is long past too late to try to solve that.  If it
> were only in c++ code we could overload == but it is a mix of c++ and
> c, and tracking down all the places that do pointer comparisons seems
> like it would be tricky to avoid missing some and having subtle bugs
> as a result.
>

An approach to avoid subtle bugs would be to replace uses of "glsl_type
*" in nir_types.h with an opaque (e.g. struct) type, and then go through
the whole list of compiler errors and fix them by using appropriate
helpers instead of pointer comparisons.

>>> Maybe there are some linker tricks to solve this, idk.. that is a bit
>>> outside my area of expertise.
>>
>> More than adding new linker tricks I think you just need to make sure
>> that existing linker tricks we're doing right now ('-fvisibility=hidden'
>> and various linker scripts) don't break the build by forcing common
>> symbols to be local, along the lines of:
>>
>> | diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h
>> | index ab0b2637649..9d8a5ea341c 100644
>> | --- a/src/compiler/glsl_types.h
>> | +++ b/src/compiler/glsl_types.h
>> | @@ -243,7 +243,7 @@ public:
>> |     /*@{*/
>> |  #undef  DECL_TYPE
>> |  #define DECL_TYPE(NAME, ...) \
>> | -   static const glsl_type *const NAME##_type;
>> | +   static const glsl_type *const NAME##_type __attribute__((visibility("default")));
>> |  #undef  STRUCT_TYPE
>> |  #define STRUCT_TYPE(NAME) \
>> |     static const glsl_type *const struct_##NAME##_type;
>> | diff --git a/src/gallium/targets/opencl/opencl.sym b/src/gallium/targets/opencl/opencl.sym
>> | index 9fcc57692b8..fadc2eb6c1d 100644
>> | --- a/src/gallium/targets/opencl/opencl.sym
>> | +++ b/src/gallium/targets/opencl/opencl.sym
>> | @@ -2,6 +2,10 @@
>> |       global:
>> |               cl*;
>> |               opencl_dri_*;
>> | +
>> | +             extern "C++" {
>> | +                     glsl_type::*;
>> | +             };
>> |       local:
>> |               *;
>> |  };
>> | diff --git a/src/gallium/targets/pipe-loader/pipe.sym b/src/gallium/targets/pipe-loader/pipe.sym
>> | index 605cb83d802..801b0f44d56 100644
>> | --- a/src/gallium/targets/pipe-loader/pipe.sym
>> | +++ b/src/gallium/targets/pipe-loader/pipe.sym
>> | @@ -7,6 +7,10 @@
>> |               # due to LLVM being initialized multiple times.
>> |               radeon_drm_winsys_create;
>> |               amdgpu_winsys_create;
>> | +
>> | +             extern "C++" {
>> | +                     glsl_type::*;
>> | +             };
>> |       local:
>> |               *;
>> |  };
>>
>
> iiuc, this would have the same corner case of moving nir/glsl_types to
> a separate .so when mixing state tracker versions in a process.
>
>>> To me, the mega-driver approach worked well for gl, and many of the
>>> other state trackers are already using libpipe_loader_static.. why not
>>> just fix the problems with that approach and abandon
>>> libpipe_loader_dynamic?  This seems like it would work out better for
>>> memory footprint since pages for drivers not used simply wouldn't get
>>> paged in.  And you aren't ending up with two copies of libgallium or
>>> libnir code in memory.
>>>
>>
>> The dynamic pipe loader would give you better disk and memory footprint
>> than the static one assuming that the problem with linking shared code
>> statically into each pipe-loader module was fixed, because that would
>> allow the pipe driver code to be shared in memory and disk among
>> different state trackers, but I guess that the benefit is slight at this
>> point, I understand your reluctance to fix it ;).
>>
>
> I guess I haven't measured, but I wonder if this is even true.  The
> cost increase is duplicating pipe_driver code in each state tracker
> (but I guess there are only 3 or 4 that typically end up on users
> filesys, and of those only mesa/st really needs to link in *all* the
> drivers, clover and the video state trackers are only supported by a
> few drivers each), but is that offset by de-duplicating shared code
> (gallium/aux, libnir, etc)?
>

Shared auxiliary code can be de-duplicated either way, whether you're
using the static or dynamic pipe loader, I doubt it would offset the
cost in the long run...

> I did have a thought earlier today.  Since we have glvnd, and
> opencl-icd (and I guess vdpau and vaapi have a sort of loader), where
> the state tracker isn't directly exporting the API symbols but rather
> just a jump-table used by the icd .so, we could actually perhaps do a
> mega-mega driver where all the state trackers and all the drivers are
> linked into a single .so.. That would reduce the disk and memory
> footprint the most.  (Well, ok, it would cost maybe some virtual
> address space, but pages associated w/ unused state trackers wouldn't
> be demand-paged in, so meh?)
>
> BR,
> -R
>
>
>>> BR,
>>> -R
>>>[...]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 227 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20180227/82bc039c/attachment.sig>