[Mesa-dev] [RFC] opencl: mega-cl

Rob Clark robdclark at gmail.com
Mon Feb 26 13:28:49 UTC 2018


On Mon, Feb 26, 2018 at 7:15 AM, Karol Herbst <kherbst at redhat.com> wrote:
> On Mon, Feb 26, 2018 at 1:10 PM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>> Hi guys,
>>
>> Having attempted a similar thing in the past, I think there are two
>> things at play here.
>> As such I'd recommend trying to keep them separate.
>>
>> 1) Having a single and/or modular - state-tracker <> pipe-driver setup
>> 2) "Hilarities" when having NIR code multiple times per process
>>
>> On 26 February 2018 at 01:54, Rob Clark <robdclark at gmail.com> wrote:
>>> On Sun, Feb 25, 2018 at 3:00 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>>>> Seems like a serious hack to me to work around broken linking...  IMO we
>>>> should just fix the linking issue.  The symbols for the various GLSL
>>>> types need to be linked with the proper binding and visibility -- I
>>>> assume that the cause of your problem is that people are making
>>>> assumptions about the equality of GLSL types based on their memory
>>>> addresses *and* marking the symbols as hidden *and* passing pointers to
>>>> GLSL types across shared objects?  That sounds like a recipe for
>>>> disaster.
>>>
>>> tbh, maybe hack, or I think more likely, maybe a good idea.. I'm not
>>> terribly sold on the idea of dynamically loading pipe driver and
>>> linking a lot of shared code into N different pipe_${driver}.so on
>>> disk, since the # of drivers seems to be greater than # of state
>>> trackers.. not to mention multiple copies of shared gallium code in
>>> memory due to being statically linked into both state tracker and
>>> driver.
>>>
>> This is 1) above.
>> If dynamic pipe-drivers are used across the tree, one can achieve very
>> good disk utilization.
>> All the bits for DRI, VDPAU and others are there, we just need a
>> configure toggle.
>>
>> The call on which approach to use will be left to the distribution.
>>
>>> That said, glsl_type's defn does seem to expect to only exist once in
>>> a process, ie. == is ptr comparison and when you link nir/glsl_types
>>> into both state tracker and driver, glsl_type ptrs are getting passed
>>> across that boundary.  I'm not really sure that is worth fixing (ie.
>>> why should it exist twice in a process in the first place?)
>>>
>> This seems like 2) - a bug, IMHO, which we should fix regardless of the above.
>>
>> Why: an application can use OpenCL and Vulkan (even VDPAU, GL, etc.)
>> at the same time, effectively pulling in the NIR codebase multiple
>> times in the same process.
>>
>> Off the top of my head, it sounds like we have a bunch of global
>> variables, which are causing the problem.
>> Or perhaps it's the screen sharing that bites us?
>>
>>> Maybe there are some linker tricks to solve this, idk.. that is a bit
>>> outside my area of expertise.
>>
>> Last time I looked, symbols were properly annotated.
>>
>> Rob, can you try dropping the freedreno symbol from
>> src/gallium/targets/dri/dri.sym.
>> I'm about 90% sure that it will fix your problem.
>>
>
> the odd thing is, it works for Nouveau without issues.
>

Could you try this patch with nouveau:

https://paste.fedoraproject.org/paste/KnFojZ7HTFAppUS6VnEKNQ/raw

and see how many times "VOID TYPE" is printed?  Without mega-cl, I see
it 3 times (looks like pipe_msm.so is loaded then unloaded and then
loaded again?):

------------
(gdb) break glsl_types.cpp:69
No source file named glsl_types.cpp.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (glsl_types.cpp:69) pending.
(gdb) r
Starting program: /home/robclark/src/opencl-example/math-int add 6 7 13
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.26.9000-51.fc28.aarch64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, glsl_type::glsl_type (this=0xffffba03e8d0
<glsl_type::_void_type>, gl_type=1280,
    base_type=GLSL_TYPE_VOID, vector_elements=0, matrix_columns=0,
name=0xffffb9f71f50 "void")
    at ../src/compiler/glsl_types.cpp:69
69      printf("VOID TYPE!\n");
Missing separate debuginfos, use: dnf debuginfo-install
clang-libs-5.0.1-2.fc28.aarch64 elfutils-libelf-0.170-1.fc27.aarch64
expat-2.2.5-1.fc28.aarch64 libedit-3.1-20.20170329cvs.fc27.aarch64
libffi-3.1-14.fc27.aarch64 libgcc-7.2.1-6.fc28.aarch64
libstdc++-7.2.1-6.fc28.aarch64 llvm-libs-5.0.1-1.fc28.aarch64
ncurses-libs-6.0-15.20171125.fc28.aarch64
ocl-icd-2.2.11-4.fc27.aarch64
spirv-tools-libs-2018.1-0.2.20180205.git9e19fc0.fc28.aarch64
zlib-1.2.11-4.fc27.aarch64
(gdb) bt
#0  glsl_type::glsl_type (this=0xffffba03e8d0 <glsl_type::_void_type>,
gl_type=1280, base_type=GLSL_TYPE_VOID, vector_elements=0,
matrix_columns=0, name=0xffffb9f71f50 "void") at
../src/compiler/glsl_types.cpp:69
#1  0x0000ffffb9f22580 in __static_initialization_and_destruction_0
(__initialize_p=1, __priority=65535) at
../src/compiler/builtin_type_macros.h:32
#2  0x0000ffffb9f240b0 in _GLOBAL__sub_I_glsl_types.cpp(void) () at
../src/compiler/glsl_types.cpp:2364
#3  0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
#4  0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
#5  0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
#6  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#7  0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
#8  0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
#9  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
#11 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
#12 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
#13 0x0000ffffbf324518 in util_dl_open (filename=0xffffffffd938
"/home/robclark/src/mesa/debug/b/lib64/gallium-pipe/pipe_msm.so") at
../src/gallium/auxiliary/util/u_dl.c:48
#14 0x0000ffffbf2ca09c in pipe_loader_find_module
(driver_name=0x481fb0 "msm", library_paths=0xffffbf3e6f28
"/home/robclark/src/mesa/debug/b/lib64/gallium-pipe") at
../src/gallium/auxiliary/pipe-loader/pipe_loader.c:162
#15 0x0000ffffbf2ca824 in get_driver_descriptor (driver_name=0x481fb0
"msm", plib=0x459ec8) at
../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:150
#16 0x0000ffffbf2ca988 in pipe_loader_drm_probe_fd
(dev=0xffffffffe9e8, fd=4) at
../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:193
#17 0x0000ffffbf2caa8c in pipe_loader_drm_probe (devs=0x0, ndev=0) at
../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:231
#18 0x0000ffffbf2c9d74 in pipe_loader_probe (devs=0x0, ndev=0) at
../src/gallium/auxiliary/pipe-loader/pipe_loader.c:65
#19 0x0000ffffbf2bd3f8 in clover::platform::platform
(this=0xffffbf4d88f8 <(anonymous namespace)::_clover_platform>) at
../src/gallium/state_trackers/clover/core/platform.cpp:28
#20 0x0000ffffbf263af0 in __static_initialization_and_destruction_0
(__initialize_p=1, __priority=65535) at
../src/gallium/state_trackers/clover/api/platform.cpp:30
#21 0x0000ffffbf263b2c in _GLOBAL__sub_I_platform.cpp(void) () at
../src/gallium/state_trackers/clover/api/platform.cpp:143
#22 0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
#23 0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
#24 0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
#25 0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#26 0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
#27 0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
#28 0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#29 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
#30 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
#31 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
#32 0x0000ffffbf67bb0c in _initClIcd_real () from /lib64/libOpenCL.so.1
#33 0x0000ffffbf67dad4 in clGetPlatformIDs () from /lib64/libOpenCL.so.1
#34 0x0000000000401684 in clSimpleInitGpuDevice
(device_id=0xfffffffff3f8) at cl_simple.c:149
#35 0x0000000000400ec8 in main (argc=5, argv=0xfffffffff568) at math-int.c:36
(gdb) c
Continuing.
VOID TYPE!
warning: Temporarily disabling breakpoints for unloaded shared library
"/home/robclark/src/mesa/debug/b/lib64/gallium-pipe/pipe_msm.so"

Breakpoint 1, glsl_type::glsl_type (this=0xffffba03e8d0
<glsl_type::_void_type>, gl_type=1280, base_type=GLSL_TYPE_VOID,
vector_elements=0, matrix_columns=0, name=0xffffb9f71f50 "void")
    at ../src/compiler/glsl_types.cpp:69
69      printf("VOID TYPE!\n");
(gdb) bt
#0  glsl_type::glsl_type (this=0xffffba03e8d0 <glsl_type::_void_type>,
gl_type=1280, base_type=GLSL_TYPE_VOID, vector_elements=0,
matrix_columns=0, name=0xffffb9f71f50 "void") at
../src/compiler/glsl_types.cpp:69
#1  0x0000ffffb9f22580 in __static_initialization_and_destruction_0
(__initialize_p=1, __priority=65535) at
../src/compiler/builtin_type_macros.h:32
#2  0x0000ffffb9f240b0 in _GLOBAL__sub_I_glsl_types.cpp(void) () at
../src/compiler/glsl_types.cpp:2364
#3  0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
#4  0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
#5  0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
#6  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#7  0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
#8  0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
#9  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
#11 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
#12 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
#13 0x0000ffffbf324518 in util_dl_open (filename=0xffffffffd938
"/home/robclark/src/mesa/debug/b/lib64/gallium-pipe/pipe_msm.so") at
../src/gallium/auxiliary/util/u_dl.c:48
#14 0x0000ffffbf2ca09c in pipe_loader_find_module
(driver_name=0x4799e0 "msm", library_paths=0xffffbf3e6f28
"/home/robclark/src/mesa/debug/b/lib64/gallium-pipe") at
../src/gallium/auxiliary/pipe-loader/pipe_loader.c:162
#15 0x0000ffffbf2ca824 in get_driver_descriptor (driver_name=0x4799e0
"msm", plib=0x4795c8) at
../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:150
#16 0x0000ffffbf2ca988 in pipe_loader_drm_probe_fd
(dev=0xffffffffe9e8, fd=4) at
../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:193
#17 0x0000ffffbf2caa8c in pipe_loader_drm_probe (devs=0x481f00,
ndev=2) at ../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:231
#18 0x0000ffffbf2c9d74 in pipe_loader_probe (devs=0x481f00, ndev=2) at
../src/gallium/auxiliary/pipe-loader/pipe_loader.c:65
#19 0x0000ffffbf2bd434 in clover::platform::platform
(this=0xffffbf4d88f8 <(anonymous namespace)::_clover_platform>) at
../src/gallium/state_trackers/clover/core/platform.cpp:31
#20 0x0000ffffbf263af0 in __static_initialization_and_destruction_0
(__initialize_p=1, __priority=65535) at
../src/gallium/state_trackers/clover/api/platform.cpp:30
#21 0x0000ffffbf263b2c in _GLOBAL__sub_I_platform.cpp(void) () at
../src/gallium/state_trackers/clover/api/platform.cpp:143
#22 0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
#23 0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
#24 0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
#25 0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#26 0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
#27 0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
#28 0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#29 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
#30 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
#31 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
#32 0x0000ffffbf67bb0c in _initClIcd_real () from /lib64/libOpenCL.so.1
#33 0x0000ffffbf67dad4 in clGetPlatformIDs () from /lib64/libOpenCL.so.1
#34 0x0000000000401684 in clSimpleInitGpuDevice
(device_id=0xfffffffff3f8) at cl_simple.c:149
#35 0x0000000000400ec8 in main (argc=5, argv=0xfffffffff568) at math-int.c:36
(gdb) c
Continuing.
VOID TYPE!
[New Thread 0xffffb99d51e0 (LWP 8508)]
[New Thread 0xffffb3fff1e0 (LWP 8509)]
[New Thread 0xffffb91d41e0 (LWP 8510)]
[New Thread 0xffffb89d31e0 (LWP 8511)]
[Thread 0xffffb91d41e0 (LWP 8510) exited]
[Thread 0xffffb89d31e0 (LWP 8511) exited]
[Thread 0xffffb3fff1e0 (LWP 8509) exited]
[Thread 0xffffb99d51e0 (LWP 8508) exited]

Thread 1 "math-int" hit Breakpoint 1, glsl_type::glsl_type
(this=0xffffbf4dacf0 <glsl_type::_void_type>, gl_type=1280,
base_type=GLSL_TYPE_VOID, vector_elements=0, matrix_columns=0,
name=0xffffbf405ae0 "void")
    at ../src/compiler/glsl_types.cpp:69
69      printf("VOID TYPE!\n");
(gdb) bt
#0  glsl_type::glsl_type (this=0xffffbf4dacf0 <glsl_type::_void_type>,
gl_type=1280, base_type=GLSL_TYPE_VOID, vector_elements=0,
matrix_columns=0, name=0xffffbf405ae0 "void") at
../src/compiler/glsl_types.cpp:69
#1  0x0000ffffbf3ddee0 in __static_initialization_and_destruction_0
(__initialize_p=1, __priority=65535) at
../src/compiler/builtin_type_macros.h:32
#2  0x0000ffffbf3dfa10 in _GLOBAL__sub_I_glsl_types.cpp(void) () at
../src/compiler/glsl_types.cpp:2364
#3  0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
#4  0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
#5  0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
#6  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#7  0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
#8  0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
#9  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
#11 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
#12 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
#13 0x0000ffffbf67bb0c in _initClIcd_real () from /lib64/libOpenCL.so.1
#14 0x0000ffffbf67dad4 in clGetPlatformIDs () from /lib64/libOpenCL.so.1
#15 0x0000000000401684 in clSimpleInitGpuDevice
(device_id=0xfffffffff3f8) at cl_simple.c:149
#16 0x0000000000400ec8 in main (argc=5, argv=0xfffffffff568) at math-int.c:36
(gdb) c
Continuing.
VOID TYPE!
There are 1 platforms.
There are 1 GPU devices.
------------

With mega-cl I see it just once:

------------
Starting program: /home/robclark/src/opencl-example/math-int add 6 7 13
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0xffffb9c931e0 (LWP 9570)]
[New Thread 0xffffb3fff1e0 (LWP 9571)]
[New Thread 0xffffb94921e0 (LWP 9572)]
[New Thread 0xffffb8c911e0 (LWP 9573)]
[Thread 0xffffb94921e0 (LWP 9572) exited]
[Thread 0xffffb8c911e0 (LWP 9573) exited]
[Thread 0xffffb9c931e0 (LWP 9570) exited]
[Thread 0xffffb3fff1e0 (LWP 9571) exited]

Thread 1 "math-int" hit Breakpoint 1, glsl_type::glsl_type
(this=0xffffbf495240 <glsl_type::_void_type>, gl_type=1280,
base_type=GLSL_TYPE_VOID, vector_elements=0, matrix_columns=0,
name=0xffffbf32f0e0 "void")
    at ../src/compiler/glsl_types.cpp:69
69          printf("VOID TYPE!\n");
(gdb) bt
#0  glsl_type::glsl_type (this=0xffffbf495240 <glsl_type::_void_type>,
gl_type=1280, base_type=GLSL_TYPE_VOID, vector_elements=0,
matrix_columns=0, name=0xffffbf32f0e0 "void") at
../src/compiler/glsl_types.cpp:69
#1  0x0000ffffbf19fe94 in __static_initialization_and_destruction_0
(__initialize_p=1, __priority=65535) at
../src/compiler/builtin_type_macros.h:32
#2  0x0000ffffbf1a19c4 in _GLOBAL__sub_I_glsl_types.cpp(void) () at
../src/compiler/glsl_types.cpp:2364
#3  0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
#4  0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
#5  0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
#6  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#7  0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
#8  0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
#9  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
#11 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
#12 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
#13 0x0000ffffbf67bb0c in _initClIcd_real () from /lib64/libOpenCL.so.1
#14 0x0000ffffbf67dad4 in clGetPlatformIDs () from /lib64/libOpenCL.so.1
#15 0x0000000000401684 in clSimpleInitGpuDevice
(device_id=0xfffffffff3f8) at cl_simple.c:149
#16 0x0000000000400ec8 in main (argc=5, argv=0xfffffffff568) at math-int.c:36
(gdb) c
Continuing.
VOID TYPE!
There are 1 platforms.
There are 1 GPU devices.
------------
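
For illustration, here is a minimal sketch of why the number of
constructor runs matters. The logs above show the void type at different
addresses (0xffffba03e8d0 in pipe_msm.so vs 0xffffbf4dacf0 in the CL
library), and as noted earlier in the thread, == on glsl_type is a
pointer comparison. Everything in the sketch is hypothetical: toy_type is
a stand-in rather than Mesa's real glsl_type, and the two namespaces
merely simulate the table being statically linked into two shared
objects.

```cpp
#include <cstring>

// Hypothetical stand-in for glsl_type; NOT Mesa's real class.
struct toy_type {
    int base_type;
    unsigned vector_elements;
    unsigned matrix_columns;
    const char *name;
};

// Each namespace models one statically linked copy of the type table,
// e.g. one in the state tracker and one in the pipe driver.
namespace state_tracker_copy {
const toy_type void_type = {0, 0, 0, "void"};
}
namespace pipe_driver_copy {
const toy_type void_type = {0, 0, 0, "void"};
}

// Identity comparison: only meaningful when exactly one copy of the
// table exists in the process.
inline bool same_by_pointer(const toy_type *a, const toy_type *b)
{
    return a == b;
}

// Field-wise comparison: tolerates duplicated tables, at the cost of
// comparing every member (including a string compare on the name).
inline bool same_by_fields(const toy_type *a, const toy_type *b)
{
    return a->base_type == b->base_type &&
           a->vector_elements == b->vector_elements &&
           a->matrix_columns == b->matrix_columns &&
           std::strcmp(a->name, b->name) == 0;
}
```

Passing &state_tracker_copy::void_type and &pipe_driver_copy::void_type
to same_by_pointer() reports a mismatch even though both describe
"void", while same_by_fields() treats them as equal. With a single copy
of the table, as in the mega-cl run, the pointer comparison is
sufficient.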

