[Mesa-dev] [RFC] opencl: mega-cl

Karol Herbst kherbst at redhat.com
Mon Feb 26 13:47:59 UTC 2018


On Mon, Feb 26, 2018 at 2:28 PM, Rob Clark <robdclark at gmail.com> wrote:
> On Mon, Feb 26, 2018 at 7:15 AM, Karol Herbst <kherbst at redhat.com> wrote:
>> On Mon, Feb 26, 2018 at 1:10 PM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>>> Hi guys,
>>>
>>> Having attempted a similar thing in the past, I think there are two
>>> things at play here.
>>> As such I'd recommend trying to keep them separate.
>>>
>>> 1) Having a single and/or modular - state-tracker <> pipe-driver setup
>>> 2) "Hilarities" when having NIR code multiple times per process
>>>
>>> On 26 February 2018 at 01:54, Rob Clark <robdclark at gmail.com> wrote:
>>>> On Sun, Feb 25, 2018 at 3:00 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>>>>> Seems like a serious hack to me to work around broken linking...  IMO we
>>>>> should just fix the linking issue.  The symbols for the various GLSL
>>>>> types need to be linked with the proper binding and visibility -- I
>>>>> assume that the cause of your problem is that people are making
>>>>> assumptions about the equality of GLSL types based on their memory
>>>>> addresses *and* marking the symbols as hidden *and* passing pointers to
>>>>> GLSL types across shared objects?  That sounds like a recipe for
>>>>> disaster.
>>>>
>>>> tbh, maybe hack, or I think more likely, maybe a good idea.. I'm not
>>>> terribly sold on the idea of dynamically loading pipe driver and
>>>> linking a lot of shared code into N different pipe_${driver}.so on
>>>> disk, since the # of drivers seems to be greater than # of state
>>>> trackers.. not to mention multiple copies of shared gallium code in
>>>> memory due to being statically linked into both state tracker and
>>>> driver.
>>>>
>>> This is 1) above.
>>> If using dynamic pipe-drivers across the tree, one can achieve very
>>> good disk utilization.
>>> All the bits for DRI, VDPAU and others are there, we just need a
>>> configure toggle.
>>>
>>> The call on which approach to use will be left to the distribution.
>>>
>>>> That said, glsl_type's defn does seem to expect to only exist once in
>>>> a process, ie. == is ptr comparison and when you link nir/glsl_types
>>>> into both state tracker and driver, glsl_type ptrs are getting passed
>>>> across that boundary.  I'm not really sure that is worth fixing (ie.
>>>> why should it exist twice in a process in the first place?)
>>>>
>>> This seems like 2) - a bug, IMHO, which we should fix regardless of the above.
>>>
>>> Why - it's possible to have an application use OpenCL and Vulkan (even
>>> VDPAU, GL, etc.) at the same time, thus effectively pulling in the NIR
>>> codebase multiple times in the same process.
>>>
>>> Off the top of my head, it sounds like we have a bunch of global
>>> variables, which are causing the problem.
>>> Or perhaps it's the screen sharing that bites us?
>>>
>>>> Maybe there are some linker tricks to solve this, idk.. that is a bit
>>>> outside my area of expertise.
>>>
>>> Last time I looked, the symbols were properly annotated.
>>>
>>> Rob, can you try dropping the freedreno symbol from
>>> src/gallium/targets/dri/dri.sym.
>>> I'm ~90% sure that it will fix your problem.
>>>
>>
>> the odd thing is, it works for Nouveau without issues.
>>
>
> Could you try this patch with nouveau:
>
> https://paste.fedoraproject.org/paste/KnFojZ7HTFAppUS6VnEKNQ/raw
>
> and see how many times "VOID TYPE" is printed?  Without mega-cl, I see
> it 3 times (looks like pipe_msm.so is loaded then unloaded and then
> loaded again?):
>

Actually, I see it 3 times with nouveau as well. I guess I was just lucky
that it doesn't cause any issues?
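
For anyone following along, here is a minimal sketch of the failure mode
being discussed. It is not Mesa code: fake_type, the two namespaces and the
helper names are all made up for illustration. Each namespace stands in for
one statically linked copy of the builtin type table (say, one in the state
tracker and one in the pipe driver). Comparing by address only works while
a single copy exists in the process.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for glsl_type; the real class has far more state.
struct fake_type {
    int base_type;
    const char *name;
};

// Each namespace models one statically linked copy of the type table.
namespace copy_in_state_tracker { fake_type void_type = {0, "void"}; }
namespace copy_in_pipe_driver   { fake_type void_type = {0, "void"}; }

// Identity comparison: only correct if exactly one copy exists per process.
bool same_by_pointer(const fake_type *a, const fake_type *b)
{
    return a == b;
}

// Structural comparison: still works when the table has been duplicated.
bool same_by_value(const fake_type *a, const fake_type *b)
{
    return a->base_type == b->base_type &&
           std::strcmp(a->name, b->name) == 0;
}

// Checks the behaviour described above; call it from main() to try it out.
void self_check()
{
    const fake_type *st  = &copy_in_state_tracker::void_type;
    const fake_type *drv = &copy_in_pipe_driver::void_type;

    assert(same_by_pointer(st, st));    // same copy: identity holds
    assert(!same_by_pointer(st, drv));  // two copies: identity breaks
    assert(same_by_value(st, drv));     // contents still match
}
```

A type pointer handed across the state tracker/driver boundary would take
the second path in self_check(): the address differs even though the type
is logically the same. Whether that actually bites presumably depends on
which comparisons a given driver ends up doing.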

> ------------
> (gdb) break glsl_types.cpp:69
> No source file named glsl_types.cpp.
> Make breakpoint pending on future shared library load? (y or [n]) y
> Breakpoint 1 (glsl_types.cpp:69) pending.
> (gdb) r
> Starting program: /home/robclark/src/opencl-example/math-int add 6 7 13
> Missing separate debuginfos, use: dnf debuginfo-install
> glibc-2.26.9000-51.fc28.aarch64
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
>
> Breakpoint 1, glsl_type::glsl_type (this=0xffffba03e8d0
> <glsl_type::_void_type>, gl_type=1280,
>     base_type=GLSL_TYPE_VOID, vector_elements=0, matrix_columns=0,
> name=0xffffb9f71f50 "void")
>     at ../src/compiler/glsl_types.cpp:69
> 69      printf("VOID TYPE!\n");
> Missing separate debuginfos, use: dnf debuginfo-install
> clang-libs-5.0.1-2.fc28.aarch64 elfutils-libelf-0.170-1.fc27.aarch64
> expat-2.2.5-1.fc28.aarch64 libedit-3.1-20.20170329cvs.fc27.aarch64
> libffi-3.1-14.fc27.aarch64 libgcc-7.2.1-6.fc28.aarch64
> libstdc++-7.2.1-6.fc28.aarch64 llvm-libs-5.0.1-1.fc28.aarch64
> ncurses-libs-6.0-15.20171125.fc28.aarch64
> ocl-icd-2.2.11-4.fc27.aarch64
> spirv-tools-libs-2018.1-0.2.20180205.git9e19fc0.fc28.aarch64
> zlib-1.2.11-4.fc27.aarch64
> (gdb) bt
> #0  glsl_type::glsl_type (this=0xffffba03e8d0 <glsl_type::_void_type>,
> gl_type=1280, base_type=GLSL_TYPE_VOID, vector_elements=0,
> matrix_columns=0, name=0xffffb9f71f50 "void") at
> ../src/compiler/glsl_types.cpp:69
> #1  0x0000ffffb9f22580 in __static_initialization_and_destruction_0
> (__initialize_p=1, __priority=65535) at
> ../src/compiler/builtin_type_macros.h:32
> #2  0x0000ffffb9f240b0 in _GLOBAL__sub_I_glsl_types.cpp(void) () at
> ../src/compiler/glsl_types.cpp:2364
> #3  0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
> #4  0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
> #5  0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
> #6  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #7  0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
> #8  0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
> #9  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #10 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
> #11 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
> #12 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
> #13 0x0000ffffbf324518 in util_dl_open (filename=0xffffffffd938
> "/home/robclark/src/mesa/debug/b/lib64/gallium-pipe/pipe_msm.so") at
> ../src/gallium/auxiliary/util/u_dl.c:48
> #14 0x0000ffffbf2ca09c in pipe_loader_find_module
> (driver_name=0x481fb0 "msm", library_paths=0xffffbf3e6f28
> "/home/robclark/src/mesa/debug/b/lib64/gallium-pipe") at
> ../src/gallium/auxiliary/pipe-loader/pipe_loader.c:162
> #15 0x0000ffffbf2ca824 in get_driver_descriptor (driver_name=0x481fb0
> "msm", plib=0x459ec8) at
> ../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:150
> #16 0x0000ffffbf2ca988 in pipe_loader_drm_probe_fd
> (dev=0xffffffffe9e8, fd=4) at
> ../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:193
> #17 0x0000ffffbf2caa8c in pipe_loader_drm_probe (devs=0x0, ndev=0) at
> ../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:231
> #18 0x0000ffffbf2c9d74 in pipe_loader_probe (devs=0x0, ndev=0) at
> ../src/gallium/auxiliary/pipe-loader/pipe_loader.c:65
> #19 0x0000ffffbf2bd3f8 in clover::platform::platform
> (this=0xffffbf4d88f8 <(anonymous namespace)::_clover_platform>) at
> ../src/gallium/state_trackers/clover/core/platform.cpp:28
> #20 0x0000ffffbf263af0 in __static_initialization_and_destruction_0
> (__initialize_p=1, __priority=65535) at
> ../src/gallium/state_trackers/clover/api/platform.cpp:30
> #21 0x0000ffffbf263b2c in _GLOBAL__sub_I_platform.cpp(void) () at
> ../src/gallium/state_trackers/clover/api/platform.cpp:143
> #22 0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
> #23 0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
> #24 0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
> #25 0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #26 0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
> #27 0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
> #28 0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #29 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
> #30 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
> #31 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
> #32 0x0000ffffbf67bb0c in _initClIcd_real () from /lib64/libOpenCL.so.1
> #33 0x0000ffffbf67dad4 in clGetPlatformIDs () from /lib64/libOpenCL.so.1
> #34 0x0000000000401684 in clSimpleInitGpuDevice
> (device_id=0xfffffffff3f8) at cl_simple.c:149
> #35 0x0000000000400ec8 in main (argc=5, argv=0xfffffffff568) at math-int.c:36
> (gdb) c
> Continuing.
> VOID TYPE!
> warning: Temporarily disabling breakpoints for unloaded shared library
> "/home/robclark/src/mesa/debug/b/lib64/gallium-pipe/pipe_msm.so"
>
> Breakpoint 1, glsl_type::glsl_type (this=0xffffba03e8d0
> <glsl_type::_void_type>, gl_type=1280, base_type=GLSL_TYPE_VOID,
> vector_elements=0, matrix_columns=0, name=0xffffb9f71f50 "void")
>     at ../src/compiler/glsl_types.cpp:69
> 69      printf("VOID TYPE!\n");
> (gdb) bt
> #0  glsl_type::glsl_type (this=0xffffba03e8d0 <glsl_type::_void_type>,
> gl_type=1280, base_type=GLSL_TYPE_VOID, vector_elements=0,
> matrix_columns=0, name=0xffffb9f71f50 "void") at
> ../src/compiler/glsl_types.cpp:69
> #1  0x0000ffffb9f22580 in __static_initialization_and_destruction_0
> (__initialize_p=1, __priority=65535) at
> ../src/compiler/builtin_type_macros.h:32
> #2  0x0000ffffb9f240b0 in _GLOBAL__sub_I_glsl_types.cpp(void) () at
> ../src/compiler/glsl_types.cpp:2364
> #3  0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
> #4  0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
> #5  0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
> #6  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #7  0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
> #8  0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
> #9  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #10 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
> #11 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
> #12 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
> #13 0x0000ffffbf324518 in util_dl_open (filename=0xffffffffd938
> "/home/robclark/src/mesa/debug/b/lib64/gallium-pipe/pipe_msm.so") at
> ../src/gallium/auxiliary/util/u_dl.c:48
> #14 0x0000ffffbf2ca09c in pipe_loader_find_module
> (driver_name=0x4799e0 "msm", library_paths=0xffffbf3e6f28
> "/home/robclark/src/mesa/debug/b/lib64/gallium-pipe") at
> ../src/gallium/auxiliary/pipe-loader/pipe_loader.c:162
> #15 0x0000ffffbf2ca824 in get_driver_descriptor (driver_name=0x4799e0
> "msm", plib=0x4795c8) at
> ../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:150
> #16 0x0000ffffbf2ca988 in pipe_loader_drm_probe_fd
> (dev=0xffffffffe9e8, fd=4) at
> ../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:193
> #17 0x0000ffffbf2caa8c in pipe_loader_drm_probe (devs=0x481f00,
> ndev=2) at ../src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c:231
> #18 0x0000ffffbf2c9d74 in pipe_loader_probe (devs=0x481f00, ndev=2) at
> ../src/gallium/auxiliary/pipe-loader/pipe_loader.c:65
> #19 0x0000ffffbf2bd434 in clover::platform::platform
> (this=0xffffbf4d88f8 <(anonymous namespace)::_clover_platform>) at
> ../src/gallium/state_trackers/clover/core/platform.cpp:31
> #20 0x0000ffffbf263af0 in __static_initialization_and_destruction_0
> (__initialize_p=1, __priority=65535) at
> ../src/gallium/state_trackers/clover/api/platform.cpp:30
> #21 0x0000ffffbf263b2c in _GLOBAL__sub_I_platform.cpp(void) () at
> ../src/gallium/state_trackers/clover/api/platform.cpp:143
> #22 0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
> #23 0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
> #24 0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
> #25 0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #26 0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
> #27 0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
> #28 0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #29 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
> #30 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
> #31 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
> #32 0x0000ffffbf67bb0c in _initClIcd_real () from /lib64/libOpenCL.so.1
> #33 0x0000ffffbf67dad4 in clGetPlatformIDs () from /lib64/libOpenCL.so.1
> #34 0x0000000000401684 in clSimpleInitGpuDevice
> (device_id=0xfffffffff3f8) at cl_simple.c:149
> #35 0x0000000000400ec8 in main (argc=5, argv=0xfffffffff568) at math-int.c:36
> (gdb) c
> Continuing.
> VOID TYPE!
> [New Thread 0xffffb99d51e0 (LWP 8508)]
> [New Thread 0xffffb3fff1e0 (LWP 8509)]
> [New Thread 0xffffb91d41e0 (LWP 8510)]
> [New Thread 0xffffb89d31e0 (LWP 8511)]
> [Thread 0xffffb91d41e0 (LWP 8510) exited]
> [Thread 0xffffb89d31e0 (LWP 8511) exited]
> [Thread 0xffffb3fff1e0 (LWP 8509) exited]
> [Thread 0xffffb99d51e0 (LWP 8508) exited]
>
> Thread 1 "math-int" hit Breakpoint 1, glsl_type::glsl_type
> (this=0xffffbf4dacf0 <glsl_type::_void_type>, gl_type=1280,
> base_type=GLSL_TYPE_VOID, vector_elements=0, matrix_columns=0,
> name=0xffffbf405ae0 "void")
>     at ../src/compiler/glsl_types.cpp:69
> 69      printf("VOID TYPE!\n");
> (gdb) bt
> #0  glsl_type::glsl_type (this=0xffffbf4dacf0 <glsl_type::_void_type>,
> gl_type=1280, base_type=GLSL_TYPE_VOID, vector_elements=0,
> matrix_columns=0, name=0xffffbf405ae0 "void") at
> ../src/compiler/glsl_types.cpp:69
> #1  0x0000ffffbf3ddee0 in __static_initialization_and_destruction_0
> (__initialize_p=1, __priority=65535) at
> ../src/compiler/builtin_type_macros.h:32
> #2  0x0000ffffbf3dfa10 in _GLOBAL__sub_I_glsl_types.cpp(void) () at
> ../src/compiler/glsl_types.cpp:2364
> #3  0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
> #4  0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
> #5  0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
> #6  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #7  0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
> #8  0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
> #9  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #10 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
> #11 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
> #12 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
> #13 0x0000ffffbf67bb0c in _initClIcd_real () from /lib64/libOpenCL.so.1
> #14 0x0000ffffbf67dad4 in clGetPlatformIDs () from /lib64/libOpenCL.so.1
> #15 0x0000000000401684 in clSimpleInitGpuDevice
> (device_id=0xfffffffff3f8) at cl_simple.c:149
> #16 0x0000000000400ec8 in main (argc=5, argv=0xfffffffff568) at math-int.c:36
> (gdb) c
> Continuing.
> VOID TYPE!
> There are 1 platforms.
> There are 1 GPU devices.
> ------------
>
> With mega-cl I see it just once:
>
> ------------
> Starting program: /home/robclark/src/opencl-example/math-int add 6 7 13
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [New Thread 0xffffb9c931e0 (LWP 9570)]
> [New Thread 0xffffb3fff1e0 (LWP 9571)]
> [New Thread 0xffffb94921e0 (LWP 9572)]
> [New Thread 0xffffb8c911e0 (LWP 9573)]
> [Thread 0xffffb94921e0 (LWP 9572) exited]
> [Thread 0xffffb8c911e0 (LWP 9573) exited]
> [Thread 0xffffb9c931e0 (LWP 9570) exited]
> [Thread 0xffffb3fff1e0 (LWP 9571) exited]
>
> Thread 1 "math-int" hit Breakpoint 1, glsl_type::glsl_type
> (this=0xffffbf495240 <glsl_type::_void_type>, gl_type=1280,
> base_type=GLSL_TYPE_VOID, vector_elements=0, matrix_columns=0,
> name=0xffffbf32f0e0 "void")
>     at ../src/compiler/glsl_types.cpp:69
> 69          printf("VOID TYPE!\n");
> (gdb) bt
> #0  glsl_type::glsl_type (this=0xffffbf495240 <glsl_type::_void_type>,
> gl_type=1280, base_type=GLSL_TYPE_VOID, vector_elements=0,
> matrix_columns=0, name=0xffffbf32f0e0 "void") at
> ../src/compiler/glsl_types.cpp:69
> #1  0x0000ffffbf19fe94 in __static_initialization_and_destruction_0
> (__initialize_p=1, __priority=65535) at
> ../src/compiler/builtin_type_macros.h:32
> #2  0x0000ffffbf1a19c4 in _GLOBAL__sub_I_glsl_types.cpp(void) () at
> ../src/compiler/glsl_types.cpp:2364
> #3  0x0000ffffbf6dccf4 in call_init.part () from /lib/ld-linux-aarch64.so.1
> #4  0x0000ffffbf6dcde4 in _dl_init () from /lib/ld-linux-aarch64.so.1
> #5  0x0000ffffbf6e0fd4 in dl_open_worker () from /lib/ld-linux-aarch64.so.1
> #6  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #7  0x0000ffffbf6e06f0 in _dl_open () from /lib/ld-linux-aarch64.so.1
> #8  0x0000ffffbf4e0044 in dlopen_doit () from /lib64/libdl.so.2
> #9  0x0000ffffbf60b200 in _dl_catch_exception () from /lib64/libc.so.6
> #10 0x0000ffffbf60b29c in _dl_catch_error () from /lib64/libc.so.6
> #11 0x0000ffffbf4e0734 in _dlerror_run () from /lib64/libdl.so.2
> #12 0x0000ffffbf4e00dc in dlopen@@GLIBC_2.17 () from /lib64/libdl.so.2
> #13 0x0000ffffbf67bb0c in _initClIcd_real () from /lib64/libOpenCL.so.1
> #14 0x0000ffffbf67dad4 in clGetPlatformIDs () from /lib64/libOpenCL.so.1
> #15 0x0000000000401684 in clSimpleInitGpuDevice
> (device_id=0xfffffffff3f8) at cl_simple.c:149
> #16 0x0000000000400ec8 in main (argc=5, argv=0xfffffffff568) at math-int.c:36
> (gdb) c
> Continuing.
> VOID TYPE!
> There are 1 platforms.
> There are 1 GPU devices.
> ------------

