[Mesa-dev] radv vs vulkan loader
Gustaw Smolarczyk
wielkiegie at gmail.com
Fri Dec 16 17:12:28 UTC 2016
2016-12-16 17:57 GMT+01:00 Emil Velikov <emil.l.velikov at gmail.com>:
> On 16 December 2016 at 15:27, Gustaw Smolarczyk <wielkiegie at gmail.com> wrote:
>> 2016-12-16 14:50 GMT+01:00 Emil Velikov <emil.l.velikov at gmail.com>:
>>>
>>> On 5 October 2016 at 23:12, Gustaw Smolarczyk <wielkiegie at gmail.com>
>>> wrote:
>>> > 2016-10-06 0:05 GMT+02:00 Emil Velikov <emil.l.velikov at gmail.com>:
>>> >> On 5 October 2016 at 21:45, Gustaw Smolarczyk <wielkiegie at gmail.com>
>>> >> wrote:
>>> >>> Hello,
>>> >>>
>>> >>> I have encountered the following problem while trying to use radv
>>> >>> through LunarG's vulkan loader.
>>> >>>
>>> >>> It seems that the loader dlopens() the ICD library twice. First, it
>>> >>> looks up "vk_icdNegotiateLoaderICDInterfaceVersion" symbol, which
>>> >>> seems to be the new mechanism used to determine the version of ICD
>>> >>> interface. Since radv doesn't provide it, it falls back to the older
>>> >>> scheme. Unfortunately, it seems that the loader first unloads the ICD
>>> >>> and then loads it again. That causes issues in LLVM libraries' command
>>> >>> line registration which happens while initializing global objects with
>>> >>> constructors. To be more specific, "asan-instrument-assembly"
>>> >>> registered in libLLVMX86AsmPrinter.so registers for the second time
>>> >>> and causes the following message:
>>> >>>
>>> >>>
>>> >>> $ vulkaninfo
>>> >>> ===========
>>> >>> VULKAN INFO
>>> >>> ===========
>>> >>>
>>> >>> Vulkan API Version: 1.0.26
>>> >>>
>>> >>>
>>> >>> : CommandLine Error: Option 'asan-instrument-assembly' registered more
>>> >>> than once!
>>> >>> LLVM ERROR: inconsistency in registered CommandLine options
>>> >>>
>>> >>> I have "fixed" the problem by manually removing
>>> >>> libLLVMX86AsmPrinter.so from the list of llvm dependencies to radv. It
>>> >>> seems that it was the only library registering any command line
>>> >>> option.
>>> >>>
>>> >>> I am not sure who is to blame for this situation. It's possible that
>>> >>> advertising the new ICD entry point would fix it. LLVM is really
>>> >>> fragile about its command line registration framework. Last, the
>>> >>> vulkan loader could try not to unnecessarily dlclose and dlopen the
>>> >>> ICD library.
>>> >>>
>>> >> From a quick read it sounds like something that should be fixed in
>>> >> LLVM. Namely: if a library sets up a state it should cleanup after
>>> >> itself.
>>> >>
>>> >> That aside, does the radv vulkan driver have unresolved/undefined
>>> >> symbols (check via $ ldd -r libvulkan_foo.so) with your workaround? If
>>> >> not we {c,sh}ould drop the library from the link chain. Alternatively
>>> >> you can try static linking LLVM by using --disable-llvm-shared-libs at
>>> >> mesa configure time.
>>> >
>>> > I see no relocation errors after doing ldd -r with my workaround.
>>> >
>>> > I think the problem lies with how llvm-config is called. We enable
>>> > AMDGPU target and want the AsmPrinter module for it, so we enable
>>> > asmprinter component. However, this enables asmprinter for all enabled
>>> > targets. Since X86 target is enabled by default, this brings
>>> > X86AsmPrinter into the list of libraries.
>>> >
>>> > llvmpipe/gallium need the X86 target, but radv could probably be built
>>> > without it.
>>> >
>>> Pardon for missing your reply here.
>>>
>>> In general I agree that one shouldn't link with libraries which they don't
>>> need.
>>>
>>> At the same time:
>>> - a library should tear down all the state that it sets up.
>>> Afaict the LLVM module sets up "asan-instrument-assembly", thus it
>>> is the one responsible for unregistering it.
>>
>>
>> Yes, I also think this should be really fixed in LLVM. There is however an
>> easy work-around for mesa that I have posted a few days ago [1].
>>
>>>
>>>
>>> - Split shared LLVM wasn't a supported/recommended/good idea, last
>>> time I heard.
>>
>>
>> This is how llvm is built on gentoo by default [2]. Because of that it
>> possibly affects all gentoo users.
>>
>>>
>>> Please use single LLVM shared lib or [separate] static LLVM libs.
>>
>>
>> I will check what happens when I dlopen/dlclose/dlopen both separate and
>> monolithic shared LLVM libraries.
>>
> Based on the git log, I cannot see any justification/information why
> anyone would want to enable SHARED_LIBS.
> To make this even more fun, with the ~3.7 series gentoo carries a patch
> (backport?) which adds the functionality in the first place.
>
> Seems like devs have missed the warning/notice message [1] that the
> option is for LLVM developers ?
>
> I see your point and concern, esp. how trivial the workaround on the mesa side is.
> At the same time this is not something one should be using/doing in
> the first place, because of the reason(s) you've noticed.
>
> Thanks
> Emil
>
> [1] http://llvm.org/docs/CMake.html
I am not entirely sure why, but the problem doesn't occur for me
anymore. However I suspect that it happens because I have reverted
from amdgpu.ko to radeon.ko kernel module (was buggy on SI hardware)
and as such am unable to properly test vulkan anymore. The
vulkan_radeon.so ICD is still being loaded but it doesn't list any
possible physical device (as expected).
It did however load successfully without the strange LLVM argument
registration issues, even though I can see that the ICD was dlopened 3
times... I am not sure whether it is fixed (possibly on the LLVM side) or
whether it only manifests if the ICD lists at least one physical device.
I might try to build the 4.9 kernel soon and then will revisit it once again.
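For reference, the load/probe/unload/reload pattern discussed in this
thread can be approximated outside the loader with a small test program.
This is only a sketch under stated assumptions, not the loader's actual
code: the helper names icd_has_symbol() and reload_cycles() are
hypothetical, and the library path is whatever ICD you want to probe.
Only the POSIX dlopen/dlsym/dlclose calls are real.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load a library, check whether it exports `symbol`,
 * then unload it again. Returns 1 if the symbol is found, 0 if it is not,
 * and -1 if the library could not be loaded at all. */
int icd_has_symbol(const char *path, const char *symbol)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return -1;

    int found = dlsym(handle, symbol) != NULL;

    /* Unloading here is what forces a later dlopen() to start over. */
    dlclose(handle);
    return found;
}

/* Hypothetical helper: dlopen()/dlclose() the same library `cycles` times
 * and return how many of those loads succeeded. If global constructors in
 * a dependency misbehave on reload, the process may abort before this
 * function returns. */
int reload_cycles(const char *path, int cycles)
{
    int loaded = 0;
    for (int i = 0; i < cycles; i++) {
        void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
        if (handle == NULL)
            break;
        loaded++;
        dlclose(handle);
    }
    return loaded;
}
```

Calling icd_has_symbol() with the ICD path and
"vk_icdNegotiateLoaderICDInterfaceVersion", followed by reload_cycles()
on the same path, should mimic the probe-then-reload sequence described
earlier in the thread. On older glibc versions this needs -ldl at link
time.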
Regards,
Gustaw