<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580"
href="https://bugs.freedesktop.org/show_bug.cgi?id=108272#c12">Comment # 12</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580"
href="https://bugs.freedesktop.org/show_bug.cgi?id=108272">bug 108272</a>
from <span class="vcard"><a class="email" href="mailto:jan.vesely@rutgers.edu" title="Jan Vesely <jan.vesely@rutgers.edu>"> <span class="fn">Jan Vesely</span></a>
</span></b>
<pre>Hi,
sorry for the delay; somehow I missed the notifications.
(In reply to jamespharvey20 from <a href="show_bug.cgi?id=108272#c11">comment #11</a>)
<span class="quote">> When I originally filed this, I assumed it was 1 bug since I tried 2 things
> with OpenCL, and both failed with opencl-mesa but worked with opencl-amd.
>
> Jan Vesely was correct that there were two separate problems.
>
> I'm hoping Jan Vesely can give guidance on whether to leave this bug open
> for any of the reasons below, or if I should close it and potentially open
> up 1-2 new bugs.
>
> The original luxmark bug (segfault) is solved, but that exposes 2 new
> opencl-mesa bugs when running luxmark.
>
> The original IndigoBenchmark bug (segfault) isn't solved, but as explained
> below, I understand if we have to consider that unsolvable for now.
>
> I don't think this affects any of these bugs, but I'll mention a few weeks
> ago, I switched back to my Asus Radeon R9 390. The same behaviors discussed
> in this entire bug report occur. (i.e. 18.2.3 and before crash luxmark.)
> If someone really wants me to do so, I can switch back to the RX 580 to test
> 18.2.4, but I'm betting since it works properly with the R9 390 that the
> problem is fixed.
>
> ORIGINAL LUXMARK BUG #1
> -----------------------------------------
>
> Using mesa 18.2.4, the luxmark segfault is solved.</span >
As this was the first bug, I'd close this one and open new bugs for both Indigo
and the incorrect rendering in luxmark.
<span class="quote">>
> NEW - LUXMARK BUG #2
> ------------------------------------
>
> Jan Vesely's comment on 2018-10-09 mentions: "bumping MAX_GLOBAL_BUFFERS to
> 32 allows luxmark to run, albeit still with many incorrect pixels -- libclc
> rounding conversions are incorrect."
>
> That's what I'm seeing out of 18.2.4. Using LuxBall HDR (Simple Benchmark):
>
> MESA 18.2.4: 40626 (Image validation OK (65739 different pixels, 10.27%)
>
> AMDGPU-PRO: 15739 (Image validation OK (5736 different pixels, 0.90%)
>
> There's no typos there. opencl-mesa scores almost unbelievably higher than
> opencl-amd, but the different pixels percentage increases by a factor of
> 11.4.
>
> As Jan's other comment on 2018-10-09 mentions, the image looks garbled and
> the results are incorrect.
>
> Not sure if this bug should be left open for this issue, or if I should
> create a new bug. (Or, if there is a bug already open for it.) Or, if mesa
> will say it's purely libclc's problem, and to go to them about it.</span >
I'd say this is probably a purely libclc problem, but feel free to open the bug
against clover on freedesktop. 10% is rather good; I usually saw ~30% wrong
pixels on my machines.
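
The figures quoted above can be cross-checked with a few lines of Python. This
is only a sketch: the render resolution is not stated in the report, so the
total pixel count is back-computed from each quoted count and percentage
rather than assumed.

```python
# Figures copied from the quoted LuxBall HDR results.
mesa_score, mesa_bad_pixels, mesa_bad_pct = 40626, 65739, 10.27
pro_score, pro_bad_pixels, pro_bad_pct = 15739, 5736, 0.90


def ratio(a, b):
    """Return a / b rounded to two decimals."""
    return round(a / b, 2)


# How much worse Mesa is, by quoted percentage and by raw pixel count.
pct_ratio = ratio(mesa_bad_pct, pro_bad_pct)
pixel_ratio = ratio(mesa_bad_pixels, pro_bad_pixels)

# How much higher the Mesa benchmark score is.
score_ratio = ratio(mesa_score, pro_score)

# Total pixels implied by each (count, percentage) pair; the two values
# differ slightly because the percentages are rounded to two decimals.
implied_total_mesa = round(mesa_bad_pixels * 100 / mesa_bad_pct)
implied_total_pro = round(pro_bad_pixels * 100 / pro_bad_pct)

print(pct_ratio, pixel_ratio, score_ratio)
print(implied_total_mesa, implied_total_pro)
```

Both ways of computing it give a bad-pixel ratio of roughly 11.4 to 11.5,
while the score is only about 2.6 times higher.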
<span class="quote">>
> NEW - LUXMARK BUG #3
> ------------------------------------
>
> Although luxmark can now benchmark, when doing so, all input becomes
> unusably awful. It reminds me of when Windows has too many things open,
> suddenly decided it can't cope, and you're waiting to see if it's going to
> recover or crash. Keystrokes take too long to be printed, and the mouse
> becomes slow and jumpy. Top shows cpu and memory usage are fine, which was
> my first thought. BTW, running xf86-video-amdgpu 18.1.0, and when I
> upgraded mesa, it was both mesa and opencl-mesa.
>
> In comparison, if I use opencl-amd, input is not affected. I wouldn't even
> know the GPU is being slammed.
>
> Using the program radeontop, I can see when using mesa, "Graphics pipe",
> "Texture Addresser", and "Shader Interpolator" are between 95-100%, usually
> 98-100%.
>
> When using opencl-amd, radeontop shows the same. (Granted, Vertex Grouper +
> Tesselator / Shader Export/Scan Converter/Depth Block/Color Block bounce
> between 5-20% vs on opencl-mesa, they bounce between 1-5%.)</span >
This sounds like a GPU priority/scheduling problem. I haven't looked into whether
it can be solved by opening a lower-priority pipe for compute, or whether we need
to enable advanced features like CWSR. Please open a separate bug. Hogging a large
portion of the GPU might explain some of that high score.
<span class="quote">>
> INDIGO BUG
> ------------------
>
> I edited 18.2.4's si_get.c to be very short:
>
> snprintf(sscreen->renderer_string, sizeof(sscreen->renderer_string),
> "%s",
> chip_name);
>
> And compiled/installed it, but it didn't affect the crash.
>
> IndigoBenchmark said they're statically linking with LLVM 3.4, which is
> quite old. But, it runs fine with opencl-amd, and only crashes on
> opencl-mesa. I just posted a followup "where do we go from here"-ish
> comment there which has to be moderator approved so isn't showing yet.
> <a href="https://www.indigorenderer.com/forum/viewtopic.php?f=37&t=14986">https://www.indigorenderer.com/forum/viewtopic.php?f=37&t=14986</a>
>
> Part of me thinks it needs to be given up on, being a closed-source
> precompiled binary statically linked against LLVM 3.4.
>
> Part of me thinks since it only crashes with opencl-mesa, and runs perfectly
> fine with opencl-amd, there's probably (but not definitely) a bug in
> opencl-mesa.
>
> But, I understand since they don't seem to be paying this any attention, we
> may have to give up on the Indigo Bug as being unable to be realistically
> investigated further.</span >
Can you check whether Indigo exports any LLVM symbols? It might be that we end up
using those instead of the new ones from libLLVM.*
If that's the case, one solution would be to link mesa/clover against a static LLVM.
Enabling symbol versioning for LLVM should work as well.</pre>
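<pre>One way to run the exported-symbol check is to dump the binary's dynamic symbol
table with GNU nm (nm -D --defined-only) and filter for LLVM-looking names.
A minimal sketch follows. The helper names are hypothetical, and the name
patterns are assumptions about what to look for, not an exhaustive test: the
"LLVM" prefix used by the LLVM C API, and the "4llvm" component that the
Itanium C++ ABI produces for symbols in the llvm namespace.

```python
import subprocess


def is_llvm_symbol(name):
    """Heuristic (assumed patterns): LLVM C API prefix or mangled llvm namespace."""
    return name.startswith("LLVM") or "4llvm" in name


def exported_llvm_symbols(nm_output):
    """Extract LLVM-looking symbol names from `nm -D --defined-only` text."""
    found = []
    for line in nm_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The symbol name is the last column; drop any @VERSION suffix.
        name = fields[-1].split("@")[0]
        if is_llvm_symbol(name):
            found.append(name)
    return found


def scan_binary(path):
    """Run nm on `path` and return its exported LLVM-looking symbols.

    Requires GNU binutils to be installed; raises CalledProcessError
    if nm fails (for example on a binary with no dynamic symbol table).
    """
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return exported_llvm_symbols(result.stdout)
```

Calling scan_binary() on the benchmark executable (its path depends on the
install) and getting a non-empty list would support the symbol-clash theory;
an empty list would make it unlikely, though the substring heuristic can miss
or over-match names.</pre>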
</div>
</p>
</body>
</html>