Comment #5 on bug 106246 - radv: VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT and bringing down initial pipeline compile times
From: Roland Scheidegger <sroland@vmware.com>
https://bugs.freedesktop.org/show_bug.cgi?id=106246#c5
FWIW, with llvmpipe (gallivm) we found that LICM can be very expensive (in
particular the lcssa pass that comes with it). I think that was mostly related
to the main shader loop, though, which you don't have with radeonsi.
In some experiments, running early-cse near the beginning (right after sroa)
seemed to help somewhat: it tends to make the IR simpler for the later passes
at a small cost (although sroa itself can blow the IR up quite a bit). Running
sroa and early-cse first is also close to what an off-line llvm opt -O2 would
do. Then again, radeonsi already runs the MemorySSA version of early-cse before
instcombine, so maybe that's sufficient...
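To illustrate the ordering I mean, here's a sketch using the LLVM-C legacy pass
manager (the function name and exact pass list are made up for illustration,
this is not the actual gallivm/radeonsi code, and the C bindings / header
locations differ a bit between LLVM versions):

#include <llvm-c/Core.h>
#include <llvm-c/Transforms/Scalar.h>

/* Sketch: cheap cleanup (sroa + early-cse) up front, heavier passes later. */
static void run_shader_passes(LLVMModuleRef mod)
{
   LLVMPassManagerRef pm = LLVMCreateFunctionPassManagerForModule(mod);

   LLVMAddScalarReplAggregatesPass(pm); /* sroa: can grow the IR a fair bit */
   LLVMAddEarlyCSEPass(pm);             /* early-cse: cheap, simplifies IR for later passes */
   LLVMAddInstructionCombiningPass(pm);
   LLVMAddCFGSimplificationPass(pm);
   /* LLVMAddLICMPass(pm); -- the expensive one (drags in lcssa), which mostly
      hurt with big shader loops like llvmpipe's main loop */

   LLVMInitializeFunctionPassManager(pm);
   for (LLVMValueRef fn = LLVMGetFirstFunction(mod); fn;
        fn = LLVMGetNextFunction(fn)) {
      if (!LLVMIsDeclaration(fn))
         LLVMRunFunctionPassManager(pm, fn);
   }
   LLVMFinalizeFunctionPassManager(pm);
   LLVMDisposePassManager(pm);
}

The point is just the relative placement, not the specific pass set.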
The -time-passes and -debug-pass=Structure options tell you a lot about which
passes actually get run and how much time they need; they also work for codegen
(llc). Of course that requires dumping the bitcode out of the driver somewhere
(though if it's just millions of small shaders I wouldn't really expect much
from it in any case).
If there are guidelines for which passes make sense to run in which order, I'd
definitely be quite interested in that...
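For reference, the kind of invocation I mean on a dumped module (file names and
the target triple here are made up, use whatever the driver actually emits):

   opt -O2 -time-passes -debug-pass=Structure shader.bc -o /dev/null 2> opt-times.txt
   llc -O2 -mtriple=amdgcn-- -time-passes -debug-pass=Structure shader.bc -o /dev/null 2> llc-times.txt

The timing and pass-structure output goes to stderr, hence the redirect.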