<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jan 22, 2015 at 10:02 AM, Michael Stahl <span dir="ltr">&lt;<a href="mailto:mstahl@redhat.com" target="_blank">mstahl@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span>On 22.01.2015 15:31, Ashod Nakashian wrote:<br><br> &gt; It seems that filter-showIncludes.awk is writing the dependency file .d<br> &gt; in a .tmp file first, then it's renaming it to .d.<br> &gt; Considering that the input is piped from stdin and .d either doesn't<br> &gt; exist or will be overwritten (and is never an input,) the wisdom of<br> &gt; writing to a temp file and then moving is lost on me.<br> <br> </span>the point is to have an atomic update of the make target, i.e. if the<br> build is interrupted for whatever reason, prevent an incompletely<br> written .d file with up-to-date time stamp that will very likely break<br> the next build.</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> <span><br> &gt; A complete build generates close to 9000 .d files in CxxObject alone.<br> &gt; Spawning processes and doing file I/O in the several thousands are bound<br> &gt; to have a toll.<br> <br> </span>sure but filter-showIncludes is only run when the C++ compiler is also<br> run on the file and that will take a lot longer than forking a "mv" process.<br> <br><br></blockquote><div><br></div><div>I see.</div><div><br></div><div>My main issue is I/O contention. mv is a filesystem metadata operation that is done synchronously (as I understand it, the FS must ultimately take a lock that is system-wide).</div><div>On a powerful rig, I/O is the biggest bottleneck to building LO. 
If you have enough CPU cores to compile dozens of files in parallel, but not enough I/O ops, the CPU will sit waiting on I/O a good deal of the time.</div><div>On my 6c/12t rig I could build LO from scratch with dbgutil with parallelism=16 in 68-70 minutes at best. This was on a 256GB Samsung 830. Replace the 830 with the new 850 and what do you get?</div><div>Would you have guessed 34-35 minutes? In terms of bandwidth, the 850 isn't much faster than the 830. But in terms of I/O ops it's ~3x faster, especially at high queue depth.</div><div><br></div><div>Spawning 2 processes and renaming a file is quick compared with compiling the source in question, but it hinders parallelism on high-core-count machines.</div><div><br></div><div>The temp file approach seems to be the simplest solution for the atomic update.</div><div>How about writing an EOF marker in .d files? We'll then update concat-deps (that's the consumer of .d, right?) to consider the file corrupt/invalid if it doesn't include the marker.</div><div><br></div><div>I know this is marginally more complex than mv .tmp .d, but so is the whole business of generating and processing the .d files in the first place, which serves to accelerate the build.</div><div>My numbers above suggest that there are significant gains to be had from reducing I/O load during the build (for highly parallel builds).</div><div><br></div><div>Would the EOF marker solve the issue you raise? Would you agree it's worth the effort (which I'm volunteering) to reduce I/O?</div></div></div></div>