<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
text-align:justify;
text-justify:inter-ideograph;
font-size:10.5pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:Arial;
color:windowtext;}
/* Page Definitions */
@page Section1
{size:595.3pt 841.9pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;
layout-grid:15.6pt;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang=ZH-CN link=blue vlink=purple style='text-justify-trim:punctuation'>
<div class=Section1 style='layout-grid:15.6pt'>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'>Hi All<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'>I find movnti + mfence is better than clflush as below
report shows (on core2 platform)<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>Size(byte)
movnti(us) clflush (us) speedup<o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>4k
3.01
3.56 1.182 <o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>16k
12.01
14.23 1.184<o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>32k
23.93
28.45 1.188 <o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>64k
47.92
56.89 1.187<o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>The code for two cases
(only care about alignment):<o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'> Movnti +
mfence
clflush<o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>For (i = 0; i < size; i
= i+ 64)
{
For (i = 0; i < size; i = i + 64)<o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>
__asm__(“movq (addr + i),
%rax);
clflush(addr + i);<o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'> __asm__(“movntiq
%rax, (addr + i);<o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>}<o:p></o:p></span></font></p>
<p class=MsoNormal style='layout-grid-mode:char'><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>_<i><span
style='font-style:italic'>-asm</span></i>__ (“mfence”)<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'>Movnti will invalidate cache line before writing data
into write combine buffer, at last we may use mfence to<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'>drain out the left data in write combine buffer, and
behavior looks like clflush.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'>The approach is only fit for small page, when size is
bigger than about 128k(on my platform),<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'>movnti + mfence approach get worse because read
instruction.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'>If </span></font><font size=1 face=Arial><span
lang=EN-US style='font-size:9.0pt;font-family:Arial'>theory</span></font><font
size=1 face=Arial><span lang=EN-US style='font-size:9.0pt;font-family:Arial'> is
right, we can get benefit from many flush operation in gem.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'>Thanks<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'>Ma Ling<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=1 face=Arial><span lang=EN-US style='font-size:
9.0pt;font-family:Arial'><o:p> </o:p></span></font></p>
</div>
</body>
</html>