<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Nov 13, 2017 at 3:32 PM, Matt Turner <span dir="ltr"><<a href="mailto:mattst88@gmail.com" target="_blank">mattst88@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Wed, Oct 25, 2017 at 4:25 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br> > Some hardware (CHV, BXT) have special restrictions on register regions<br> > when doing integer multiplication. We want to respect those when we<br> > lower to DxW multiplication.<br> <br> </span>This is not a good commit message. I am very familiar with the CHV,<br> BXT restrictions you mention and I have no idea what this patch<br> accomplishes.<br> <br> The commit message should say what the problem is and give an example,<br> explain how you are fixing it, and how you can reproduce the problem.<br> I'm going to need at least some of that information before I can look<br> into the SNB regression.<br> </blockquote></div></div><div class="gmail_extra"><br></div><div class="gmail_extra">Yes, I could definitely have done better on this one. What I was trying to say, though too briefly, is that we have an issue with the little-core (CHV, BXT) restriction that source and destination strides must match for integer MUL. If you had a strided MUL (not common, but it can happen), the lowering would reset the strides back to packed for some of the intermediate calculations, and that would break the stride-matching requirement. The objective of this patch is to use the destination region (which won't be scalar) for all temporary values, so that a valid MUL becomes a valid lowered MUL. The only way I know of to reproduce the problem is to grab my subgroups work and run the integer multiply reduction tests.<br></div></div>