[Mesa-dev] [PATCH 09/32] i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst.

Kenneth Graunke kenneth at whitecape.org
Fri Feb 6 17:21:10 PST 2015


On Saturday, February 07, 2015 02:10:19 AM Francisco Jerez wrote:
> Matt Turner <mattst88 at gmail.com> writes:
> 
> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> >> Scalar registers are required to have zero stride, fix the
> >> regs_written calculation not to assume that the instruction writes
> >> zero registers in that case.
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> index 0a82fb7..bafe438 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> @@ -126,7 +126,8 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
> >>     case HW_REG:
> >>     case MRF:
> >>     case ATTR:
> >> -      this->regs_written = (dst.width * dst.stride * type_sz(dst.type) + 31) / 32;
> >> +      this->regs_written =
> >> +         CEILING(MAX2(dst.width * dst.stride, 1) * type_sz(dst.type), 32);
> >
> > I sent basically the same patch ("[PATCH 06/10] i965/fs: Consider
> > subregister writes to be writing a register.")
> >
> > I don't really care which gets committed, but I did have to think a
> > lot more about yours (and look up what CEILING did) to be sure it did
> > the same thing. Maybe you like mine better?
> 
> I find the helper function easier to understand and less error-prone
> than the open-coded version.  I admit though that the "easier to
> understand" part may not be the case for someone that is not familiar
> with CEILING(), as its meaning is far from obvious from its name.  Maybe
> we could come up with a more descriptive name?

Wow, I don't like the name of that macro at all.  I would expect it to
do the well known mathematical function, ceil().  It doesn't.

In most places in the driver, we do:

   ALIGN(value, 32) / 32;

which would be my preference.  I definitely dislike the old open coded
(val + 31) / 32 business.

Beyond the style...Matt put MAX2 around the whole expression, and you
put it around the inside.  Those might differ - is one more correct than
the other?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150206/11464fda/attachment.sig>


More information about the mesa-dev mailing list