[Mesa-dev] [PATCH 09/32] i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst.
kenneth at whitecape.org
Fri Feb 6 17:21:10 PST 2015
On Saturday, February 07, 2015 02:10:19 AM Francisco Jerez wrote:
> Matt Turner <mattst88 at gmail.com> writes:
> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> >> Scalar registers are required to have zero stride, fix the
> >> regs_written calculation not to assume that the instruction writes
> >> zero registers in that case.
> >> ---
> >> src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
> >> 1 file changed, 2 insertions(+), 1 deletion(-)
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> index 0a82fb7..bafe438 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> @@ -126,7 +126,8 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
> >> case HW_REG:
> >> case MRF:
> >> case ATTR:
> >> - this->regs_written = (dst.width * dst.stride * type_sz(dst.type) + 31) / 32;
> >> + this->regs_written =
> >> + CEILING(MAX2(dst.width * dst.stride, 1) * type_sz(dst.type), 32);
> > I sent basically the same patch ("[PATCH 06/10] i965/fs: Consider
> > subregister writes to be writing a register.")
> > I don't really care which gets committed, but I did have to think a
> > lot more about yours (and look up what CEILING did) to be sure it did
> > the same thing. Maybe you like mine better?
> I find the helper function easier to understand and less error-prone
> than the open-coded version. I admit though that the "easier to
> understand" part may not be the case for someone that is not familiar
> with CEILING(), as its meaning is far from obvious from its name. Maybe
> we could come up with a more descriptive name?
Wow, I don't like the name of that macro at all. I would expect it to
do the well known mathematical function, ceil(). It doesn't.
In most places in the driver, we do:
ALIGN(value, 32) / 32;
which would be my preference. I definitely dislike the old open coded
(val + 31) / 32 business.
Beyond the style...Matt put MAX2 around the whole expression, and you
put it around the inside. Those might differ - is one more correct than
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 819 bytes
Desc: This is a digitally signed message part.
More information about the mesa-dev