[Mesa-dev] [PATCH 09/32] i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst.
mattst88 at gmail.com
Sun Feb 8 11:20:57 PST 2015
On Sun, Feb 8, 2015 at 4:52 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> Kenneth Graunke <kenneth at whitecape.org> writes:
>> On Saturday, February 07, 2015 02:10:19 AM Francisco Jerez wrote:
>>> Matt Turner <mattst88 at gmail.com> writes:
>>> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>> >> Scalar registers are required to have zero stride, fix the
>>> >> regs_written calculation not to assume that the instruction writes
>>> >> zero registers in that case.
>>> >> ---
>>> >> src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
>>> >> 1 file changed, 2 insertions(+), 1 deletion(-)
>>> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> >> index 0a82fb7..bafe438 100644
>>> >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> >> @@ -126,7 +126,8 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
>>> >> case HW_REG:
>>> >> case MRF:
>>> >> case ATTR:
>>> >> - this->regs_written = (dst.width * dst.stride * type_sz(dst.type) + 31) / 32;
>>> >> + this->regs_written =
>>> >> + CEILING(MAX2(dst.width * dst.stride, 1) * type_sz(dst.type), 32);
>>> > I sent basically the same patch ("[PATCH 06/10] i965/fs: Consider
>>> > subregister writes to be writing a register.")
>>> > I don't really care which gets committed, but I did have to think a
>>> > lot more about yours (and look up what CEILING did) to be sure it did
>>> > the same thing. Maybe you like mine better?
>>> I find the helper function easier to understand and less error-prone
>>> than the open-coded version. I admit though that the "easier to
>>> understand" part may not be the case for someone that is not familiar
>>> with CEILING(), as its meaning is far from obvious from its name. Maybe
>>> we could come up with a more descriptive name?
>> Wow, I don't like the name of that macro at all. I would expect it to
>> do the well known mathematical function, ceil(). It doesn't.
> It does, well, sort of... I'm not sure who came up with that macro, but
> it seems that what the author had in mind was indeed taking the ceiling
> of the fraction given by p/q for CEILING(p, q).
>> In most places in the driver, we do:
>> ALIGN(value, 32) / 32;
>> which would be my preference. I definitely dislike the old open coded
>> (val + 31) / 32 business.
> That only works for powers of two, and it still forces you to duplicate
> the denominator -- in this specific case it's okay but it starts to get
> annoying when the denominator is longer than two keystrokes. How about
> we rename CEILING() to DIV_ROUND_UP() as the kernel calls it?
I like that suggestion.
More information about the mesa-dev