[Mesa-dev] [PATCH 09/32] i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst.

Matt Turner mattst88 at gmail.com
Sun Feb 8 11:20:57 PST 2015


On Sun, Feb 8, 2015 at 4:52 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> Kenneth Graunke <kenneth at whitecape.org> writes:
>
>> On Saturday, February 07, 2015 02:10:19 AM Francisco Jerez wrote:
>>> Matt Turner <mattst88 at gmail.com> writes:
>>>
>>> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>> >> Scalar registers are required to have zero stride, fix the
>>> >> regs_written calculation not to assume that the instruction writes
>>> >> zero registers in that case.
>>> >> ---
>>> >>  src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
>>> >>  1 file changed, 2 insertions(+), 1 deletion(-)
>>> >>
>>> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> >> index 0a82fb7..bafe438 100644
>>> >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> >> @@ -126,7 +126,8 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
>>> >>     case HW_REG:
>>> >>     case MRF:
>>> >>     case ATTR:
>>> >> -      this->regs_written = (dst.width * dst.stride * type_sz(dst.type) + 31) / 32;
>>> >> +      this->regs_written =
>>> >> +         CEILING(MAX2(dst.width * dst.stride, 1) * type_sz(dst.type), 32);
>>> >
>>> > I sent basically the same patch ("[PATCH 06/10] i965/fs: Consider
>>> > subregister writes to be writing a register.")
>>> >
>>> > I don't really care which gets committed, but I did have to think a
>>> > lot more about yours (and look up what CEILING did) to be sure it did
>>> > the same thing. Maybe you like mine better?
>>>
>>> I find the helper function easier to understand and less error-prone
>>> than the open-coded version.  I admit though that the "easier to
>>> understand" part may not be the case for someone that is not familiar
>>> with CEILING(), as its meaning is far from obvious from its name.  Maybe
>>> we could come up with a more descriptive name?
>>
>> Wow, I don't like the name of that macro at all.  I would expect it to
>> do the well known mathematical function, ceil().  It doesn't.
>>
> It does, well, sort of...  I'm not sure who came up with that macro, but
> it seems that what the author had in mind was indeed taking the ceiling
> of the fraction given by p/q for CEILING(p, q).
>
>> In most places in the driver, we do:
>>
>>    ALIGN(value, 32) / 32;
>>
>> which would be my preference.  I definitely dislike the old open coded
>> (val + 31) / 32 business.
>>
> That only works for powers of two, and it still forces you to duplicate
> the denominator -- in this specific case it's okay but it starts to get
> annoying when the denominator is longer than two keystrokes.  How about
> we rename CEILING() to DIV_ROUND_UP() as the kernel calls it?

I like that suggestion.


More information about the mesa-dev mailing list