[Mesa-dev] [PATCH 00/57] i965/ir: Switch representation of register offsets and sizes to byte units.
Iago Toral
itoral at igalia.com
Tue Sep 13 06:58:41 UTC 2016
On Mon, 2016-09-12 at 13:32 -0700, Francisco Jerez wrote:
> Francisco Jerez <currojerez at riseup.net> writes:
>
> >
> > Iago Toral <itoral at igalia.com> writes:
> >
> > >
> > > On Fri, 2016-09-09 at 11:37 +0200, Iago Toral wrote:
> > > >
> > > > On Thu, 2016-09-08 at 11:36 +0200, Iago Toral wrote:
> > > > >
> > > > >
> > > > > On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > This series reworks the representation of register region
> > > > > > offsets
> > > > > > in
> > > > > > the i965 IR to be universally byte-based instead of the
> > > > > > rather
> > > > > > awkward
> > > > > > split between reg_offset and subreg_offset we have in the
> > > > > > FS
> > > > > > back-
> > > > > > end
> > > > > > right now, or the reg_offset field currently used in the
> > > > > > VEC4 IR
> > > > > > which
> > > > > > doesn't allow better granularity than 32B. The most
> > > > > > immediate
> > > > > > motivation is to enable sub-GRF offsets in the VEC4 back-
> > > > > > end,
> > > > > > which
> > > > > > will be useful for various kinds of lowering and
> > > > > > instruction
> > > > > > splitting
> > > > > > required for FP64 support on VEC4 platforms.
> > > > > Thanks a lot for taking care of this!
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Patches 01-11 take care of scaling the regs_written and
> > > > > > regs_read
> > > > > > instruction methods on both back-ends and the reg_offset
> > > > > > register
> > > > > > field of VEC4 IR registers. The fs_reg::reg_offset and
> > > > > > ::subreg_offset fields are also unified into a single
> > > > > > register
> > > > > > field.
> > > > > > Because this part of the series is rather bulky I've tried
> > > > > > to
> > > > > > keep
> > > > > > the
> > > > > > changes as obvious and functionally equivalent as possible
> > > > > > at the
> > > > > > cost
> > > > > > of introducing not particularly clever code in some cases
> > > > > > that
> > > > > > could
> > > > > > be simplified with some knowledge of the context. Patches
> > > > > > 31-46
> > > > > > make
> > > > > > a second pass through the code touched in the first part of
> > > > > > the
> > > > > > series
> > > > > > in order to get rid of an amount of cruft.
> > > > > I have reviewed this part and intend to continue reviewing
> > > > > the rest
> > > > > in
> > > > > the following days. Might be a good idea to have another set
> > > > > of
> > > > > eyes
> > > > > reviewing the series or at least skimming through it just in
> > > > > case.
> > > > >
> > > > > You might want to look at least at the second comment I made
> > > > > to
> > > > > patch
> > > > > 2
> > > > > and the comment to patch 7, since these might be actual
> > > > > problems.
> > > > > All
> > > > > other comments are minor things or small clarifications and
> > > > > you can
> > > > > ignore them if you want.
> > > > >
> > > > > With the issues I point out in patches 2 and 7 addressed (or
> > > > > confirmation from your side that these are not real
> > > > > problems),
> > > > > patches
> > > > > 1-11 are:
> > > > >
> > > > > Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Patches 12-30 address an amount of bugs that became obvious
> > > > > > during
> > > > > > the
> > > > > > conversion to byte units, some of them seem worrying enough
> > > > > > that
> > > > > > it
> > > > > > might make sense to back-port them to stable releases.
> > > > I have just reviewed this part too and it looks good to me.
> > > >
> > > > You probably want to look at least at the comment to patch 25,
> > > > since
> > > > I
> > > > think that might be a genuine mistake. The comment to patch 27
> > > > points
> > > > out a possible problem (but even if it is a problem it is not
> > > > really
> > > > related this patch set). I find patch 28 a bit confusing since
> > > > it
> > > > seems
> > > > to be doing something that is not quite right and then fixing
> > > > it in
> > > > patch 29, I may be misunderstanding what is going on there, but
> > > > otherwise I think it would be best to squash both together.
> > > >
> > > > Other than these things, patches 12-31 are:
> > > >
> > > > Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
> > > Patches 32-46 are:
> > >
> > > Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
> > >
> > > I hope to finish reviewing the series tomorrow.
> > >
> > Thanks!
> >
> BTW, I forgot to point you at this branch which may be useful to you
> if
> you want to rebase your FP64 series on top:
>
> https://cgit.freedesktop.org/~currojerez/mesa/log/?h=i965-ir-byte-uni
> ts
Great, I'll use that, thanks!
> >
> > >
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > Patches 47-57 go through the VEC4 back-end and address a
> > > > > > number
> > > > > > of
> > > > > > issues that would arise in existing optimization passes
> > > > > > with
> > > > > > non-GRF-aligned regions, which will be useful for FP64
> > > > > > support. It's
> > > > > > likely not complete and the handling of sub-GRF offsets
> > > > > > doesn't
> > > > > > attempt to be nearly as clever as the FS back-end, but they
> > > > > > should
> > > > > > make a substantial improvement over the current situation.
> > > > > >
> > > > > > [PATCH 01/57] i965/fs: Replace fs_reg::reg_offset with
> > > > > > fs_reg::offset
> > > > > > expressed in bytes.
> > > > > > [PATCH 02/57] i965/vec4: Replace dst/src_reg::reg_offset
> > > > > > with
> > > > > > dst/src_reg::offset expressed in bytes.
> > > > > > [PATCH 03/57] i965/ir: Remove backend_reg::reg_offset.
> > > > > > [PATCH 04/57] i965/fs: Replace fs_reg::subreg_offset with
> > > > > > fs_reg::offset expressed in bytes.
> > > > > > [PATCH 05/57] i965/fs: Add wrapper functions for
> > > > > > fs_inst::regs_read
> > > > > > and ::regs_written.
> > > > > > [PATCH 06/57] i965/vec4: Add wrapper functions for
> > > > > > vec4_instruction::regs_read and ::regs_written.
> > > > > > [PATCH 07/57] i965/fs: Replace fs_inst::regs_written with
> > > > > > ::size_written field in bytes.
> > > > > > [PATCH 08/57] i965/vec4: Replace
> > > > > > vec4_instruction::regs_written
> > > > > > with
> > > > > > ::size_written field in bytes.
> > > > > > [PATCH 09/57] i965/ir: Drop
> > > > > > backend_instruction::regs_written
> > > > > > field.
> > > > > > [PATCH 10/57] i965/fs: Replace fs_inst::regs_read with
> > > > > > ::size_read
> > > > > > using byte units.
> > > > > > [PATCH 11/57] i965/vec4: Replace
> > > > > > vec4_instruction::regs_read with
> > > > > > ::size_read using byte units.
> > > > > > [PATCH 12/57] i965/fs: Return more accurate read size from
> > > > > > fs_inst::size_read for IMM and UNIFORM files.
> > > > > > [PATCH 13/57] i965/fs: Return more accurate read size for
> > > > > > LINTERP
> > > > > > from fs_inst::size_read.
> > > > > > [PATCH 14/57] i965/fs: Handle arbitrary offsets in
> > > > > > brw_reg_from_fs_reg for MRF/VGRF registers.
> > > > > > [PATCH 15/57] i965/fs: Handle fixed HW GRF subnr in
> > > > > > reg_offset().
> > > > > > [PATCH 16/57] i965/fs: Take into account trailing padding
> > > > > > in
> > > > > > regs_written() and regs_read().
> > > > > > [PATCH 17/57] i965/fs: Take into account misalignment in
> > > > > > regs_written() and regs_read().
> > > > > > [PATCH 18/57] i965/vec4: Take into account misalignment in
> > > > > > regs_written() and regs_read().
> > > > > > [PATCH 19/57] i965/fs: Don't consider LOAD_PAYLOAD with
> > > > > > sub-GRF
> > > > > > offset to behave like a raw copy.
> > > > > > [PATCH 20/57] i965/fs: Don't consider LOAD_PAYLOAD with
> > > > > > stride >
> > > > > > 1
> > > > > > source to behave like a raw copy.
> > > > > > [PATCH 21/57] i965/fs: Compare full register offsets in
> > > > > > cmod
> > > > > > propagation pass.
> > > > > > [PATCH 22/57] i965/fs: Fix can_propagate_from()
> > > > > > source/destination
> > > > > > overlap check.
> > > > > > [PATCH 23/57] i965/fs: Fix LOAD_PAYLOAD handling in
> > > > > > register
> > > > > > coalesce
> > > > > > is_nop_mov().
> > > > > > [PATCH 24/57] i965/fs: Drop fs_inst::overwrites_reg() in
> > > > > > favor of
> > > > > > regions_overlap().
> > > > > > [PATCH 25/57] i965/fs: Stop using fs_reg::in_range() in
> > > > > > favor of
> > > > > > regions_overlap().
> > > > > > [PATCH 26/57] i965/vec4: Port regions_overlap() to the vec4
> > > > > > IR.
> > > > > > [PATCH 27/57] i965/vec4: Drop backend_reg::in_range() in
> > > > > > favor of
> > > > > > regions_overlap().
> > > > > > [PATCH 28/57] i965/fs: Take into account copy register
> > > > > > offset
> > > > > > during
> > > > > > compute-to-mrf.
> > > > > > [PATCH 29/57] i965/fs: Fix bogus sub-MRF offset calculation
> > > > > > in
> > > > > > compute-to-mrf.
> > > > > > [PATCH 30/57] i965/fs: Switch mask_relative_to() used in
> > > > > > compute-
> > > > > > to-
> > > > > > mrf to byte units.
> > > > > > [PATCH 31/57] i965/fs: Fix signedness of the return value
> > > > > > of
> > > > > > fs_inst::size_read().
> > > > > > [PATCH 32/57] i965/fs: Simplify byte_offset().
> > > > > > [PATCH 33/57] i965/fs: Simplify
> > > > > > get_fpu_lowered_simd_width() by
> > > > > > using
> > > > > > inequalities instead of rounding.
> > > > > > [PATCH 34/57] i965/fs: Simplify and fix buggy stride/offset
> > > > > > calculations using subscript().
> > > > > > [PATCH 35/57] i965/fs: Simplify result_live calculation in
> > > > > > dead_code_eliminate().
> > > > > > [PATCH 36/57] i965/fs: Simplify a bunch of
> > > > > > fs_inst::size_written
> > > > > > calculations by using component_size().
> > > > > > [PATCH 37/57] i965/fs: Simplify copy propagation
> > > > > > LOAD_PAYLOAD ACP
> > > > > > setup.
> > > > > > [PATCH 38/57] i965/fs: Change region_contained_in() to use
> > > > > > byte
> > > > > > units.
> > > > > > [PATCH 39/57] i965/fs: Move region_contained_in to the IR
> > > > > > header
> > > > > > and
> > > > > > fix for non-VGRF files.
> > > > > > [PATCH 40/57] i965/fs: Use region_contained_in() in
> > > > > > compute-to-
> > > > > > mrf
> > > > > > coalescing pass.
> > > > > > [PATCH 41/57] i965/fs: Get rid of fs_inst::set_smear().
> > > > > > [PATCH 42/57] i965/fs: Misc simplification.
> > > > > > [PATCH 43/57] i965/fs: Print fs_reg::offset field
> > > > > > consistently
> > > > > > for
> > > > > > all register files.
> > > > > > [PATCH 44/57] i965/vec4: Print src/dst_reg::offset field
> > > > > > consistently
> > > > > > for all register files.
> > > > > > [PATCH 45/57] i965/ir: Don't print ARF subnr values twice.
> > > > > > [PATCH 46/57] i965/ir: Update several stale comments.
> > > > > > [PATCH 47/57] i965/vec4: Simplify src/dst_reg to brw_reg
> > > > > > conversion
> > > > > > by using byte_offset().
> > > > > > [PATCH 48/57] i965/vec4: Change opt_vector_float to keep
> > > > > > track of
> > > > > > the
> > > > > > last offset seen in bytes.
> > > > > > [PATCH 49/57] i965/vec4: Check that the write offsets match
> > > > > > when
> > > > > > setting dependency controls.
> > > > > > [PATCH 50/57] i965/vec4: Compare full register offsets in
> > > > > > opt_register_coalesce nop move check.
> > > > > > [PATCH 51/57] i965/vec4: Don't coalesce registers with
> > > > > > overlapping
> > > > > > writes not matching the MOV source.
> > > > > > [PATCH 52/57] i965/vec4: Assign correct destination offset
> > > > > > to
> > > > > > rewritten instruction in register coalesce.
> > > > > > [PATCH 53/57] i965/vec4: Compare full register offsets in
> > > > > > cmod
> > > > > > propagation.
> > > > > > [PATCH 54/57] i965/vec4: Fix copy propagation for non-
> > > > > > register-
> > > > > > aligned regions.
> > > > > > [PATCH 55/57] i965/vec4: Don't spill non-GRF-aligned
> > > > > > register
> > > > > > regions.
> > > > > > [PATCH 56/57] i965/vec4: Assert that ATTR regions are
> > > > > > register-
> > > > > > aligned.
> > > > > > [PATCH 57/57] i965/vec4: Assert that pull constant load
> > > > > > offsets
> > > > > > are
> > > > > > 16B-aligned.
More information about the mesa-dev
mailing list