[Mesa-dev] [PATCH v3] workarounds: Update workaround names and platforms

Ben Widawsky benjamin.widawsky at intel.com
Wed Feb 10 06:04:00 UTC 2016


On Tue, Feb 09, 2016 at 10:35:45AM -0800, Matt Turner wrote:
> On Tue, Feb 9, 2016 at 9:44 AM, Sameer Kibey <sameer.kibey at intel.com> wrote:
> > Update the format in which workarounds are documented
> > in the source code. This allows mesa to be parsed
> > by the list-workarounds utility in intel-gpu-tools.
> 
> I don't know that I find this valuable.
> 
> Ben touched on one concern -- keeping it updated. But I have another,
> and that's whether the information is accurate, or useful at all.
> 
> 

I think the bottom line is it really doesn't hurt the existing situation. The
primary benefit is it gives us internally a way to easily compare the
workarounds for different drivers.

> > Signed-off-by: Sameer Kibey <sameer.kibey at intel.com>
> > ---
> >  src/mesa/drivers/dri/i965/brw_binding_tables.c | 3 ++-
> >  src/mesa/drivers/dri/i965/brw_blorp.cpp        | 2 ++
> >  src/mesa/drivers/dri/i965/brw_defines.h        | 2 ++
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c        | 2 ++
> >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 6 ++++++
> >  src/mesa/drivers/dri/i965/brw_pipe_control.c   | 2 ++
> >  src/mesa/drivers/dri/i965/gen6_queryobj.c      | 5 +++--
> >  src/mesa/drivers/dri/i965/gen8_depth_state.c   | 7 ++++---
> >  src/mesa/drivers/dri/i965/intel_batchbuffer.c  | 2 +-
> >  9 files changed, 24 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > index f3a0310..6dd35dd 100644
> > --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > @@ -54,13 +54,14 @@ static uint32_t
> >  reserve_hw_bt_space(struct brw_context *brw, unsigned bytes)
> >  {
> >     /* From the Broadwell PRM, Volume 16, "Workarounds",
> > -    * WaStateBindingTableOverfetch:
> >      * "HW over-fetches two cache lines of binding table indices.  When
> >      *  using the resource streamer, SW needs to pad binding table pointer
> >      *  updates with an additional two cache lines."
> >      *
> >      * Cache lines are 64 bytes, so we subtract 128 bytes from the size of
> >      * the binding table pool buffer.
> > +    *
> > +    * WaStateBindingTableOverfetch:hsw,bdw,chv,bxt
> >      */
> >     if (brw->hw_bt_pool.next_offset + bytes >= brw->hw_bt_pool.bo->size - 128) {
> >        gen7_reset_hw_bt_pool_offsets(brw);
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > index 1bc6d15..f798e29 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > @@ -318,6 +318,8 @@ brw_hiz_op_params::brw_hiz_op_params(struct intel_mipmap_tree *mt,
> >      * SURFACE_STATE.Surface_Horizontal_Alignment should be 4 for Z24 surfaces,
> >      * not 8. But commit 1f112cc increased the alignment from 4 to 8, which
> >      * prevents the clobbering.
> > +    *
> > +    * WaHizAmbiguate8x4Aligned:hsw
> >      */
> >     depth.width = ALIGN(depth.width, 8);
> >     depth.height = ALIGN(depth.height, 4);
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> > index 01e0c99..2146172 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -1762,6 +1762,8 @@ enum brw_message_target {
> >   * WaForceEnableNonCoherent, HDC memory access may have been overridden by the
> >   * kernel to be non-coherent (matching the behavior of the same BTI on
> >   * pre-Gen8 hardware) and BTI 255 may actually be an alias for BTI 253.
> > + *
> > + * WaForceEnableNonCoherent:bdw,chv,skl,kbl
> 
> This comment is just explaining that 255 on Gen8+ doesn't mean the
> same thing as it did before. It passingly mentions
> WaForceEnableNonCoherent, but that's a kernel driver workaround --
> nothing in Mesa.
> 
> Moreover, when I look that up in the workarounds database it says it
> applies to all BDW, CHV, KBL, but only SKL until D0. I'm not sure if
> we even care about D0. Was that publicly released? Mentioning that it
> applies to "skl" doesn't tell the whole story and might lead to us
> applying it when it's not needed.

There is definitely a huge gap regarding what the actual definition of a
workaround is. My definition is actually very different than most workarounds,
which consist of "set this bit". A workaround to me is like, geometry stage is
broken, so we have to do it in software.... but whatever.

I think your concerns about the accuracy are valid. I believe the kernel does
attempt to capture stepping info, and we do not. We (I, anyway) make an effort
to expunge workarounds for platforms that don't ship, so it's hopefully not even
a thing for us.

> 
> >   */
> >  #define GEN8_BTI_STATELESS_IA_COHERENT   255
> >  #define GEN8_BTI_STATELESS_NON_COHERENT  253
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index 35d8039..7a6179a 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -1891,6 +1891,8 @@ void brw_CMP(struct brw_codegen *p,
> >      *
> >      * It also applies to other Gen7 platforms (IVB, BYT) even though it isn't
> >      * mentioned on their work-arounds pages.
> > +    *
> > +    * WaCMPInstNullDstForcesThreadSwitch:ivb,hsw,vlv
> 
> Not related to your patch, but the ivbgt2 Bug database has not worked
> for me... ever? Try the link in the workarounds database for this one.
> Does it work for you? Maybe you've made contacts during this project
> that might know something?
> 
> >      */
> >     if (devinfo->gen == 7) {
> >        if (dest.file == BRW_ARCHITECTURE_REGISTER_FILE &&
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > index 1916a99..3d19605 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > @@ -1846,6 +1846,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
> >            * We choose to split into CMP(8) instructions since disabling
> >            * coissuing would affect CMP instructions not otherwise affected by
> >            * the errata.
> > +          *
> > +          * WaCMPInstFlagDepClearedEarly:ivb,hsw,vlv
> 
> Again, "ivb,hsw,vlv" is misleading (unless I'm just misunderstanding)
> because it only applies to HSW Mobile A1, which we certainly don't
> care about.
> 
> >            */
> >           if (dispatch_width == 16 && devinfo->gen == 7 && !devinfo->is_haswell) {
> 
> And in fact, you can see right here that we don't implement it for HSW.
> 
> >              if (dst.file == BRW_GENERAL_REGISTER_FILE) {
> > @@ -1932,6 +1934,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
> >            * should
> >            *
> >            *    "Force BFI instructions to be executed always in SIMD8."
> > +          *
> > +          * WaForceSIMD8ForBFIInstruction:hsw
> >            */
> >           if (dispatch_width == 16 && devinfo->is_haswell) {
> >              brw_set_default_exec_size(p, BRW_EXECUTE_8);
> > @@ -1954,6 +1958,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
> >            *
> >            * Otherwise we would be able to emit compressed instructions like we
> >            * do for the other three-source instructions.
> > +          *
> > +          * WaForceSIMD8ForBFIInstruction:hsw
> >            */
> >           if (dispatch_width == 16 &&
> >               (devinfo->is_haswell || !devinfo->supports_simd16_3src)) {
> > diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> > index b41e28e..51b20b2 100644
> > --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> > +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> > @@ -67,6 +67,8 @@ gen8_add_cs_stall_workaround_bits(uint32_t *flags)
> >   *
> >   * Note that the kernel does CS stalls between batches, so we only need
> >   * to count them within a batch.
> > + *
> > + * WaCsStallAtEveryFourthPipecontrol:ivb,vlv
> 
> Now I'm confused, because this one also applies to HSW Mobile until B0
> (which we again don't care about), but it's not listed here. How can
> that be?

Sameer, can you address some of Matt's concerns about the accuracy of the
workarounds? I realize you don't know the mesa codebase well, but feel free to
ask questions on IRC if you have them.

> 
> 
> The Windows driver has a big list defining Wa* names in terms of some
> conditional that matches the equivalent of devinfo->gen >= xx. Other
> than doing something like that where the code itself defines what
> workarounds we implement, I don't see how this can be successful.

Yeah. It would be good if this wasn't the end of how we handle workarounds, I
hope it's the beginning of coming up with a good way to manage the situation.
I'm open to disagreement on the following, but: in my head, a QA or integration
person should be able to easily figure out if our driver implements some
workaround. Most importantly I'll reiterate that I don't think this actually
makes the situation any worse than it is now, and I have some pretty high hopes
that we'll be able to utilize some internal tools that do make this useful
(especially when you consider this work is going to list kernel workarounds as
well).

So I guess like I said in my last comment - I hope this pans out, but even if it
doesn't, the patch Sameer is hoping to land here is really innocuous.


-- 
Ben Widawsky, Intel Open Source Technology Center


More information about the mesa-dev mailing list