[Mesa-stable] [PATCH] Revert "i965: Stop aux data compare preventing program binary re-use"
Ben Widawsky
ben at bwidawsk.net
Thu Sep 24 10:14:56 PDT 2015
On Thu, Sep 24, 2015 at 11:41:15AM +0100, Emil Velikov wrote:
> Hi guys,
>
> On 2 September 2015 at 20:10, Pohjolainen, Topi
> <topi.pohjolainen at intel.com> wrote:
> > On Thu, Aug 27, 2015 at 10:05:14AM -0700, Ben Widawsky wrote:
> >> On Thu, Aug 27, 2015 at 10:51:59AM +0300, Pohjolainen, Topi wrote:
> >> > On Wed, Aug 26, 2015 at 03:46:05PM -0700, Ben Widawsky wrote:
> >> > > This reverts commit 1bba29ed403e735ba0bf04ed8aa2e571884fcaaf
> >> > > Author: Topi Pohjolainen <topi.pohjolainen at intel.com>
> >> > > Date: Thu Jun 25 14:00:41 2015 +0300
> >> > >
> >> > > i965: Stop aux data compare preventing program binary re-use
> >> > >
> >> > > This fixes an intermittent failure in
> >> > > piglit.spec.arb_pixel_buffer_object.texsubimage pbo.sklm64 (maybe other
> >> > > platforms as well, but it is harder to reproduce). I can usually hit the failure
> >> > > within 10 runs of the test. This is a very hairy commit to debug. I'll let Topi
> >> > > handle it, or else we should go with the revert. I am open to either. I got
> >> > > lucky that Jenkins caught this on a run.
> >> > >
> >> > > Here was the script I used for bisect:
> >> > >
> >> > > i=0
> >> > > while [ $i -lt 40 ] ; do
> >> > > ./bin/texsubimage pbo -auto -fbo > /dev/null 2>&1
> >> > > [[ $? != 0 ]] && echo fail && exit 1
> >> > > ((i++))
> >> > > done
> >> > >
> >> > > exit 0
> >> >
> >> > Should I use different piglit version than the current master? I'm asking
> >> > because I get this with both Mesa master and my patch reverted.
> >>
> >> My piglit was pretty old. I just updated that, but mesa was master as of
> >> yesterday (output is below)
> >>
> >> The one bit of advice I can add is that you make sure your system is updated to
> >> the very latest.
> >>
> >> >
> >> > testrunner at skl-y:~/topi/piglit$ ./bin/texsubimage pbo -auto -fbo
> >> > Using test set: Core formats
> >> > texsubimage failed
> >> > target: GL_TEXTURE_2D
> >> > internal format: GL_COMPRESSED_RGB_S3TC_DXT1_EXT
> >> > region: 68, 12 32 x 48
> >> > texsubimage failed
> >> > target: GL_TEXTURE_2D
> >> > internal format: GL_COMPRESSED_RGBA_S3TC_DXT1_EXT
> >> > region: 0, 28 116 x 20
> >> > texsubimage failed
> >> > target: GL_TEXTURE_2D
> >> > internal format: GL_COMPRESSED_RGBA_S3TC_DXT3_EXT
> >> > region: 16, 4 60 x 36
> >> > texsubimage failed
> >> > target: GL_TEXTURE_2D
> >> > internal format: GL_COMPRESSED_RGBA_S3TC_DXT5_EXT
> >> > region: 8, 0 104 x 60
> >> > Mesa: User error: GL_INVALID_OPERATION in glTexSubImage2D(out of bounds PBO access)
> >> > PIGLIT: {"result": "fail" }
> >>
> >> Interesting. It so happens I have a patch that purports to fix some of these
> >> things. I did not try this patch myself for this issue.
> >> http://patchwork.freedesktop.org/patch/54025/
> >>
> >> (I've been hanging on to this since I needed to do a bit of research to address
> >> Matt's feedback, specifically regarding render compression).
> >>
> >> >
> >> >
> >> > Could you include the console output of the failure you get?
> >> >
> >>
> >> Here are two sample failures (occurred in 7 runs)
> >>
> >> Using test set: Core formats
> >> texsubimage failed
> >> target: GL_TEXTURE_2D
> >> internal format: GL_INTENSITY
> >> region: 27, 1 13 x 61
> >> Mesa: User error: GL_INVALID_OPERATION in glTexSubImage3D(out of bounds PBO access)
> >
> > I got more and more a feeling that this is timing related. As the current
> > logic fails to ever re-use the items in the cache, the comparisons are
> > redundant and we can hardcode the search to always fail. This changes the
> > runtime dynamics closer to the patch being reverted:
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_state_cache.c b/src/mesa/drivers/dri/
> > index f5f279c..746fe7b 100644
> > --- a/src/mesa/drivers/dri/i965/brw_state_cache.c
> > +++ b/src/mesa/drivers/dri/i965/brw_state_cache.c
> > @@ -224,6 +224,8 @@ brw_try_upload_using_copy(struct brw_cache *cache,
> > continue;
> > }
> >
> > + return false;
> > +
> > if (cache->aux_compare[result_item->cache_id]) {
> > if (!cache->aux_compare[result_item->cache_id](item_aux, aux))
> > continue;
> >
> >
> > I can reproduce the error on IVB when I disable fast clears (I noticed that
> > the error is not SKL specific - it depends on the fast clears being disabled).
> Based on the discussion so far, I take it that it's unlikely that
> we'll revert the commit at this point ?
> Should we drop this patch for now ?
>
> Thanks
> Emil
Yes. So far as we can tell, the regression is real and fixed by the revert - but
it's only the one test, and only on SKL. The bug is being tracked here:
https://bugs.freedesktop.org/show_bug.cgi?id=91926
Topi and I discussed a few things...
1. Merging the patch.
2. Merging the patch to stable only
3. Doing nothing. There is a bugzilla people can find.
I am in favor of #3
More information about the mesa-stable
mailing list