[Mesa-dev] [PATCH] Revert "i965: Stop aux data compare preventing program binary re-use"

Ben Widawsky ben at bwidawsk.net
Fri Aug 28 09:34:12 PDT 2015


On Fri, Aug 28, 2015 at 10:15:30AM +0300, Pohjolainen, Topi wrote:
> On Fri, Aug 28, 2015 at 09:56:43AM +0300, Pohjolainen, Topi wrote:
> > On Thu, Aug 27, 2015 at 10:05:14AM -0700, Ben Widawsky wrote:
> > > On Thu, Aug 27, 2015 at 10:51:59AM +0300, Pohjolainen, Topi wrote:
> > > > On Wed, Aug 26, 2015 at 03:46:05PM -0700, Ben Widawsky wrote:
> > > > > This reverts commit 1bba29ed403e735ba0bf04ed8aa2e571884fcaaf
> > > > > Author: Topi Pohjolainen <topi.pohjolainen at intel.com>
> > > > > Date:   Thu Jun 25 14:00:41 2015 +0300
> > > > > 
> > > > >     i965: Stop aux data compare preventing program binary re-use
> > > > > 
> > > > > This fixes an intermittent failure in
> > > > > piglit.spec.arb_pixel_buffer_object.texsubimage pbo.sklm64 (maybe other
> > > > > platforms as well, but it is harder to reproduce). I can usually hit the failure
> > > > > within 10 runs of the test. This is a very hairy commit to debug. I'll let Topi
> > > > > handle it, or else we should go with the revert. I am open to either. I got
> > > > > lucky that Jenkins caught this on a run.
> > > > > 
> > > > > Here was the script I used for bisect:
> > > > > 
> > > > > i=0
> > > > > while [ $i -lt 40 ] ; do
> > > > >    ./bin/texsubimage pbo -auto -fbo > /dev/null 2>&1
> > > > >    [[ $? != 0 ]] && echo fail && exit 1
> > > > >    ((i++))
> > > > > done
> > > > > 
> > > > > exit 0
> > > > 
> > > > Should I use different piglit version than the current master? I'm asking
> > > > because I get this with both Mesa master and my patch reverted.
> > > 
> > > My piglit was pretty old. I just updated that, but mesa was master as of
> > > yesterday (output is below)
> > > 
> > > The one bit of advice I can add is that you make sure your system is updated to
> > > the very latest.
> > > 
> > > > 
> > > > testrunner at skl-y:~/topi/piglit$ ./bin/texsubimage pbo -auto -fbo
> > > > Using test set: Core formats
> > > > texsubimage failed
> > > >   target: GL_TEXTURE_2D
> > > >   internal format: GL_COMPRESSED_RGB_S3TC_DXT1_EXT
> > > >   region: 68, 12  32 x 48
> > > > texsubimage failed
> > > >   target: GL_TEXTURE_2D
> > > >   internal format: GL_COMPRESSED_RGBA_S3TC_DXT1_EXT
> > > >   region: 0, 28  116 x 20
> > > > texsubimage failed
> > > >   target: GL_TEXTURE_2D
> > > >   internal format: GL_COMPRESSED_RGBA_S3TC_DXT3_EXT
> > > >   region: 16, 4  60 x 36
> > > > texsubimage failed
> > > >   target: GL_TEXTURE_2D
> > > >   internal format: GL_COMPRESSED_RGBA_S3TC_DXT5_EXT
> > > >   region: 8, 0  104 x 60
> > > > Mesa: User error: GL_INVALID_OPERATION in glTexSubImage2D(out of bounds PBO access)
> > > > PIGLIT: {"result": "fail" }
> > > 
> > > Interesting. It so happens I have a patch that purports to fix some of these
> > > things. I did not try this patch myself for this issue.
> > > http://patchwork.freedesktop.org/patch/54025/
> > > 
> > > (I've been hanging on to this since I needed to do a bit of research to address
> > > Matt's feedback, specifically regarding render compression).
> > > 
> > > > 
> > > > 
> > > > Could you include the console output of the failure you get?
> > > > 
> > > 
> > > Here are two sample failures (occurred in 7 runs)
> > > 
> > > Using test set: Core formats
> > > texsubimage failed
> > >   target: GL_TEXTURE_2D
> > >   internal format: GL_INTENSITY
> > >   region: 27, 1  13 x 61
> > > Mesa: User error: GL_INVALID_OPERATION in glTexSubImage3D(out of bounds PBO access)
> > > 
> > > Using test set: Core formats
> > > texsubimage failed
> > >   target: GL_TEXTURE_2D
> > >   internal format: GL_INTENSITY
> > >   region: 80, 38  24 x 25
> > > texsubimage failed
> > >   target: GL_TEXTURE_2D
> > >   internal format: GL_LUMINANCE8
> > >   region: 50, 5  26 x 58
> > > Mesa: User error: GL_INVALID_OPERATION in glTexSubImage2D(out of bounds PBO access)
> > > PIGLIT: {"result": "fail" }
> > > 
> > > 
> > > > > 
> > > > > Cc: <mesa-stable at lists.freedesktop.org>
> > > > > Cc: Kenneth Graunke <kenneth at whitecape.org>
> > > > > Cc: Topi Pohjolainen <topi.pohjolainen at intel.com>
> > > > > Reported-by: Mark Janes <mark.a.janes at intel.com> (jenkins)
> > > > > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> > > > > ---
> > > > >  src/mesa/drivers/dri/i965/brw_state_cache.c | 52 ++++++++++++++++++-----------
> > > > >  1 file changed, 32 insertions(+), 20 deletions(-)
> > > > > 
> > > > > diff --git a/src/mesa/drivers/dri/i965/brw_state_cache.c b/src/mesa/drivers/dri/i965/brw_state_cache.c
> > > > > index fbc0419..e50d6a0 100644
> > > > > --- a/src/mesa/drivers/dri/i965/brw_state_cache.c
> > > > > +++ b/src/mesa/drivers/dri/i965/brw_state_cache.c
> > > > > @@ -200,23 +200,36 @@ brw_cache_new_bo(struct brw_cache *cache, uint32_t new_size)
> > > > >  }
> > > > >  
> > > > >  /**
> > > > > - * Attempts to find an item in the cache with identical data.
> > > > > + * Attempts to find an item in the cache with identical data and aux
> > > > > + * data to use
> > > > >   */
> > > > > -static const struct brw_cache_item *
> > > > > -brw_lookup_prog(const struct brw_cache *cache,
> > > > > -                enum brw_cache_id cache_id,
> > > > > -                const void *data, unsigned data_size)
> > > > > +static bool
> > > > > +brw_try_upload_using_copy(struct brw_cache *cache,
> > > > > +			  struct brw_cache_item *result_item,
> > > > > +			  const void *data,
> > > > > +			  const void *aux)
> > > > >  {
> > > > > -   const struct brw_context *brw = cache->brw;
> > > > > +   struct brw_context *brw = cache->brw;
> > > > >     unsigned i;
> > > > > -   const struct brw_cache_item *item;
> > > > > +   struct brw_cache_item *item;
> > > > >  
> > > > >     for (i = 0; i < cache->size; i++) {
> > > > >        for (item = cache->items[i]; item; item = item->next) {
> > > > > +	 const void *item_aux = item->key + item->key_size;
> > > > >  	 int ret;
> > > > >  
> > > > > -	 if (item->cache_id != cache_id || item->size != data_size)
> > > > > +	 if (item->cache_id != result_item->cache_id ||
> > > > > +	     item->size != result_item->size ||
> > > > > +	     item->aux_size != result_item->aux_size) {
> > > > > +	    continue;
> > > > > +	 }
> > > > > +
> > > > > +         if (cache->aux_compare[result_item->cache_id]) {
> > > > > +            if (!cache->aux_compare[result_item->cache_id](item_aux, aux))
> > > > > +               continue;
> > > > > +         } else if (memcmp(item_aux, aux, item->aux_size) != 0) {
> > > > >  	    continue;
> > > > > +	 }
> > > > >  
> > > > >           if (!brw->has_llc)
> > > > >              drm_intel_bo_map(cache->bo, false);
> > > > > @@ -226,11 +239,13 @@ brw_lookup_prog(const struct brw_cache *cache,
> > > > >  	 if (ret)
> > > > >  	    continue;
> > > > >  
> > > > > -	 return item;
> > 
> > I'm getting stronger feeling that this is somehow timing related. I forced
> > the cache lookup to always miss by replacing the line above by
> > 
> >          return NULL;
> > 
> > And with Ben's machine against gbm backend the error observed by Ben still
> > happens. This should be the safest mechanism as nothing is re-used.
> > 
> > I'll keep looking.
> 
> Actually, when I revert the patch, I still get:
> 
> ATTENTION: default value of option vblank_mode overridden by environment.
> ATTENTION: default value of option vblank_mode overridden by environment.
> Using test set: Core formats
> Mesa: User error: GL_INVALID_OPERATION in glTexSubImage3D(out of bounds PBO access)
> Unexpected GL error: GL_INVALID_OPERATION 0x502

Sorry for not being clearer. That's right. but that's not the failure I was
referring to. The INVALID_OPERATION will result in a piglit skip, not a failure.

The failure I am asking you to look at is different.


More information about the mesa-dev mailing list