[Mesa-dev] [PATCH] llvmpipe: fix a bug in opaque optimization

Thu Jun 27 06:04:51 PDT 2013

Am 27.06.2013 14:51, schrieb Roland Scheidegger:
> Am 27.06.2013 09:48, schrieb Jose Fonseca:
>> ----- Original Message -----
>>> From: Roland Scheidegger <sroland at vmware.com>
>>>
>>> If there are queries active the opaque optimization reseting the bin needs to
>>> be disabled.
>>> (Not really tested since the bug was discovered by code inspection not
>>> an actual test failure.)
>>> ---
>>>  src/gallium/drivers/llvmpipe/lp_query.c     |    2 ++
>>>  src/gallium/drivers/llvmpipe/lp_scene.h     |    2 ++
>>>  src/gallium/drivers/llvmpipe/lp_setup.c     |    4 ++++
>>>  src/gallium/drivers/llvmpipe/lp_setup_tri.c |   28
>>>  +++++++++++++--------------
>>>  4 files changed, 22 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/llvmpipe/lp_query.c
>>> b/src/gallium/drivers/llvmpipe/lp_query.c
>>> index ac6d09d..7bd090d 100644
>>> --- a/src/gallium/drivers/llvmpipe/lp_query.c
>>> +++ b/src/gallium/drivers/llvmpipe/lp_query.c
>>> @@ -134,6 +134,8 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
>>>           if (pq->end[i] > *result) {
>>>              *result = pq->end[i];
>>>           }
>>> +         /* XXX if this would happen then querying this multiple times
>>> +          * would return a different result each time... */
>>>           if (*result == 0)
>>>              *result = os_time_get_nano();
>>>        }
>>
>> This chunk seems unrelated to the rest of the change and should be in a separate change.  Furthermore seems easily fixable -- just store the value on the put the value pq->count[i] itself.
>>
>> It also seems that if-statement should be after the loop.
> Ok I think I'm just going to nuke it instead. Given how this currently
> works (scene is set to active, query binned) this looks really
> impossible. (I am not sure if it actually makes a lot of sense to bin
> this since the value won't really be more exact than a single
> os_time_get_nano() value at the end of the scene anyway but as long as
> the scene is set to active this still couldn't happen.)

Hmm actually I guess this isn't true. The value isn't really a lot more
accurate but binning it ensures that if multiple ones are in the scene
then those later in the scene are really later if only by a tiny bit
(certainly not by the amount which would really indicate "true" time it
took for the work to be done between the multiple timestamps but I guess
it's better than nothing).
For somewhat accurate count it would probably be necessary to sum the
result from each bin and then divide by the number of bins when getting
the query result.

>>
>>> diff --git a/src/gallium/drivers/llvmpipe/lp_scene.h
>>> b/src/gallium/drivers/llvmpipe/lp_scene.h
>>> index 59cce7d..5501d40 100644
>>> --- a/src/gallium/drivers/llvmpipe/lp_scene.h
>>> +++ b/src/gallium/drivers/llvmpipe/lp_scene.h
>>> @@ -132,6 +132,8 @@ struct lp_scene {
>>>     /* The queries still active at end of scene */
>>>     struct llvmpipe_query *active_queries[LP_MAX_ACTIVE_BINNED_QUERIES];
>>>     unsigned num_active_queries;
>>> +   /* If queries were either active or there were begin/end query commands
>>> */
>>> +   boolean had_queries;
>>>  
>>>     /* Framebuffer mappings - valid only between begin_rasterization()
>>>      * and end_rasterization().
>>> diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c
>>> b/src/gallium/drivers/llvmpipe/lp_setup.c
>>> index 49aead2..49b61c3 100644
>>> --- a/src/gallium/drivers/llvmpipe/lp_setup.c
>>> +++ b/src/gallium/drivers/llvmpipe/lp_setup.c
>>> @@ -237,6 +237,8 @@ begin_binning( struct lp_setup_context *setup )
>>>     setup->clear.zsmask = 0;
>>>     setup->clear.zsvalue = 0;
>>>  
>>> +   scene->had_queries = !!setup->active_binned_queries;
>>> +
>>>     LP_DBG(DEBUG_SETUP, "%s done\n", __FUNCTION__);
>>>     return TRUE;
>>>  }
>>> @@ -1237,6 +1239,7 @@ lp_setup_begin_query(struct lp_setup_context *setup,
>>>              return;
>>>           }
>>>        }
>>> +      setup->scene->had_queries |= TRUE;
>>>     }
>>>  }
>>>  
>>> @@ -1272,6 +1275,7 @@ lp_setup_end_query(struct lp_setup_context *setup,
>>> struct llvmpipe_query *pq)
>>>                 goto fail;
>>>              }
>>>           }
>>> +         setup->scene->had_queries |= TRUE;
>>>        }
>>>     }
>>>     else {
>>> diff --git a/src/gallium/drivers/llvmpipe/lp_setup_tri.c
>>> b/src/gallium/drivers/llvmpipe/lp_setup_tri.c
>>> index 6dc136c..dbfad46 100644
>>> --- a/src/gallium/drivers/llvmpipe/lp_setup_tri.c
>>> +++ b/src/gallium/drivers/llvmpipe/lp_setup_tri.c
>>> @@ -204,22 +204,22 @@ lp_setup_whole_tile(struct lp_setup_context *setup,
>>>     LP_COUNT(nr_fully_covered_64);
>>>  
>>>     /* if variant is opaque and scissor doesn't effect the tile */
>>> -   /*
>>> -    * Need to disable this optimization for layered rendering and cannot use
>>> -    * setup->layer_slot here to determine it, because it could incorrectly
>>> -    * reset the tile if a previous shader used layer_slot but not this one
>>> -    * (or maybe even "undo" clears). So determine this from presence of
>>> layers
>>> -    * instead (in which case layer_slot will have no effect).
>>> -    */
>>> -   if (inputs->opaque && scene->fb_max_layer == 0) {
>>> -      if (!scene->fb.zsbuf) {
>>> +   if (inputs->opaque) {
>>> +      /* Several things prevent this optimization from working:
>>
>> Could you please reformat the comment as an itemized list -- it would make it easier to read.
> Ok.
> 
>>
>>> +       * For layered rendering we can't determine if this covers the same
>>> layer
>>> +       * as previous rendering (or in case of clears those actually always
>>> cover
>>> +       * all layers so optimization is impossible). Need to use fb_max_layer
>>> and
>>> +       * not setup->layer_slot to determine this since even if there's
>>> currently
>>> +       * no slot assigned previous rendering could have used one.
>>> +       * If there were any Begin/End query commands in the scene then those
>>> +       * would get removed which would be very wrong. Furthermore, if
>>> queries
>>> +       * were just active we also can't do the optimization since to get
>>> +       * accurate query results we unfortunately need to execute the
>>> rendering
>>> +       * commands.
>>> +       */
>>> +      if (!scene->fb.zsbuf && scene->fb_max_layer == 0 &&
>>> !scene->had_queries) {
>>>           /*
>>>            * All previous rendering will be overwritten so reset the bin.
>>> -          * XXX This is wrong wrt to all queries arriving here (timestamp,
>>> -          * occlusion, ps invocations). Not counting stuff might be ok but
>>> it
>>> -          * will kill the begin/end query commands too which is definitely
>>> -          * wrong (and at this point we don't even know if there were any
>>> -          * such commands here).
>>>            */
>>>           lp_scene_bin_reset( scene, tx, ty );
>>>        }
>>> --
>>> 1.7.9.5
>>>
>>
>> Otherwise looks great.
>>
>> Jose
>>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>