[Mesa-dev] [PATCH 3/9] st/mesa: completely rewrite state atoms
Rob Clark
robdclark at gmail.com
Tue Jul 26 02:30:55 UTC 2016
On Mon, Jul 25, 2016 at 1:19 PM, Marek Olšák <maraeo at gmail.com> wrote:
> On Mon, Jul 25, 2016 at 5:42 PM, Rob Clark <robdclark at gmail.com> wrote:
>> On Mon, Jul 25, 2016 at 11:16 AM, Brian Paul <brianp at vmware.com> wrote:
>>> On 07/18/2016 07:11 AM, Marek Olšák wrote:
>>>>
>>>> @@ -183,49 +107,42 @@ static void check_attrib_edgeflag(struct st_context
>>>> *st)
>>>>
>>>> void st_validate_state( struct st_context *st, enum st_pipeline pipeline
>>>> )
>>>> {
>>>> - const struct st_tracked_state **atoms;
>>>> - struct st_state_flags *state;
>>>> - GLuint num_atoms;
>>>> - GLuint i;
>>>> + uint64_t dirty, pipeline_mask;
>>>> + uint32_t dirty_lo, dirty_hi;
>>>> +
>>>> + /* Get Mesa driver state. */
>>>> + st->dirty |= st->ctx->NewDriverState & ST_ALL_STATES_MASK;
>>>> + st->ctx->NewDriverState = 0;
>>>>
>>>> /* Get pipeline state. */
>>>> switch (pipeline) {
>>>> - case ST_PIPELINE_RENDER:
>>>> - atoms = render_atoms;
>>>> - num_atoms = ARRAY_SIZE(render_atoms);
>>>> - state = &st->dirty;
>>>> + case ST_PIPELINE_RENDER:
>>>> + check_attrib_edgeflag(st);
>>>> + check_program_state(st);
>>>> + st_manager_validate_framebuffers(st);
>>>> +
>>>> + pipeline_mask = ST_PIPELINE_RENDER_STATE_MASK;
>>>> break;
>>>> case ST_PIPELINE_COMPUTE:
>>>> - atoms = compute_atoms;
>>>> - num_atoms = ARRAY_SIZE(compute_atoms);
>>>> - state = &st->dirty_cp;
>>>> + pipeline_mask = ST_PIPELINE_COMPUTE_STATE_MASK;
>>>> break;
>>>> default:
>>>> unreachable("Invalid pipeline specified");
>>>> }
>>>>
>>>> - /* Get Mesa driver state. */
>>>> - st->dirty.st |= st->ctx->NewDriverState;
>>>> - st->dirty_cp.st |= st->ctx->NewDriverState;
>>>> - st->ctx->NewDriverState = 0;
>>>> -
>>>> - if (pipeline == ST_PIPELINE_RENDER) {
>>>> - check_attrib_edgeflag(st);
>>>> -
>>>> - check_program_state(st);
>>>> -
>>>> - st_manager_validate_framebuffers(st);
>>>> - }
>>>> -
>>>> - if (state->st == 0 && state->mesa == 0)
>>>> + dirty = st->dirty & pipeline_mask;
>>>> + if (!dirty)
>>>> return;
>>>>
>>>> - /*printf("%s %x/%x\n", __func__, state->mesa, state->st);*/
>>>> + dirty_lo = dirty;
>>>> + dirty_hi = dirty >> 32;
>>>>
>>>> - for (i = 0; i < num_atoms; i++) {
>>>> - if (check_state(state, &atoms[i]->dirty))
>>>> - atoms[i]->update( st );
>>>> - }
>>>> + /* Update states. */
>>>> + while (dirty_lo)
>>>> + atoms[u_bit_scan(&dirty_lo)]->update(st);
>>>> + while (dirty_hi)
>>>> + atoms[32 + u_bit_scan(&dirty_hi)]->update(st);
>>>>
>>>
>>> Could we just use the u_bit_scan64() function and void the hi/lo split?
>>
>> fwiw, we actually did discuss that on irc, but I guess no one
>> summarized on email thread..
>>
>> Marek's concern was that would generate worse code on 32b since it
>> would pull the right-shift into the loop.
>>
>> I'm not entirely sure if gcc would be clever enough in this case or
>> not. I guess someone needs to compare generated asm in both cases.
>> And either use u_bit_scan64() if the compiler is clever enough, or add
>> a comment explaining the reason.
>
> Yeah, I added this comment before the loops:
> "Don't use u_bit_scan64, it may be slower on 32-bit."
>
> On 32-bit, ffsll is an if-then-else expression with some arithmetic
> and shifting one bit to the left is another if-then-else expression.
fwiw, I did spend a bit of time this evening playing around with this,
and the dirty_hi/dirty_lo approach w/ 32b/i686 build works out to be
something like 12 instructions shorter for the loop body, ie. gcc
isn't clever enough (total instruction count increases by doubling the
loops but I think that doesn't matter).. given that this is a hot
spot in profiles that I've looked at, it might even be worth having
some #ifdef 64b / #else.. but ofc that could be left as a future
exercise if someone cares.. either way, r-b
BR,
-R
More information about the mesa-dev
mailing list