[Mesa-dev] [PATCH] mesa/math: Allocate memory for GLmatrix elements and its inverse contiguously
Thomas Helland
thomashelland90 at gmail.com
Thu Apr 19 19:32:55 UTC 2018
2018-04-19 20:08 GMT+02:00 Vlad Golovkin <vlad.golovkin.mail at gmail.com>:
> ---------- Forwarded message ----------
> From: Vlad Golovkin <vlad.golovkin.mail at gmail.com>
> Date: 2018-04-19 21:06 GMT+03:00
> Subject: Re: [Mesa-dev] [PATCH] mesa/math: Allocate memory for
> GLmatrix elements and its inverse contiguously
> To: Thomas Helland <thomashelland90 at gmail.com>
>
>
> 2018-04-17 8:55 GMT+03:00 Thomas Helland <thomashelland90 at gmail.com>:
>> Hi, and thanks for the patch =)
>>
>> Have you done any performance testing on this to verify it
>> gives us a speedup of any kind? I'm asking because it seems like
>> this might be something that a decent compiler should be able to do.
>> Performance related patches, at least in core mesa, usually have
>> some justification with benchmark numbers in the commit message.
>
> Hi,
> I examined the resulting assembly for these 3 functions and it turns
> out that compiler wasn't merging these two blocks of memory into one
> (which compiler does that?).
> gcc tried to unroll memcpys to a series of movs which may seem to
> partially defeat the purpose of this patch, but after copying the
> block corresponding to m->m it had to switch destination and source
> registers to the next block resulting in 2 wasted movs.
> As a result we can save malloc and free call (in _math_matrix_ctr and
> _math_matrix_dtr) and 2 movs (when compiler tries to avoid memcpy -
> best case) or 1 memcpy call (in the worst case). It may seem that 2nd
> malloc can place m->inv in memory right after m->m but: 1) compiler
> can't rely on that behaviour 2) allocator will insert some private
> data before each block leading to more cache misses.
> I made some testing with Torcs and Yo Frankie blender game and
> according to perf in Yo Frankie _math_matrix_copy overhead reduced by
> 0.03% - 0.04% while Torcs didn't see any improvement.
>
Good analysis! While the gains are not huge, its probably worthwhile.
With some of the comments adressed this has my RB.
I'll pull it down this weekend, and add the comments if you don't
beat me to it, and then I'll push with my RB once we are past the
18.1 branching. Thanks for the patch =)
> Sorry for the duplicate emails.
>
>> Some style comments below
>>
>> 2018-04-17 1:03 GMT+02:00 Vlad Golovkin <vlad.golovkin.mail at gmail.com>:
>>> When GLmatrix elements and its inverse are stored contiguously in memory it is possible to
>>> allocate, free and copy these fields with 1 function call instead of 2.
>>> ---
>>> src/mesa/math/m_matrix.c | 15 +++++++++------
>>> 1 file changed, 9 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c
>>> index 57a49533de..4ab78a1fb3 100644
>>> --- a/src/mesa/math/m_matrix.c
>>> +++ b/src/mesa/math/m_matrix.c
>>> @@ -1438,8 +1438,7 @@ _math_matrix_is_dirty( const GLmatrix *m )
>>> void
>>> _math_matrix_copy( GLmatrix *to, const GLmatrix *from )
>>> {
>>> - memcpy(to->m, from->m, 16 * sizeof(GLfloat));
>>> - memcpy(to->inv, from->inv, 16 * sizeof(GLfloat));
>>> + memcpy(to->m, from->m, 16 * 2 * sizeof(GLfloat));
>>> to->flags = from->flags;
>>> to->type = from->type;
>>> }
>>> @@ -1470,12 +1469,17 @@ _math_matrix_loadf( GLmatrix *mat, const GLfloat *m )
>>> void
>>> _math_matrix_ctr( GLmatrix *m )
>>> {
>>> - m->m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
>>> + m->m = _mesa_align_malloc( 16 * 2 * sizeof(GLfloat), 16 );
>>> if (m->m)
>>> + {
>>
>> Our style guides says to keep the curly bracket after an if on the same line.
>>
>>> + m->inv = m->m + 16;
>>> memcpy( m->m, Identity, sizeof(Identity) );
>>> - m->inv = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 );
>>> - if (m->inv)
>>> memcpy( m->inv, Identity, sizeof(Identity) );
>>> + }
>>> + else
>>> + {
>>
>> } else {
>>
>> Although I see that this file defaults to;
>>
>> {
>> else {
>>
>> for some reason. Feel free to follow existing style, or adjust to my comments.
>> Also, if we want to do this change it deserves a comment in the source.
>>> + m->inv = NULL;
>>> + }
>>> m->type = MATRIX_IDENTITY;
>>> m->flags = 0;
>>> }
>>> @@ -1493,7 +1497,6 @@ _math_matrix_dtr( GLmatrix *m )
>>> _mesa_align_free( m->m );
>>> m->m = NULL;
>>>
>>> - _mesa_align_free( m->inv );
>>> m->inv = NULL;
>>> }
>>>
>>> --
>>> 2.14.1
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list